Commit
GSoC proposal: Enabling efficient PODIO data model integration with ONNX for training and inference
Showing 3 changed files with 74 additions and 0 deletions.
Original file line number | Diff line number | Diff line change |
---|---|---|
@@ -0,0 +1,15 @@ | ||
--- | ||
title: "DESY" | ||
author: "Frank Gaede" | ||
layout: default | ||
organization: DESY | ||
logo: DESY_logo.png | ||
description: | | ||
The Deutsches Elektronen-Synchrotron (DESY) is a major German physics | ||
laboratory with a long interest in high-energy physics. DESY is a | ||
major centre for photon science and the site of the European XFEL | ||
laser. DESY scientists are part of major international HEP experiments, | ||
such as ATLAS, CMS and Belle II. | ||
--- | ||
|
||
{% include gsoc_proposal.ext %} |
Original file line number | Diff line number | Diff line change |
---|---|---|
@@ -0,0 +1,13 @@ | ||
--- | ||
title: "University of Manitoba" | ||
author: "Wouter Deconinck" | ||
layout: default | ||
organization: umanitoba | ||
logo: UManitoba-logo.png | ||
description: | | ||
The University of Manitoba is a Canadian public research university in the province of Manitoba, | ||
located on original lands of Anishinaabeg, Cree, Oji-Cree, Dakota, and Dene peoples, and on the | ||
homeland of the Métis Nation. | ||
--- | ||
|
||
{% include gsoc_proposal.ext %} |
Original file line number | Diff line number | Diff line change |
---|---|---|
@@ -0,0 +1,46 @@ | ||
--- | ||
title: Enabling efficient PODIO data model integration with ONNX for training and inference | ||
layout: gsoc_proposal | ||
project: Key4hep | ||
year: 2024 | ||
organization: | ||
- CERN | ||
- DESY | ||
- UManitoba | ||
difficulty: medium | ||
duration: 350 | ||
mentor_avail: May-October | ||
--- | ||
|
||
## Description | ||
|
||
[PODIO](https://github.com/AIDASoft/podio) is a data model definition package that provides the necessary functionality to generate C++ and Python code from a high-level definition of an event data model in YAML format. EDM4hep, a data model generated with PODIO, serves as the common event data model for many HEP communities, brought together under the umbrella of the [Key4hep](https://github.com/key4hep) project. | ||
|
||
With the increasing importance of artificial intelligence and machine learning workflows, there is a need to integrate PODIO data models into these workflows. In particular, this may require specifying, or converting between, array-of-struct and struct-of-array storage layouts, either in PODIO data models or in collections based on these data models. | ||
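To illustrate the two storage layouts mentioned above, here is a minimal sketch in plain Python. It is not PODIO code: the `Hit` class, its members, and the helper `to_struct_of_arrays` are hypothetical stand-ins for a generated data model and a conversion routine.

```python
from array import array
from dataclasses import dataclass, fields


@dataclass
class Hit:
    """Hypothetical stand-in for one element of a PODIO collection."""
    energy: float
    x: float
    y: float


def to_struct_of_arrays(hits):
    """Copy a list of Hit objects (array-of-structs) into one
    contiguous float32 array per member (struct-of-arrays)."""
    return {
        f.name: array("f", (getattr(h, f.name) for h in hits))
        for f in fields(Hit)
    }


# Array-of-structs: each element keeps its own members together.
hits = [Hit(1.5, 0.0, 2.0), Hit(0.25, 1.0, -1.0)]

# Struct-of-arrays: each member becomes one dense column.
soa = to_struct_of_arrays(hits)
energy_is_dense = memoryview(soa["energy"]).contiguous
```

In the struct-of-arrays form each member is a dense buffer that could be passed on as a one-dimensional tensor, whereas the array-of-structs form has to be copied element by element first. The copy made here is exactly the cost the last task idea aims to avoid.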
|
||
## Task ideas | ||
|
||
* Develop a (naive, possibly inefficient) PODIO-input, ONNX-output training example for a simple binary classification problem, in a PODIO unit test. | ||
* Develop a (naive, possibly inefficient) PODIO-input, ONNX-input, PODIO-output inference example for the previous simple binary classification problem. | ||
* Develop an approach to storing sufficient information about the PODIO-input in ONNX metadata to be able to check whether the ONNX model is compatible with the specified PODIO input. | ||
* Develop an approach to access PODIO data in training and inference that avoids the need for copying into different data structures. Since ONNX requires dense arrays for tensors, this may require modifications to PODIO so that the underlying types are mapped onto a struct-of-arrays layout. | ||
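One possible shape for the compatibility check in the third task idea is sketched below, as an assumption rather than a proposed design. An ONNX model can carry string key/value pairs in its `metadata_props` field; a plain dictionary stands in for that field here so the example needs no ONNX installation. The key name `podio_input_schema`, the collection name `HitCollection`, and both helper functions are hypothetical.

```python
import hashlib
import json


def schema_fingerprint(collection_type, members):
    """Hash a collection type name and its ordered (name, type) members."""
    canonical = json.dumps(
        {"type": collection_type, "members": members}, sort_keys=True
    )
    return hashlib.sha256(canonical.encode("utf-8")).hexdigest()


def is_compatible(model_metadata, collection_type, members):
    """Return True if the fingerprint stored with the model matches
    the schema of the PODIO input offered at inference time."""
    stored = model_metadata.get("podio_input_schema")
    return stored == schema_fingerprint(collection_type, members)


# At training time: record the input schema alongside the model.
members = [("energy", "float"), ("x", "float")]
metadata = {"podio_input_schema": schema_fingerprint("HitCollection", members)}

# At inference time: accept only inputs with the same schema.
same_schema_ok = is_compatible(metadata, "HitCollection", members)
reordered_ok = is_compatible(metadata, "HitCollection", members[::-1])
```

Because the member order is part of the fingerprint, a model trained on `(energy, x)` rejects an input that supplies `(x, energy)`. A real implementation might store the full schema instead of a hash so that a mismatch can be reported in detail.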
|
||
## Expected results | ||
|
||
* | ||
|
||
## Requirements | ||
|
||
* Python | ||
* C++ | ||
|
||
## Mentors | ||
|
||
* [Wouter Deconinck](mailto:[email protected]) | ||
* [Benedikt Hegner](mailto:[email protected]) | ||
* [Thomas Madlener](mailto:[email protected]) | ||
|
||
## Links | ||
|
||
* [PODIO](https://github.com/AIDASoft/podio) | ||
* [Key4hep](https://github.com/key4hep) |