[Meta] RO-Crate profile implementation challenges/requirements #490

Open
kMutagene opened this issue Feb 4, 2025 · 1 comment

kMutagene commented Feb 4, 2025

Feel free to improve, correct, and add @muehlhaus @HLWeil

Implementation of our RO-Crate profiles is a major milestone for the whole ARC ecosystem. It unifies the CWL and ISA worlds, enables querying both via the new LabProcess concept, and enables a new, potentially 100% machine-generated entry point for the ARC.

After a lot of prototyping and discussions, here are the requirements for our data model implementation:

Requirements

  1. It needs to be transpilable from F# to JS and Python
    We want a first-class implementation of ARCtrl in all 3 languages. Fable makes this possible without 3 separate sources, but comes with a lot of limitations and necessary little "hacks". We gain a single F# codebase, at the cost of it being harder to maintain.

  2. JSON-LD is dynamically typed. We need an API layer that can handle missing properties every time
    Dynamic typing is especially hard in a strongly typed language such as F#. We put in a lot of work to make DynamicObj transpilable, leading to objects (and derived hybrid classes) that can have arbitrary properties set at runtime, while transpiling to the respective native concepts in JS/Python. Based on this concept, we can implement LDObject, which additionally has fields for @type and @id, the only fields that MUST be present in a valid JSON-LD object. In a first step, we want to parse JSON-LD files into a nested tree of LDObject.

  3. We need static classes representing our profiles, with the possibility of adding additional arbitrary information.
    Our RO-Crate profiles define the properties of objects we expect and can handle in a defined manner. Once the LDObject tree is translated into that world, it is OK to have static properties on classes, and it is OK to fail their creation when mandatory properties are missing. On the other hand, any information present beyond the properties we can handle must be preserved. For this, we must create classes that inherit from LDObject and represent our profile classes. These keep the possibility of arbitrary runtime properties via DynamicObj.
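
Requirements 2 and 3 can be illustrated with a minimal sketch in Python, one of the transpilation targets. This is not the ARCtrl API: `parse_node`, `try_get`, `from_ld`, the `Sample` class shape, and the `harvestDay` property are all hypothetical names chosen for illustration. The sketch also leaves out `@context` handling and reference-only nodes of the form `{"@id": ...}`.

```python
import json
from typing import Any, Optional


class LDObject:
    """Hypothetical base node: fixed id and types, everything else dynamic."""

    def __init__(self, id: str, types: list, properties: Optional[dict] = None) -> None:
        self.id = id
        self.types = list(types)
        self.properties = dict(properties or {})  # arbitrary runtime properties

    def try_get(self, name: str, default: Any = None) -> Any:
        # A missing property is an expected case, not an error.
        return self.properties.get(name, default)


def parse_node(value: Any) -> Any:
    """Recursively turn parsed JSON into a nested tree of LDObject instances."""
    if isinstance(value, list):
        return [parse_node(item) for item in value]
    if not isinstance(value, dict):
        return value  # plain literal: string, number, bool, None
    if "@id" not in value or "@type" not in value:
        raise ValueError("node lacks @id or @type")
    raw = value["@type"]
    types = raw if isinstance(raw, list) else [raw]
    props = {k: parse_node(v) for k, v in value.items() if k not in ("@id", "@type")}
    return LDObject(value["@id"], types, props)


class Sample(LDObject):
    """Hypothetical profile class: `name` is mandatory, the rest stays dynamic."""

    def __init__(self, id: str, types: list, name: str, extra: Optional[dict] = None) -> None:
        super().__init__(id, types, extra)
        self.name = name  # static property defined by the profile

    @classmethod
    def from_ld(cls, node: LDObject) -> "Sample":
        # Failing is acceptable here: the profile declares these as mandatory.
        if "Sample" not in node.types:
            raise ValueError(f"{node.id}: not a Sample")
        if "name" not in node.properties:
            raise ValueError(f"{node.id}: Sample requires 'name'")
        # Everything the profile does not know about is preserved as-is.
        extra = {k: v for k, v in node.properties.items() if k != "name"}
        return cls(node.id, node.types, node.properties["name"], extra)


raw = json.loads('{"@id": "#s1", "@type": "Sample", "name": "leaf", "harvestDay": 12}')
sample = Sample.from_ld(parse_node(raw))
print(sample.name, sample.try_get("harvestDay"), sample.try_get("missing"))
```

The parsing step never fails on unknown or absent properties, while the profile step is allowed to fail on missing mandatory ones. Unknown properties such as `harvestDay` survive the translation because the profile class still carries the dynamic property bag.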

Implementation details

  • Base layer: minimally typed layer of LDObject as the target of parsing RO-Crate metadata JSON files
  • Access layer: static classes and methods to access and validate properties and objects on LDObject
  • Profile datamodel: type representation of the objects described in our RO-Crate profiles, with mandatory/optional instance properties and dynamic properties
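
As a rough illustration of the access layer, the following Python sketch shows stateless static helpers that read and validate properties on a base-layer node. The class name `LDAccess`, its method names, and the simplified `LDObject` stand-in are assumptions made for this example, not existing ARCtrl code.

```python
from typing import Optional


class LDObject:
    """Minimal stand-in for the base layer (hypothetical shape)."""

    def __init__(self, id: str, types: list, properties: Optional[dict] = None) -> None:
        self.id = id
        self.types = list(types)
        self.properties = dict(properties or {})


class LDAccess:
    """Hypothetical access layer: static helpers that hold no state."""

    @staticmethod
    def has_type(node: LDObject, type_name: str) -> bool:
        return type_name in node.types

    @staticmethod
    def try_get_string(node: LDObject, name: str) -> Optional[str]:
        # Returns None when the property is absent or not a string.
        value = node.properties.get(name)
        return value if isinstance(value, str) else None

    @staticmethod
    def validate(node: LDObject, type_name: str, required: list) -> list:
        """Return a list of problems; an empty list means the node is valid."""
        problems = []
        if not LDAccess.has_type(node, type_name):
            problems.append(f"{node.id}: expected @type {type_name}")
        for name in required:
            if name not in node.properties:
                problems.append(f"{node.id}: missing property {name}")
        return problems


node = LDObject("#s1", ["Sample"], {"name": "leaf", "count": 3})
print(LDAccess.try_get_string(node, "name"))
print(LDAccess.validate(node, "Sample", ["name", "additionalProperty"]))
```

Keeping these helpers outside the node class means the base layer stays minimal, and profile classes can reuse the same validation logic when they are constructed.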

Open questions

  • How to unify the existing ARC Scaffold datamodels with the more dynamic RO-Crate datamodels?
    One problem is that there is no mechanism to represent arbitrary additional information in our ISA/CWL representations in that model.
  • How to model arrays containing any of multiple types?
    In the profile, there are properties defined as containing objects of type A OR B. This can be modelled e.g. via a type that "unifies" A and B. Other approaches might be better or necessary.
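
One way to picture the "A OR B" case is a union type plus dispatch on `@type`, sketched here in Python. The type names `Data` and `Sample`, the `encodingFormat` property, and the helper `parse_item` are assumptions for illustration only; the profile's real types and properties may differ.

```python
from dataclasses import dataclass
from typing import Optional, Union


@dataclass
class Data:
    id: str
    name: str
    encoding_format: Optional[str] = None  # hypothetical Data-only property


@dataclass
class Sample:
    id: str
    name: str


# The "unifying" type: an array element is one or the other.
DataOrSample = Union[Data, Sample]


def parse_item(obj: dict) -> DataOrSample:
    """Pick the concrete class based on the node's @type."""
    raw = obj.get("@type", [])
    types = raw if isinstance(raw, list) else [raw]
    if "Sample" in types:
        return Sample(obj["@id"], obj["name"])
    if "Data" in types:
        return Data(obj["@id"], obj["name"], obj.get("encodingFormat"))
    raise ValueError(f"unsupported @type: {raw!r}")


def describe(item: DataOrSample) -> str:
    # Consumers have to branch on the concrete case.
    if isinstance(item, Sample):
        return f"sample {item.name}"
    return f"data {item.name}"


items = [
    parse_item({"@id": "#s1", "@type": "Sample", "name": "leaf"}),
    parse_item({"@id": "reads.fastq", "@type": "Data", "name": "reads", "encodingFormat": "text/plain"}),
]
print([describe(i) for i in items])
```

The cost of this approach is that every consumer of the array must handle each case explicitly, and adding a third allowed type touches all of them.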
@kMutagene kMutagene converted this from a draft issue Feb 4, 2025
@kMutagene kMutagene moved this to In discussion in ARCStack Feb 4, 2025

HLWeil commented Feb 10, 2025

How to model arrays containing any of multiple types?

Unifying type sounds promising. We could try to treat this not as a Union but as an Intersection. I.e. the type representing data or sample would only contain name as a fixed field.
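
A minimal Python sketch of the intersection idea, assuming `name` is the only property shared by both types: the unifying type fixes just that field and keeps everything else dynamic. The class name `NamedEntity`, the `try_get` helper, and the example properties are hypothetical.

```python
from typing import Any, Optional


class NamedEntity:
    """Hypothetical intersection of Data and Sample: only `name` is fixed."""

    def __init__(self, id: str, types: list, name: str, extra: Optional[dict] = None) -> None:
        self.id = id
        self.types = list(types)
        self.name = name  # the one field both types share
        self.extra = dict(extra or {})  # type-specific properties stay dynamic

    @classmethod
    def from_json(cls, obj: dict) -> "NamedEntity":
        raw = obj["@type"]
        types = raw if isinstance(raw, list) else [raw]
        reserved = ("@id", "@type", "name")
        extra = {k: v for k, v in obj.items() if k not in reserved}
        return cls(obj["@id"], types, obj["name"], extra)

    def try_get(self, key: str, default: Any = None) -> Any:
        return self.extra.get(key, default)


items = [
    NamedEntity.from_json({"@id": "#s1", "@type": "Sample", "name": "leaf"}),
    NamedEntity.from_json({"@id": "reads.fastq", "@type": "Data", "name": "reads", "encodingFormat": "text/plain"}),
]
print([(i.name, i.types) for i in items])
```

Compared with a union, consumers need no case analysis to read `name`, but anything type-specific has to go through the dynamic lookup and may be absent.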
