MDM Schema
MDM is an acronym for MetaRefSGB Data Model. It plays a central role in organising all data processed by MetaRefSGB and is designed to guarantee the consistency of data and metadata, in addition to their integrity and availability.
It is a set of JSON schemas used to validate the MAG, genome, and metadata definition files that make up the MetaRefSGB resources.
If you want to submit an update request, you should first know how to structure the files that define the set of new genomes that will be processed by MetaRefSGB and then clustered into SGBs.
The MDM Components represent an abstraction of two main entities (plus a supplementary one) that compose every single update request submitted to MetaRefSGB:
- genome: the abstraction level that represents a Reference Genome entity;
- MAG: the abstraction level for the Metagenome-assembled Genome entity;
- metadata: additional information about samples.
You can run the following command in your terminal to automatically serialize and validate your data:
MetaRefSGB --mags=~/MAGs.txt \
--genomes=~/genomes.txt \
--metadata=~/metadata.txt \
--validate-input
You can validate all your files in a single command or one at a time. To validate a single file, e.g. ~/metadata.txt, run the following command:
MetaRefSGB --metadata=~/metadata.txt \
--validate-input
This converts every line of the input file into a JSON object and validates each object against the JSON schema.
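To make the line-to-JSON step more concrete, here is a minimal Python sketch of the general idea. It is not MetaRefSGB's actual implementation. It assumes a tab-separated input file with a header line, which this page does not specify. The schema fragment, the property names (`sample_id`, `country`) and the function names are all hypothetical. The hand-written `validate` function checks only required properties and string types, as a stand-in for a full JSON Schema validator.

```python
import csv
import io
import json

# Hypothetical schema fragment for illustration only; the real MDM schemas
# define their own properties and constraints.
EXAMPLE_SCHEMA = {
    "type": "object",
    "required": ["sample_id", "country"],
    "properties": {
        "sample_id": {"type": "string"},
        "country": {"type": "string"},
    },
}


def rows_to_objects(text, delimiter="\t"):
    """Turn each data line of a delimited text (with a header line) into a dict."""
    reader = csv.DictReader(io.StringIO(text), delimiter=delimiter)
    return [dict(row) for row in reader]


def validate(obj, schema):
    """Check only `required` and string `type` rules (simplified stand-in)."""
    errors = []
    for key in schema.get("required", []):
        # Empty cells and missing columns both count as missing values.
        if obj.get(key) in (None, ""):
            errors.append(f"missing required property: {key}")
    for key, rule in schema.get("properties", {}).items():
        value = obj.get(key)
        if value is not None and rule.get("type") == "string" and not isinstance(value, str):
            errors.append(f"property {key} must be a string")
    return errors


def validate_text(text, schema):
    """Return (line_number, json_string, errors) for every data line."""
    report = []
    # The header is line 1, so data lines start at 2 (assumes no blank lines).
    for line_number, obj in enumerate(rows_to_objects(text), start=2):
        report.append((line_number, json.dumps(obj, sort_keys=True), validate(obj, schema)))
    return report


if __name__ == "__main__":
    sample = "sample_id\tcountry\nS1\tITA\nS2\t\n"
    for line_number, as_json, errors in validate_text(sample, EXAMPLE_SCHEMA):
        print(line_number, as_json, errors or "ok")
```

Reporting errors per line, rather than stopping at the first failure, lets a submitter fix every problem in a file in one pass.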
To propose a change to the structure of the MDM schemas (adding new properties or modifying existing ones), please open an issue or a pull request. We will reply as soon as the MetaRefSGB team has evaluated your proposal.