The metadata for Dandiset 000957 is mostly right, but there are a few opportunities for improvement:
The keywords are not separated as they should be.
The funding information lists "Award Number: NIH BRAIN Initiative (U01NS113252)" when it should contain only the grant code, without the "NIH BRAIN Initiative" label.
The related resource for the paper is annotated as "Resource type: dcite:Dataset" when it should clearly be "Resource type: dcite:Preprint".
The DANDI Archive is filled with these sorts of small mistakes that could be easily fixed by a curation process.
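To make the three fixes above concrete, here is a minimal Python sketch of what a curation helper might do. It is only an illustration: the record below is a hypothetical stand-in (not the actual metadata of the Dandiset), the field names `keywords`, `contributor`, `awardNumber`, `relatedResource`, and `resourceType` are assumptions modeled on the labels quoted above, and the helper functions are made up for this example. It also assumes the unseparated keywords are packed into a single comma- or semicolon-delimited string.

```python
import re


def split_keywords(keywords):
    """Split entries that pack several keywords into one delimited string."""
    out = []
    for entry in keywords:
        for part in re.split(r"[,;]", entry):
            part = part.strip()
            if part and part not in out:  # drop empties and duplicates
                out.append(part)
    return out


def extract_grant_code(award_number):
    """Return the trailing parenthesized grant code, or the input if none."""
    match = re.search(r"\(([^()]+)\)\s*$", award_number)
    return match.group(1).strip() if match else award_number.strip()


def fix_resource_type(resource, expected_type):
    """Return a copy of the resource with its resourceType replaced."""
    fixed = dict(resource)
    fixed["resourceType"] = expected_type
    return fixed


# Hypothetical record; values are placeholders except the award string,
# which is quoted from the issue text.
metadata = {
    "keywords": ["keyword one, keyword two; keyword three"],
    "contributor": [
        {"name": "Example Funder",
         "awardNumber": "NIH BRAIN Initiative (U01NS113252)"},
    ],
    "relatedResource": [
        {"url": "https://example.org/preprint",
         "resourceType": "dcite:Dataset"},
    ],
}

cleaned = dict(metadata)
cleaned["keywords"] = split_keywords(metadata["keywords"])
cleaned["contributor"] = [
    {**c, "awardNumber": extract_grant_code(c["awardNumber"])}
    if "awardNumber" in c else c
    for c in metadata["contributor"]
]
cleaned["relatedResource"] = [
    fix_resource_type(r, "dcite:Preprint") for r in metadata["relatedResource"]
]
```

A human curator would still need to review each change, since a script cannot know, for instance, that a given related resource really is a preprint.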
I would like a system that would allow a curator to suggest changes to the metadata of a Dandiset that they do not own.
My preferred approach would be for the curator to create a pull request on the metadata.yml file within the DataLad GitHub repo for that Dandiset (e.g. here). An automatic email could go to the maintainer (which is required for all Dandisets), alerting them to this curation PR and requesting approval. This could be done entirely manually at first (validating against the metadata schema), and we could gradually replace steps with API and LLM-based automations that a human curator oversees. I am not sure what machinery would be needed to update the archive metadata based on an approved PR, but I would hope that it wouldn't be too difficult to do via a GitHub Action.
@bendichter, how would you feel about the curator using a version of the meditor to create suggested edits that are then sent to the owners? Or even an in-archive JSON editor to perform "bulk" edits on the whole metadata (in lieu of doing a PR)?
I am wary of a DataLad PR approach because (1) I want to avoid a "two-way binding" between the Archive and the DataLadified version of the Dandisets and (2) that sounds complicated (I am not sure I want to deal with a Git diff and try to apply it to a JSON-represented metadata in a system that doesn't really store that as a JSON file).
@waxlamp I understand your desire to not rely on GitHub/DataLad for this. I'd like a solution where an API could be integrated, and I'm alright with a clunky UI to start with. What if we just start with an API endpoint where I can upload a new suggested dandiset.yaml? Or better yet, a subspec of the dandiset.yaml that contains only editable fields, leaving out fields like assetSummary, which is derived from the uploaded data, and Citation, which is derived from other metadata? Then I could make suggestions for fields like relatedResources, contributors, publishedBy, etc. On the server side, it would validate this dandiset.yaml and, if it is valid, somehow submit it as a suggestion to the owner.
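As a rough sketch of the "editable subspec" idea, the server-side check could look something like the following. Everything here is hypothetical: the function names, the set of derived fields, and the exact key spellings (`assetSummary`, `citation`, `relatedResource`) are assumptions for illustration, not the archive's actual API or schema. Full schema validation would still need to run separately.

```python
# Fields assumed to be derived by the archive and therefore not editable.
DERIVED_FIELDS = frozenset({"assetSummary", "citation"})


def editable_subset(metadata):
    """Return only the fields a curator may suggest changes to."""
    return {k: v for k, v in metadata.items() if k not in DERIVED_FIELDS}


def check_suggestion(suggestion):
    """Return a list of problems; an empty list means the suggestion passes."""
    if not isinstance(suggestion, dict):
        return ["suggestion must be a mapping of field names to values"]
    return [
        f"'{key}' is derived and cannot be edited"
        for key in sorted(suggestion)
        if key in DERIVED_FIELDS
    ]


def preview_suggestion(current, suggestion):
    """Return the metadata the owner would see if they accepted the suggestion."""
    problems = check_suggestion(suggestion)
    if problems:
        raise ValueError("; ".join(problems))
    merged = dict(current)  # leave the stored metadata untouched
    merged.update(suggestion)
    return merged
```

For example, `check_suggestion({"assetSummary": {}})` reports a problem, while a suggestion that touches only `relatedResource` passes and can be forwarded to the owner for approval.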
Here is an example of a great Dandiset: https://dandiarchive.org/dandiset/000957