Dataset management module contents #10

mslw · 2021-10-21T13:29:05Z

Summary:

This would be on day 2 (so 3rd or 4th module of the workshop).
The original idea was to include dataset nesting and datalad run, but run fits better in the 1st module (List of modules #9).
Within this module we could (rough idea, I don't have a clear vision yet):
- talk about YODA
- walk through an example of dataset nesting (what would be the example?)
- maybe use the time to introduce datalad containers-run?
- maybe limit the contents to allow some time for questions?

Questions:

Should this come before or after the data publishing / consumption module?
What should be the contents?

adswa · 2021-10-26T09:14:24Z

walk through an example of dataset nesting (what would be the example?)

I like to use actual published analyses that follow the YODA principles, e.g. https://github.com/lnnrtwttkhn/highspeed-analysis

Should this come before or after the data publishing / consumption module?

I have a slight preference for after, because consumption is a very immediate, easy benefit, and having installed a dataset primes people for installing datasets as subdatasets and for the differences that come with a dataset hierarchy.

mslw · 2021-12-14T13:00:10Z

I think I can write part of the module around exploring the highspeed-analysis dataset and installing subdatasets with datalad get --no-data. I'd also like to include a toy example of building such a nested dataset from the ground up, similar to what's done in the first module (probably with create rather than clone), but I don't have a good idea yet. Can you suggest something?

Other than that, I think I could use some help with the remainder of this module. Perhaps some space for more general concepts?

adswa · 2021-12-14T13:13:58Z

Maybe it could use https://github.com/datalad-handbook/repro-paper-sketch/, a template that @m-wierzba and I once created. In addition to nesting a dataset for analyses, it's also about reproducible manuscripts. It could be rewritten to not rely on make, or to use containers-run in addition. But that's just a quick brainstorming - feel free to ignore if that's out of scope.

mslw · 2021-12-14T13:35:55Z

Maybe it could use https://github.com/datalad-handbook/repro-paper-sketch/, a template that @m-wierzba and I once created. In addition to nesting a dataset for analyses, it's also about reproducible manuscripts.

Just to make sure - you mean building something like this from scratch? I'll take a closer look.

It could be rewritten to not rely on make, or to use containers-run in addition

Good idea. I'll need to work through that to see how long it may take. I'm also tempted by containers-run, but sometimes less is more.

But that's just a quick brainstorming - feel free to ignore if that's out of scope.

That's what we need - I feel there's some space left for something other than datalad create -d . something and datalad get --no-data.

mslw · 2021-12-14T13:39:14Z

As an alternative, we could just present the basics of subdatasets, and include some of the more general dataset management themes posted by @jsheunis in #9.

I wonder which would be more helpful assuming a very basic audience.

mslw · 2022-01-14T17:35:07Z

Thanks for the suggestions. The module is now complete, but the issue can be reopened for future tweaks.

mslw added the content discussion label Oct 21, 2021

mslw closed this as completed Jan 14, 2022
