using a dataset in an analysis #188
-
What is the best practice when doing an analysis that relies on a dataset, say EL1000?
-
The best practice is to use dataset nesting. In other words, you should install EL1000 as a subdataset of your analysis. You can do this from your analysis subfolder:

    datalad install -d . --source='https://gin.g-node.org/EL1000/EL1000' -r

The `-r` flag is there because EL1k itself contains subdatasets. You can have a look at `datalad install`'s documentation here: http://docs.datalad.org/en/stable/generated/man/datalad-install.html

You can then cd into EL1000 and do what you need (e.g. download the files required by your analysis):

    cd EL1000
    datalad run-procedure setup
    datalad get */annotations/*/converted

If EL1000 is updated, you will need to pull the changes in your analysis (replace `main` with whichever branch you need).

DataLad's handbook dedicates a few sections to dataset nesting.
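For the update step, a later reply in this thread uses `git pull origin main --recurse-submodules`. A minimal sketch built around that command might look as follows. The `update_dataset` helper and the `DRY_RUN` switch are hypothetical names introduced here for illustration; they are not part of DataLad or EL1000.

```bash
# Hypothetical helper: pull the latest changes of a nested dataset,
# including its own subdatasets (which git tracks as submodules).
#   $1: path to the nested dataset (e.g. EL1000)
#   $2: branch to pull, defaults to "main"
update_dataset() {
    local dir="$1"
    local branch="${2:-main}"
    # When DRY_RUN is set to a non-empty value, the command is printed
    # instead of being executed.
    ${DRY_RUN:+echo} git -C "$dir" pull origin "$branch" --recurse-submodules
}
```

Called as `update_dataset EL1000 main` from the analysis folder, this pulls EL1000 together with its subdatasets. Running `datalad save` in the analysis dataset afterwards should record the updated EL1000 version there.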
-
Hello,
`datalad install` does not download the data by default. You still need to run:
```bash
datalad get */annotations/*/converted
```
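To see which files still lack their content, one option is to look for dangling symlinks. This is only a sketch, and it assumes the usual git-annex layout in which a file that has not been fetched is a symlink to a missing target (it would not apply to unlocked files). `list_missing` is a hypothetical helper name.

```bash
# Hypothetical helper: list files under a directory whose content is not
# present locally, assuming absent annexed files are dangling symlinks.
#   $1: directory to scan, defaults to the current directory
list_missing() {
    # With -L, find follows symlinks, so anything still reported as
    # type "l" is a link whose target does not exist.
    find -L "${1:-.}" -type l
}
```

For example, `list_missing bergelson/annotations/vtc/converted` would be expected to print the annotation files that `datalad get` has not downloaded yet, and nothing once they are all present.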
Does it work for you?
On Tue, Apr 27, 2021 at 14:54, Alex Cristia ***@***.***> wrote:
… okay, in full disclosure, I had already started from code that seemed to
have an EL1000 folder, so I had cd'd into that, then set up correctly the
datasets I needed from EL1000, including downloading all the annotations.
So as to not lose that work, I moved my copy of EL1000 (which was a
folder, not a proper nested dataset) outside of my work directory. I then
attempted the set up, thinking that if that worked, instead of getting the
annotations, I would move them from my local copy. However, run-procedure
gave me an error:
source ~/ChildProjectVenv/bin/activate
mv EL1000 ..
datalad install -d . --source='https://gin.g-node.org/EL1000/EL1000' -r
cd EL1000
datalad run-procedure setup
[ERROR ] No idea how to execute procedure /Users/acristia/Documents/gitrepos/EL1000-CR/EL1000/.datalad/procedures/setup.py. Missing 'execute' permissions? [run_procedure.py:__call__:435] (ValueError)
I don't see anything wrong with the installation output:
$ datalad install -d . --source='https://gin.g-node.org/EL1000/EL1000' -r
[INFO ] Scanning for unlocked files (this may take some time)
install(ok): EL1000 (dataset)
[INFO ] Installing Dataset(/Users/acristia/Documents/gitrepos/EL1000-CR/EL1000) to get /Users/acristia/Documents/gitrepos/EL1000-CR/EL1000 recursively
Warning: untrusted X11 forwarding setup failed: xauth key data not generated
Warning: untrusted X11 forwarding setup failed: xauth key data not generated
[INFO ] Scanning for unlocked files (this may take some time)
Warning: untrusted X11 forwarding setup failed: xauth key data not generated
[INFO ] Reset branch 'main' to bff10484 (from 5cc84948) to avoid a detached HEAD
install(ok): EL1000/bergelson (dataset)
Warning: untrusted X11 forwarding setup failed: xauth key data not generated
Warning: untrusted X11 forwarding setup failed: xauth key data not generated
[INFO ] Scanning for unlocked files (this may take some time)
Warning: untrusted X11 forwarding setup failed: xauth key data not generated
install(ok): EL1000/kidd (dataset)
Warning: untrusted X11 forwarding setup failed: xauth key data not generated
Warning: untrusted X11 forwarding setup failed: xauth key data not generated
[INFO ] Scanning for unlocked files (this may take some time)
Warning: untrusted X11 forwarding setup failed: xauth key data not generated
[INFO ] Reset branch 'main' to 41fb635e (from 93004e6b) to avoid a detached HEAD
install(ok): EL1000/lucid (dataset)
Warning: untrusted X11 forwarding setup failed: xauth key data not generated
Warning: untrusted X11 forwarding setup failed: xauth key data not generated
[INFO ] Scanning for unlocked files (this may take some time)
Warning: untrusted X11 forwarding setup failed: xauth key data not generated
[INFO ] Reset branch 'main' to 1289fb37 (from d6935ebd) to avoid a detached HEAD
install(ok): EL1000/warlaumont (dataset)
Warning: untrusted X11 forwarding setup failed: xauth key data not generated
Warning: untrusted X11 forwarding setup failed: xauth key data not generated
[INFO ] Scanning for unlocked files (this may take some time)
Warning: untrusted X11 forwarding setup failed: xauth key data not generated
[INFO ] Reset branch 'main' to ae94c585 (from 64880a92) to avoid a detached HEAD
install(ok): EL1000/winnipeg (dataset)
action summary:
install (ok: 6)
save (notneeded: 1)
Another interesting observation is that the annotations already seem to be
there in spirit but not in body:
$ ls bergelson/annotations/vtc/converted/
(gives me a long list of files, including the one I used in the next
command:)
$ more bergelson/annotations/vtc/converted/123972-9997_1_0_0.csv
bergelson/annotations/vtc/converted/123972-9997_1_0_0.csv: No such file or
directory
--
Lucas Gautheron
Laboratoire de Sciences Cognitives et Psycholinguistique
-
Btw, the issue with the procedure script was my bad. I just fixed it; you'll need to update EL1k (`git pull origin main --recurse-submodules`).
-
Just tagging here that the only thing left open in this discussion is what to do with other copies of datasets in your local folders that you'd like to get rid of. (Or perhaps my question is more precise than that, and it relates to having poorly installed and/or uninstalled the datasets.)