A group of workflows for using OpenNeuro data
Steps required:
- Before you start - set these environment variables
- Downloading data
- Running fMRIprep
Destination: because we are trying to make this a general-purpose repo, we will avoid nesting the dataset as a subdataset (a pattern I usually like so much).
If you are working in the terminal, load the datalad module first (see the environment setup below).
Set these two environment variables to get everything going
## set OPENNEURO_DSID to the OpenNeuro dataset id
export OPENNEURO_DSID="ds000030"
## export the second environment variable to set the base directory
export BASEDIR=$SCRATCH/openneuro_datasets
On the SCC, the data is already sitting here instead:
export BASEDIR=/external/rprshnas01/external_data/openneuro
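These variables only live in your current shell session, so a quick check before running anything below never hurts:
## both of these should print non-empty values
echo "OPENNEURO_DSID is set to: ${OPENNEURO_DSID}"
echo "BASEDIR is set to: ${BASEDIR}"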
To get started, let's make sure we have these scripts in the code folder:
mkdir -p ${BASEDIR}/code
cd ${BASEDIR}/code
git clone [email protected]:krembilneuroinformatics/openneuro_preproc.git
Note: for this to work, you need to add an SSH key for SciNet to your GitHub account.
${BASEDIR}
├── code
│ └── openneuro_preproc # a clone of this repo
├── containers
│ └── fmriprep-20.2.7.simg # the singularity image used to run fmriprep (need to run steps below to get this first!)
├── ${OPENNEURO_DSID} # folder for the dataset
│ ├── bids # the bids data is the data downloaded from openneuro
│ ├── derived # holds derivatives derived from the bids data
│ └── logs # logs from jobs run on cluster
└── fmriprep_home # an extra folder with pre-downloaded fmriprep templates (see setup section)
## on the SCC, git-annex is already on all nodes, so just activate this datalad environment
source /external/rprshnas01/netdata_kcni/edlab/venvs/datalad-0-15-5/bin/activate

## on SciNet, load Erin's datalad environment instead
module load git-annex/8.20200618 # git-annex is needed by datalad
module use /project/a/arisvoin/edickie/modules # this lets you read modules from Erin's folder
module load datalad/0.15.5 # this is the datalad module in Erin's folder
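Either way, a quick check that datalad and git-annex are actually on your path before going further:
datalad --version             ## should report datalad 0.15.5
git annex version | head -n 1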
mkdir -p ${BASEDIR}/${OPENNEURO_DSID}/
cd ${BASEDIR}/${OPENNEURO_DSID}/
datalad clone https://github.com/OpenNeuroDatasets/${OPENNEURO_DSID}.git bids
The above "clones" the dataset, meaning it only downloads the small files and the instructions for fetching the rest. To actually download the imaging data we need to use "datalad get".
This is useful because we can limit download time and space by exploring the dataset and only downloading what we are really interested in.
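For example, the clone contains the full file tree (as broken symlinks until the content is fetched), so you can browse what is available before deciding what to get:
cd ${BASEDIR}/${OPENNEURO_DSID}/bids
ls sub-*/func/ | head -n 20    ## see which task files exist without downloading anything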
Let's start by getting all the anatomical MRI images - we always need these
cd ${BASEDIR}/${OPENNEURO_DSID}/bids
datalad get */anat/*
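A quick check that the anatomical files actually arrived - by default, git annex find lists only files whose content is present locally:
cd ${BASEDIR}/${OPENNEURO_DSID}/bids
git annex find */anat/* | wc -l    ## should roughly match the number of subjects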
Next, let's grab the resting-state fMRI data and associated files. Under the BIDS convention, they are always in the "func" folder and all have "task-rest" in their filenames.
cd ${BASEDIR}/${OPENNEURO_DSID}/bids
datalad get */func/*task-rest*
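The inverse of get is drop, which is handy if you need the space back later (the content can always be re-downloaded from OpenNeuro):
## example only - drops the local copies but keeps the download instructions
datalad drop */func/*task-rest*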
Building the fMRIPrep container on SciNet
This step was already run by Erin.
module load tools/singularity/3.8.5 # singularity is needed to build and run the container (it pulls the fmriprep recipe from Docker)
# singularity build /my_images/fmriprep-<version>.simg docker://nipreps/fmriprep:<version>
mkdir -p ${BASEDIR}/containers
singularity build ${BASEDIR}/containers/fmriprep-20.2.7.simg \
docker://nipreps/fmriprep:20.2.7
The above step downloads ALL of the fMRIPrep software and puts it in a 'tupperware' container (according to Erin).
Testing and setting up for the singularity run:
We need a copy of the FreeSurfer license file to be in the location below; you can get this from the FreeSurfer website, or copy it from within the SCC (our option).
ls ${BASEDIR}/fmriprep_home/.freesurfer.txt
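A sketch of putting the license in place, assuming you have downloaded license.txt from the FreeSurfer website into your home directory (the source path here is hypothetical - point it at wherever your copy lives):
mkdir -p ${BASEDIR}/fmriprep_home
cp ~/license.txt ${BASEDIR}/fmriprep_home/.freesurfer.txt  ## hypothetical source path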
Testing the singularity binds:
cd $BASEDIR
singularity shell --cleanenv \
-B ${BASEDIR}/fmriprep_home:/home/fmriprep --home /home/fmriprep \
containers/fmriprep-20.2.7.simg
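Once the container shell opens, a quick sanity check that the bind worked and fMRIPrep is callable:
## run these from inside the container shell
fmriprep --version    ## should report fMRIPrep v20.2.7
ls /home/fmriprep     ## should show the contents of ${BASEDIR}/fmriprep_home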
From inside the container, set up templateflow (note: do this before submitting a job):
python -c "from templateflow.api import get; get(['MNI152NLin2009cAsym', 'MNI152NLin6Asym'])"
python -c "from templateflow.api import get; get(['fsaverage', 'fsLR'])"
python -c "from templateflow.api import get; get(['OASIS30ANTs'])"
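TemplateFlow caches to ~/.cache/templateflow by default, which the --home bind above maps onto ${BASEDIR}/fmriprep_home on the host; after exiting the container you can confirm the templates landed (this path assumes the default cache location):
ls ${BASEDIR}/fmriprep_home/.cache/templateflow/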
Note: this step uses an estimated 24 hrs of processing time per participant! So even if all participants run at once (on our parallel cluster) it will still take a day to run.
## note step one is to make sure you are on one of the login nodes
ssh niagara.scinet.utoronto.ca
## don't forget to make sure that $BASEDIR and $OPENNEURO_DSID are defined..
module load singularity/3.8.0
## go to the repo and pull new changes
cd ${BASEDIR}/code/openneuro_preproc
git pull
## calculate the length of the array-job given
SUB_SIZE=5
N_SUBJECTS=$(( $( wc -l ${BASEDIR}/${OPENNEURO_DSID}/bids/participants.tsv | cut -f1 -d' ' ) - 1 ))
array_job_length=$(echo "$N_SUBJECTS/${SUB_SIZE}" | bc)
echo "number of array is: ${array_job_length}"
## submit the array job to the queue
cd ${BASEDIR}/${OPENNEURO_DSID}
sbatch --array=0-${array_job_length} ${BASEDIR}/code/openneuro_preproc/code/01_fmriprep_anat_scinet.sh
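Once submitted, the standard Slurm commands can be used to keep an eye on things; for example (assuming the scripts write their logs into the logs folder shown in the tree above):
squeue -u $USER                          ## check the state of your array tasks
ls ${BASEDIR}/${OPENNEURO_DSID}/logs/    ## job logs should accumulate here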
Running the functional step looks pretty similar to running the anat step. The time taken and resources needed will depend on how many functional tasks exist in the experiment; fMRIPrep will try to run these in parallel if resources are available.
Note - the enclosed script uses some interesting extra options (sketched below):
- it defaults to running all the fMRI tasks - the --task-id flag can be used to filter from there
- it outputs cifti files (HCP fsLR 91k space, as well as MNI and native space outputs)
- it runs synthetic distortion correction by default - instead of trying to work with the dataset's available fieldmaps - because fieldmap correction can go wrong
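For reference, the flags that implement those choices look roughly like this - a sketch of the relevant fMRIPrep options, not a verbatim copy of the script (check 02_fmriprep_func_scinet.sh for the exact call):
## a sketch only - paths and binds follow the earlier sections
singularity run --cleanenv \
  -B ${BASEDIR}/fmriprep_home:/home/fmriprep --home /home/fmriprep \
  ${BASEDIR}/containers/fmriprep-20.2.7.simg \
  ${BASEDIR}/${OPENNEURO_DSID}/bids ${BASEDIR}/${OPENNEURO_DSID}/derived participant \
  --cifti-output 91k \
  --use-syn-sdc --ignore fieldmaps
## add --task-id rest to restrict processing to the resting-state task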
## note step one is to make sure you are on one of the login nodes
ssh niagara.scinet.utoronto.ca
## don't forget to make sure that $BASEDIR and $OPENNEURO_DSID are defined..
module load singularity/3.8.0
## go to the repo and pull new changes
cd ${BASEDIR}/code/openneuro_preproc
git pull
## figuring out appropriate array-job size
SUB_SIZE=1 # for func, the sub size drops to 1 participant because there are two runs and 8 tasks per run
N_SUBJECTS=$(( $( wc -l ${BASEDIR}/${OPENNEURO_DSID}/bids/participants.tsv | cut -f1 -d' ' ) - 1 ))
array_job_length=$(echo "$N_SUBJECTS/${SUB_SIZE}" | bc)
echo "number of array is: ${array_job_length}"
## submit the array job to the queue
cd ${BASEDIR}/${OPENNEURO_DSID}
sbatch --array=0-${array_job_length} ${BASEDIR}/code/openneuro_preproc/code/02_fmriprep_func_scinet.sh
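When the jobs finish, fMRIPrep writes one HTML report per participant. A quick way to check that everyone made it through - assuming the scripts write into a fmriprep subfolder of derived (adjust the path if your scripts differ):
ls ${BASEDIR}/${OPENNEURO_DSID}/derived/fmriprep/sub-*.html | wc -l
## compare this count against the number of participants calculated above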
Before running this, make sure that the fMRIPrep container exists and that you have set up the FreeSurfer license as per the instructions above.
Also don't forget to set the environment variables $BASEDIR and $OPENNEURO_DSID.
## note step one is to make sure you are on one of the submit nodes
ssh dev02
## don't forget to make sure that $BASEDIR and $OPENNEURO_DSID are defined..
## go to the repo and pull new changes
cd ${BASEDIR}/code/openneuro_preproc
git pull
## figuring out appropriate array-job size
N_SUBJECTS=$(( $( wc -l ${BASEDIR}/${OPENNEURO_DSID}/bids/participants.tsv | cut -f1 -d' ' ) - 1 ))
echo "number of array is: ${N_SUBJECTS}"
## submit the array job to the queue
cd ${BASEDIR}/${OPENNEURO_DSID}
sbatch --array=0-${N_SUBJECTS} ${BASEDIR}/code/openneuro_preproc/code/01_fmriprep_func_scc.sh