scsnl_preproc_docker

docker container version of scsnl preprocessing pipeline

Objective

The goal is to begin dockerizing the available workflows located at the SCSNL github page , starting with the preprocessing workflow. Dockerization/containerization of workflows allows for scalable, reproducible workflows that can be run on any environment that supports docker.

Installation / Requirements

Usage

  1. check whether docker is installed and running
docker info
  2. clone the latest github version, by either clicking here, or running
git clone [email protected]:cdla/scsnl_preproc_docker.git
  3. unzip the directory (if needed) and cd into it
unzip scsnl_preproc_docker.zip;
cd scsnl_preproc_docker
  4. build the docker image
docker build -t scsnl/preproc_spm12 .
  5. (not yet tested/functional, but this is how the docker workflow would be run) run the workflow, with volume mounts giving the locations of the project dir, the raw data dir, the output dir, and the config_file.m; note that the -v flags must come before the image name, since docker passes anything after the image name to the container as an argument
docker run -v /oak/project_location:/project_dir/ -v /oak/raw_data_location/:/raw_data/ -v /oak/output_dir/:/output_dir/ -v /oak/config_file_location.txt:/config.m -t scsnl/preproc_spm12 subject_index
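The run command in the last usage step can be sketched as a small echo-only script: it prints the docker invocation instead of executing it, so the mount mapping can be inspected first. The /oak host paths are the examples from above, and the subject index value is a placeholder.

```shell
# Echo-only sketch of the run step: prints the docker command rather than
# running it. Host paths are the example /oak locations; adjust as needed.
PROJECT_DIR=/oak/project_location
RAW_DATA=/oak/raw_data_location
OUTPUT_DIR=/oak/output_dir
CONFIG=/oak/config_file_location.txt
SUBJECT_INDEX=1   # hypothetical subject index

# docker requires the -v mount flags to come before the image name;
# everything after the image name is passed to the container as an argument.
echo docker run \
  -v "${PROJECT_DIR}:/project_dir" \
  -v "${RAW_DATA}:/raw_data" \
  -v "${OUTPUT_DIR}:/output_dir" \
  -v "${CONFIG}:/config.m" \
  -t scsnl/preproc_spm12 "${SUBJECT_INDEX}"
```

Removing the `echo` turns the sketch into the real invocation once the image has been built.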

Future Directions

  • use docker2singularity to create a singularity image so that the docker workflow can run in research computing clusters such as Sherlock.
  • generate the same environment using neurodocker
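The docker2singularity step above could look roughly like the following echo-only sketch (printed, not executed, since it needs a running Docker daemon and root access). The output directory is an assumption; `quay.io/singularity/docker2singularity` is the published converter image.

```shell
# Echo-only sketch of converting the built image to a Singularity image.
# SIF_OUT is an assumed host directory for the generated image file.
IMAGE=scsnl/preproc_spm12
SIF_OUT=/tmp/singularity_images

# docker2singularity runs as a privileged container with access to the
# host Docker socket, writing the converted image into /output.
echo docker run --privileged --rm \
  -v /var/run/docker.sock:/var/run/docker.sock \
  -v "${SIF_OUT}:/output" \
  quay.io/singularity/docker2singularity "${IMAGE}"
```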

Thought Process

Upon doing some preliminary research, it looks like some tools in the field have already done the lion's share of the work in this process, including:

  • the spm bids-app
    • I was surprised when I ran across this tool. Using the spm12 standalone MCR version, it runs a preset preprocessing pipeline that has already been written in spm_batch format.
  • neurodocker
    • this tool looks particularly interesting and I will likely make time to familiarize myself with it. It's a command line tool that generates Dockerfiles and Singularity images. Ultimately dockerization of workflows typically also needs to be followed by the creation of singularity images because singularity images can be run on university hpcc resources like Sherlock. Docker containers/daemons require root/admin power, whereas singularity images can be run without that requirement.
      • a common tool for translating docker containers to singularity images is docker2singularity.
  • the official spm docker
    • This repository appears to have been created three months ago and recently updated to include both the MATLAB Compiler Runtime (MCR) version and the Octave version. The SPM documentation says this is not officially supported, and there are currently some known issues, as indicated here.

Possible Routes to Dockerizing Workflow

I. use official spm dockerfile

II. translate existing pipeline to nipype and then dockerize nipype environment with something like neurodocker

III. create MCR version of preprocessing scripts and add to official spm docker

IV. use neurodocker framework to create environment

Choosing a Route:

route I.

  • the official spm docker works mainly off spm_batch formatted language. I would need to figure out how to translate the wrapped command line functions (such as this), as well as how to translate the pipeline's use of fsl to its spm analogues (e.g., when reorienting the data / "FlipZ", as here)

route II

  • this route is the one that I would be most comfortable with, given my relative comfort with nipype as compared to other frameworks. I think that this route would take the longest.

route III (the route I chose)

  • I think this route will translate best, and will be the easiest for users who are familiar with the existing pipeline to move over to

  • For this route, the goal is to create a dockerfile that has:

    • spm12 standalone (mcr version) - source
    • fsl 5.0.10 - source or something similar
    • scsnl preprocessing scripts - source
      • include ArtRepair toolbox website
  • will need to modify existing scripts to use the spm12 mcr
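The image layout that route III implies could be sketched as a Dockerfile fragment. This is illustrative only: the base image, folder names, and install steps are all assumptions, not the repository's actual Dockerfile.

```dockerfile
# Illustrative sketch of the route III image; names and paths are assumptions.
FROM ubuntu:18.04

# 1) fsl 5.0.10 (e.g., via the FSL installer or neurodebian packages)
#    -- exact install mechanism depends on the chosen source
# 2) spm12 standalone (MCR version), copied or downloaded into the image
COPY spm12_standalone/ /opt/spm12/
ENV PATH="/opt/spm12:${PATH}"

# 3) scsnl preprocessing scripts, with the ArtRepair toolbox included
COPY scsnl_preproc/ /opt/scsnl_preproc/
COPY ArtRepair/ /opt/scsnl_preproc/ArtRepair/
```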

route IV

  • neurodocker seems like a very useful tool that I should familiarize myself with. I could see myself using this tool in the future.

Plan

  • make a dockerfile that supports SPM12 MCR and FSL (relevant commit)

  • remove/update script locations to docker relevant places (relevant commit)

  • modify existing scripts to take data location/project location/ output location as arguments for command (relevant commit)

    • due to the nature of containers, these directories will have to be mounted as volumes within the container.
  • within preproc functions and utils, remove filepaths (fsl commands and added toolboxes) (relevant commit)

  • modify existing scripts to change spm_run locations to unix matlab commands that invoke the spm12-mcr compiled version (standalone usage docs)

    • example:
    spm_jobman('run', BatchFile);
    

    turns to

    system(sprintf('spm12 batch %s',BatchFile));
    

(relevant commit)

  • compile the SCSNL preprocessing scripts (including ArtRepair toolbox) into an executable using mcc. (relevant script) (relevant commit)

  • test that spm functions, artrepair functions, and unix/fsl functions are running appropriately within the matlab compiled app on a sample dataset

    • this will determine whether matlab runtime compiler and docker interaction requires workflow restructure to handle passing data to the container.
  • integrate scsnl standalone app into dockerfile (add volume mounts from modified scsnl preproc scripts)

  • test dockerfile

  • comparison against non-dockerized version of scripts to make sure no hidden bugs arise.
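The mcc compile step in the plan might look roughly like the following echo-only sketch. The entry-point name preprocessingfmri.m is taken from the text; the ArtRepair folder name, build directory, and output name are assumptions, and the real command requires MATLAB with the Compiler toolbox.

```shell
# Echo-only sketch of the mcc compile step: -m builds a standalone app,
# -a adds the ArtRepair toolbox folder, -d sets the output directory,
# -o names the generated executable.
ENTRY=preprocessingfmri.m
echo mcc -m "${ENTRY}" -a ArtRepair -d build -o scsnl_preproc
```

mcc also generates a run_scsnl_preproc.sh launcher script, which is the natural entrypoint to call from inside the container.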

Possible Issues:

  • the coreg function references OldNorm templates for the spm12 workflow; I would need to get a copy of that nifti
  • verify ARTRepair version (spm8 version referenced within workflow)
  • slicetiming file being optional (how to work with dockerized file mappings)
  • if mcr mapping of filepaths does not work with sample dataset, restructure file path mappings to be done in dockerfile instead of within preprocessingfmri.m wrapper.
