Self-supervised learning for acute ischemic stroke final infarct lesion segmentation in Non-Contrast CT
This repository contains all the code developed for obtaining the Master of Science degree of the Erasmus Mundus Joint Master's Programme in Medical Imaging and Applications (MAIA), in a project conducted jointly with icometrix.
The full thesis manuscript can be found here.
Main contributors:
- Joaquin Oscar Seia
- Ezequiel de la Rosa
- Diana Sima
- David Robben
This repository is organized in the following way:
- data and dataset: These folders contain the code for cleaning the data and the StrokeDataset class, which provides a unified way of managing the files of the different datasets used in this project.
- nnUnet: This folder contains code that is almost entirely the same as in nnUNet's repository. It was originally a fork of that repository, but it was later merged into the project codebase to keep a single repository. The main modifications to the original nnUNet codebase are in:
- nnUnet/nnunetv2/training/nnUNetTrainer/variants/cfg_file_based/: allowing SSL pretrained weights to be loaded correctly
- nnUnet/nnunetv2/run/load_pretrained_weights.py: where additional configurations tailor the nnUNet training pipeline to the SSL-pretrained scenario
- nnUnet/nnunetv2/training/logging/nnunet_logger.py: Adding tensorboard logging to nnUNet
- nnUnet/ssl: Contains the code necessary for training nnUNet's encoder in an SSL fashion. This code was originally based on the original DeSD implementation, but it was significantly modified.
- preprocessing: Contains the implementation of the robust preprocessing pipeline described in the thesis text. In this portion of the repository, several other repositories were used as a basis:
- preprocessing/registration contains wrappers for the Elastix software.
- preprocessing/skull_stripping, preprocessing/super_resolution, preprocessing/tissue_segmentation and preprocessing/freesurfer_utils.py are built on top of FreeSurfer's Python codebase.
- utils: Contains metrics, plots and general utils used across the complete repository.
This repository provides all the code used to train the models. However, a large part of this thesis involved a private dataset (icoAIS) and two datasets that require authorization to be downloaded and used (CENTER-TBI, APIS). As a consequence, the results are not fully reproducible: neither the data nor the pretrained weights are shared.
To use the code provided here on one of the publicly available datasets, or on one of your own, follow these steps:
- Setting up the environments
- Download and reorganize the image files
- Preprocess the data
- Generate the nnUNet datasets
- Train nnUNet's encoder with SSL
- Train nnUNet in supervised fashion
(The use of the Anaconda environment manager is assumed.)
Two environments are necessary: one for preprocessing the data and another for running the SSL pretraining and the nnUNet training. The reason is that some package versions required by nnUNet are incompatible with some of the models/code used during preprocessing.
First, update conda and set the solver to libmamba for faster dependency resolution:
conda update -n base conda &&
conda install -n base conda-libmamba-solver &&
conda config --set solver libmamba &&
source ~/anaconda3/bin/activate
Create stroke environment and install requirements:
conda create -n stroke python==3.9 anaconda &&
conda activate stroke &&
conda install pytorch==1.13.1 torchvision==0.14.1 torchaudio==0.13.1 pytorch-cuda=11.6 -c pytorch -c nvidia &&
pip install -r requirements.txt
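As an optional sanity check, you can confirm that the pinned PyTorch build in the stroke environment sees the GPU (whether CUDA is reported as available depends on your local driver):
python -c "import torch; print(torch.__version__, torch.cuda.is_available())"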
Create nnunet_gpu environment and install requirements:
conda create -n nnunet_gpu python==3.9 anaconda &&
conda activate nnunet_gpu &&
cd nnUNet &&
pip install -e .
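Optionally, verify that the editable nnUNet install is picked up by the new environment (the command simply prints the location of the installed package):
python -c "import nnunetv2; print(nnunetv2.__file__)"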
The only dataset that can currently be downloaded is AISD.
Once you get access to the data, the files will be standardized in name and file format, and the corresponding dataset CSV will be generated by running:
export PYTHONPATH="${PYTHONPATH}:[<PATH_TO_THIS_PROJECT>]"
python data/clean_datasets.py -sdp '<SOURCE_DATA_PATH>' -bdp '<BIDS_DATA_PATH>'
The code to process the APIS and CENTER-TBI data is also provided in case you get access to them.
Once the data is organized in the standard format, the preprocessing can be run.
For each dataset a configuration file needs to be defined; examples are provided for the publicly available datasets in preprocessing/cfg_files.
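As a minimal sketch, a configuration for a new dataset can be started by copying the provided AISD example and editing the paths and options inside it (the target file name below is only an illustration):
# start from the AISD example and adapt it to your own dataset
cp preprocessing/cfg_files/preprocessing_cfg_aisd.yml preprocessing/cfg_files/preprocessing_cfg_mydataset.yml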
For some stages of the preprocessing you will need:
- Download MNI Vascular regions template
- Download MNI NCCT template
- Install Elastix:
- Download Elastix
- Add elastix binaries to PATH variable:
base="<PATH_TO_ELASTIX/elastix-5.1.0-Linux" && export PATH="${base}/bin":$PATH && export LD_LIBRARY_PATH="${base}/lib":$LD_LIBRARY_PATH
- Make sure Elastix can be run:
elastix -h
- Download SynthStrip weights into preprocessing/skull_stripping/models
- Download SynthSR weights into preprocessing/super_resolution/models
- Download SynthSeg weights into preprocessing/tissue_segmentation/models
Once everything is downloaded and Elastix is working, the preprocessing can be run with:
python 'preprocessing/preprocess_dataset.py' -ppcfg 'preprocessing/cfg_files/preprocessing_cfg_aisd.yml'
This can be done by following the nnUNet dataset Jupyter notebook. Both the SSL and the supervised datasets should be obtained.
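Note that nnUNet v2 locates its data folders through environment variables, which need to be set before generating the datasets and training. A minimal sketch, where the base path is a placeholder:
# folders for raw datasets, preprocessed data and trained models
export nnUNet_raw="<PATH_TO_NNUNET_DATA>/nnUNet_raw"
export nnUNet_preprocessed="<PATH_TO_NNUNET_DATA>/nnUNet_preprocessed"
export nnUNet_results="<PATH_TO_NNUNET_DATA>/nnUNet_results"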
First, adapt the configuration file according to the SSL experiment you want to run.
Then the SSL training can be launched by running:
bash nnUnet/ssl/run_ssl.sh
This can be done by adapting the example script accordingly and running:
bash nnUnet/train_nnunet_example.sh
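The example script relies on nnUNet v2's standard entry points. As a rough sketch (dataset ID, configuration and fold are placeholders, and the exact trainer variant to use depends on the modifications listed above):
# fingerprint extraction, planning and preprocessing of the supervised dataset
nnUNetv2_plan_and_preprocess -d <DATASET_ID> --verify_dataset_integrity
# supervised training, optionally initializing the network from the SSL checkpoint
nnUNetv2_train <DATASET_ID> 3d_fullres <FOLD> -pretrained_weights <PATH_TO_SSL_CHECKPOINT>
Given the TensorBoard logging added in nnunet_logger.py, training can also be monitored with tensorboard --logdir pointed at the corresponding results folder (the exact log location depends on the trainer).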
Once the model is trained, inference can be obtained using nnUNet's terminal commands (check the nnUNet documentation).
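As a sketch of the inference call, again with placeholder paths and assuming the 3d_fullres configuration was trained:
nnUNetv2_predict -i <INPUT_FOLDER> -o <OUTPUT_FOLDER> -d <DATASET_ID> -c 3d_fullres -f <FOLD>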