Code for our paper Deep Learning Based HPV Status Prediction for Oropharyngeal Cancer Patients
Code for each of the three models is given in a separate branch:
C3D: code for the video pre-trained model
CNN: code for the 3D model trained from scratch
VGG16: code for the ImageNet pre-trained model
Each branch has its own sequence.py, parameter/par.yml and model.py file.
The files main.py, metrics.py and run.sh are shared between branches.
parameter file
The parameter file parameter/par.yml stores all hyperparameter settings and file paths used by the other files.
sequence file
The sequence.py file constructs a TensorFlow Sequence.
It reads in the complete CT image files and splits them into smaller sections, called bundles.
Bundles are stored on disk so that they can be loaded by the __getitem__() function during training.
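The bundle mechanism described above can be sketched as follows. This is a minimal illustration, not the code in sequence.py. The names split_into_bundles, store_bundles and BundleSequence, the pickle storage format and the choice of non-overlapping bundles along the first axis are all assumptions. Plain Python lists stand in for the CT array so that the sketch runs without TensorFlow.

```python
import math
import pickle
import tempfile
from pathlib import Path


def split_into_bundles(volume, bundle_depth):
    """Split a volume (a sequence of slices) into consecutive bundles.

    Assumption: bundles do not overlap and are cut along the first axis;
    a trailing remainder shorter than bundle_depth is dropped.
    """
    n_full = len(volume) // bundle_depth
    return [volume[i * bundle_depth:(i + 1) * bundle_depth] for i in range(n_full)]


def store_bundles(volume, label, bundle_depth, out_dir, pid):
    """Write each bundle to its own file and return the list of file paths."""
    out_dir = Path(out_dir)
    out_dir.mkdir(parents=True, exist_ok=True)
    paths = []
    for k, bundle in enumerate(split_into_bundles(volume, bundle_depth)):
        path = out_dir / f"{pid}_bundle_{k:04d}.pkl"
        with open(path, "wb") as fh:
            pickle.dump((bundle, label), fh)
        paths.append(path)
    return paths


class BundleSequence:
    """Batch-wise access to bundles stored on disk.

    With TensorFlow, a class like this would typically subclass
    tf.keras.utils.Sequence, which relies on __len__ and __getitem__.
    """

    def __init__(self, bundle_paths, batch_size):
        self.bundle_paths = list(bundle_paths)
        self.batch_size = batch_size

    def __len__(self):
        # Number of batches per epoch; the last batch may be smaller.
        return math.ceil(len(self.bundle_paths) / self.batch_size)

    def __getitem__(self, idx):
        # Load only the bundles that belong to batch `idx` from disk.
        start = idx * self.batch_size
        bundles, labels = [], []
        for path in self.bundle_paths[start:start + self.batch_size]:
            with open(path, "rb") as fh:
                bundle, label = pickle.load(fh)
            bundles.append(bundle)
            labels.append(label)
        return bundles, labels


if __name__ == "__main__":
    # Toy "CT volume": 10 slices of 2x2 pixels with a made-up label.
    toy_volume = [[[z, z], [z, z]] for z in range(10)]
    with tempfile.TemporaryDirectory() as tmp:
        paths = store_bundles(toy_volume, label=1, bundle_depth=4, out_dir=tmp, pid="pid_1")
        sequence = BundleSequence(paths, batch_size=2)
        x, y = sequence[0]
        print(len(paths), len(sequence), len(x), y)
```

Because only the bundles of the requested batch are read in __getitem__(), memory use during training depends on the batch size rather than on the size of the full CT volumes.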
model file
The model.py file contains the respective TensorFlow model. Its get_model() function returns the complete model to be used during training.
metrics file
The metrics.py file holds all the metrics to be tracked during training.
run file
The run.sh file can be used to run the models within a Docker container.
The file to construct the Docker images is given in docker/hpv_status.
main file
The main.py file has to be called to start the training.
Image input
The sequence.py file expects the CT images to be stored in hdf5 file format with the following structure:
./data/image_data/image_files.h5
|
|---ct_images
|      |------pid_1
|      |------pid_2
|      ...
|      |------pid_n
|
|---ct_sgmt
       |------pid_1
       |------pid_2
       ...
       |------pid_n
with pid given by the respective TCIA Subject IDs.
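A minimal sketch of how the layout above could be checked and traversed. The function names matched_patient_ids and iter_patients are hypothetical and not part of this repository. The sketch also assumes that ct_sgmt holds one segmentation per patient and that every pid entry is a single dataset. It is written against the nested-mapping interface that an open h5py.File provides, so it can be tried on plain dictionaries without an HDF5 file.

```python
def matched_patient_ids(store):
    """Return the sorted patient IDs present in both top-level groups.

    `store` only needs to behave like a nested mapping with the groups
    "ct_images" and "ct_sgmt", as shown in the structure above.
    """
    missing_groups = [g for g in ("ct_images", "ct_sgmt") if g not in store]
    if missing_groups:
        raise KeyError(f"missing top-level group(s): {missing_groups}")
    images = set(store["ct_images"].keys())
    segmentations = set(store["ct_sgmt"].keys())
    unmatched = images ^ segmentations
    if unmatched:
        raise ValueError(f"patients missing in one group: {sorted(unmatched)}")
    return sorted(images)


def iter_patients(store):
    """Yield (pid, ct_image, ct_segmentation) for every patient in the store."""
    for pid in matched_patient_ids(store):
        yield pid, store["ct_images"][pid], store["ct_sgmt"][pid]


if __name__ == "__main__":
    # Toy stand-in for the HDF5 file; values are made up.
    toy_store = {
        "ct_images": {"pid_1": [[0, 1]], "pid_2": [[2, 3]]},
        "ct_sgmt": {"pid_1": [[0, 1]], "pid_2": [[1, 0]]},
    }
    for pid, image, segmentation in iter_patients(toy_store):
        print(pid, image, segmentation)
```

With h5py installed, the same functions should also accept a file opened with h5py.File("./data/image_data/image_files.h5", "r"). The yielded items are then h5py datasets, which can be read into memory with dataset[()].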
HPV status
Information about the patients' HPV status can be found in data/patient_data.
The respective files also contain information about the cases used for training, validation and testing.
Pre-trained weights
The pre-trained networks expect the weights to be found at ./data/weights/<weight_file.h5>.
Names of the files can be specified in parameter/par.yml.
For the C3D model, the weights can be downloaded as a BVLC Caffe file from the official web page.
To convert them to numpy/hdf5 format, utils/convert.py can be used.
A Docker image with Caffe installed can be built from the file given in docker/convert_caffe.
To run a container, utils/convert.sh can be used.
Alternatively, the weights can also be downloaded from here.
Weights for the VGG16 model can be downloaded from here.