This repo contains scripts to download and process data from the Cancer Cell Line Encyclopedia and the Genentech Cell Line Screening Initiative for the 'Computational Discovery of Molecular Markers for Drug Response in Cancer' workshop at Tech + Research at Technica 2018.
This repo requires conda.
Install and activate the conda environment with:
conda env create -f environment.yml
conda activate technica-data
Download and process the raw data from the Cancer Cell Line Encyclopedia:
snakemake all
The processed data are written to data/processed/, which will contain:
cell_line_list.tsv: a list of cell lines from the Cancer Cell Line Encyclopedia used for our workshop
gene_expression.tsv: gene expression data for each of the cell lines
mutations.tsv: gene mutation data for each of the cell lines
drug_response/: drug response data for each of the cell lines for each of the 16 drugs
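
After the workflow finishes, a quick sanity check can confirm that the outputs listed above exist and let you peek at one of them. The following is a minimal sketch, not part of this repo: the helper names `missing_outputs` and `preview_tsv` are hypothetical, and it assumes only that the .tsv files are tab-separated. It makes no assumption about their columns.

```python
import csv
from pathlib import Path

# Output names taken from the list above; drug_response is a directory.
EXPECTED_OUTPUTS = [
    "cell_line_list.tsv",
    "gene_expression.tsv",
    "mutations.tsv",
    "drug_response",
]


def missing_outputs(processed_dir):
    """Return the expected output names that are absent from processed_dir."""
    root = Path(processed_dir)
    return [name for name in EXPECTED_OUTPUTS if not (root / name).exists()]


def preview_tsv(path, n_rows=5):
    """Return the first n_rows rows of a tab-separated file as lists of strings."""
    rows = []
    with open(path, newline="") as handle:
        reader = csv.reader(handle, delimiter="\t")
        for i, row in enumerate(reader):
            if i >= n_rows:
                break
            rows.append(row)
    return rows


if __name__ == "__main__":
    # Assumes the script is run from the repo root after `snakemake all`.
    processed = Path("data/processed")
    missing = missing_outputs(processed)
    if missing:
        print("Missing outputs:", ", ".join(missing))
    else:
        for row in preview_tsv(processed / "cell_line_list.tsv"):
            print(row)
```

If anything is reported missing, rerunning the snakemake command above is the first thing to try.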