Oncogenicity Variant Interpreter (OncoVI)

OncoVI is a fully automated Python implementation of the oncogenicity guidelines by Horak et al. (Genetics in Medicine, 2022).

Starting from the genomic location of the variants, OncoVI:

performs functional annotation based on the Variant Effect Predictor (VEP) from Ensembl;
collects biological evidence from the implemented publicly available resources;
classifies the oncogenicity of somatic variants, based on the point-based system for combining pieces of evidence defined by Horak et al.

More detailed information on the resources used by OncoVI, the implementation of the oncogenicity guidelines, and the application of OncoVI to real-world data can be found in our preprint.

Workflow of OncoVI:

The figure shows the criteria implemented in OncoVI (11 criteria for evidence of oncogenic effect and five for evidence of benign effect), the public resources used to assess each criterion, and the points associated with each criterion. The variant-specific score is the sum of the points of the criteria triggered by OncoVI for the variant, and it determines the classification of oncogenicity into one of five classes:

score ≥ 10: Oncogenic (O)
6 ≤ score ≤ 9: Likely Oncogenic (LO)
0 ≤ score ≤ 5: Variant of Uncertain Significance (VUS)
-6 ≤ score ≤ -1: Likely Benign (LB)
score ≤ -7: Benign (B)

In the figure, resources shown in blue are those suggested by the Standard Operating Procedure by Horak et al.; resources shown in black were identified by the authors of OncoVI.
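The score thresholds above can be written as a small function. This is a minimal sketch and not the code used in 03_OncoVI_SOP.py: the function name classify_oncogenicity and the example points are hypothetical, and the score is assumed to be an integer.

```python
def classify_oncogenicity(score: int) -> str:
    """Map a variant-specific score to one of the five oncogenicity classes."""
    if score >= 10:
        return "Oncogenic"
    if score >= 6:
        return "Likely Oncogenic"
    if score >= 0:
        return "Variant of Uncertain Significance"
    if score >= -6:
        return "Likely Benign"
    return "Benign"


# Hypothetical points of the criteria triggered for one variant (illustrative values only).
triggered_points = [4, 2, 1]
score = sum(triggered_points)
print(score, classify_oncogenicity(score))  # 7 Likely Oncogenic
```

Checking the thresholds from the highest downwards keeps each branch to a single comparison and leaves no gap between adjacent classes.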

Software requirements

OncoVI was implemented and tested in a dedicated conda environment on a remote server running the Ubuntu 20.04.4 long-term support (LTS) operating system. To run OncoVI, the following packages are required:

python
numpy
pandas
subprocess (part of the Python standard library)

COSMIC resources

Due to size constraints, the COSMIC resources utilised by OncoVI could not be uploaded to the GitHub repository. The steps below describe how to download and prepare the COSMIC data so that OncoVI can use them.

Cancer Mutation Census

First, All Data CMC for genome GRCh38 was downloaded
Then, the data set was reduced to the columns: GENE_NAME, Mutation CDS, Mutation AA, AA_MUT_START, Mutation genome position GRCh38, and COSMIC_SAMPLE_MUTATED
The reduced data set was converted into a dictionary with the Python script /src/prepare_cosmic_resources.py
The resulting dictionary was saved under the name cosmic_all_dictionary.txt
The path to the dictionary must be provided to the Python script 03_OncoVI_SOP.py

Census Genes Mutations

First, Census Genes Mutations for genome GRCh38 was downloaded
Then, the data set was reduced to the columns: GENE_SYMBOL, MUTATION_CDS, MUTATION_AA, and HGVSG
The reduced data set was converted into a dictionary with the Python script /src/prepare_cosmic_resources.py
The resulting dictionary was saved under the name cosmic_hgvsg_dictionary.txt
The path to the dictionary must be provided to the Python script 03_OncoVI_SOP.py
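Both COSMIC procedures above follow the same pattern: keep a subset of columns, then turn the reduced table into a dictionary. The sketch below illustrates that pattern with the standard library only. It is not the content of /src/prepare_cosmic_resources.py: the function build_dictionary, the choice of GENE_SYMBOL as key, the tab-separated layout, and the example rows are all assumptions made for illustration.

```python
import csv
import io

# Columns kept from the Census Genes Mutations download (taken from the steps above).
CENSUS_COLUMNS = ["GENE_SYMBOL", "MUTATION_CDS", "MUTATION_AA", "HGVSG"]


def build_dictionary(handle, keep_columns, key_column):
    """Reduce a tab-separated table to keep_columns and group the rows by key_column."""
    reader = csv.DictReader(handle, delimiter="\t")
    dictionary = {}
    for row in reader:
        reduced = {column: row[column] for column in keep_columns}
        dictionary.setdefault(reduced[key_column], []).append(reduced)
    return dictionary


# Tiny made-up table standing in for the real COSMIC download.
example = io.StringIO(
    "GENE_SYMBOL\tMUTATION_CDS\tMUTATION_AA\tHGVSG\tEXTRA\n"
    "GENE1\tc.1A>T\tp.M1L\t1:g.100A>T\tignored\n"
    "GENE1\tc.5G>C\tp.R2P\t1:g.104G>C\tignored\n"
)
census = build_dictionary(example, CENSUS_COLUMNS, key_column="GENE_SYMBOL")
print(census["GENE1"][0]["MUTATION_AA"])  # p.M1L
```

For a real file, the handle would come from open(path, newline="") instead of io.StringIO. The on-disk format of the dictionaries expected by OncoVI is defined by the repository script, so this sketch stops at the in-memory dictionary.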

ClinVar resources

Due to size constraints, the ClinVar resources utilised by the functional annotation step could not be uploaded to the GitHub repository. The steps below describe how to download and prepare the ClinVar data so that the functional annotation step can use them:

First, variant_summary.txt.gz was downloaded from the FTP site
Then, the data set was reduced to the columns: GeneSymbol, ClinicalSignificance, Chromosome, Start, VariationID, ReferenceAlleleVCF, AlternateAlleleVCF, ReviewStatus, and NumberSubmitters
The reduced data set was converted into a dictionary with the Python script /src/create_clinvar_dict.py
The resulting dictionary was saved under the name clinvar_all_dictionary.txt
The path to the ClinVar dictionary must be provided to the Python script 02_VEP_based_pipeline.py
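As an illustration of the ClinVar steps above, the sketch below reads a gzip-compressed, tab-separated table and indexes the reduced rows by genomic coordinates. It is not the content of /src/create_clinvar_dict.py: the function names, the composite key (Chromosome, Start, ReferenceAlleleVCF, AlternateAlleleVCF), and the example row are assumptions made for illustration.

```python
import csv
import gzip
import io

# Columns kept from variant_summary.txt.gz (taken from the steps above).
CLINVAR_COLUMNS = [
    "GeneSymbol", "ClinicalSignificance", "Chromosome", "Start", "VariationID",
    "ReferenceAlleleVCF", "AlternateAlleleVCF", "ReviewStatus", "NumberSubmitters",
]


def index_clinvar(handle):
    """Index the reduced rows of a tab-separated ClinVar table by (chrom, start, ref, alt)."""
    reader = csv.DictReader(handle, delimiter="\t")
    index = {}
    for row in reader:
        key = (row["Chromosome"], row["Start"],
               row["ReferenceAlleleVCF"], row["AlternateAlleleVCF"])
        index.setdefault(key, []).append({c: row[c] for c in CLINVAR_COLUMNS})
    return index


def index_clinvar_file(path):
    """Open the compressed download in text mode and index it without unpacking it first."""
    with gzip.open(path, "rt", encoding="utf-8", newline="") as handle:
        return index_clinvar(handle)


# Made-up single-row table standing in for the real download.
example = io.StringIO(
    "\t".join(CLINVAR_COLUMNS + ["Extra"]) + "\n"
    + "\t".join(["GENE1", "Pathogenic", "1", "100", "42", "A", "T",
                 "reviewed by expert panel", "3", "ignored"]) + "\n"
)
clinvar = index_clinvar(example)
print(clinvar[("1", "100", "A", "T")][0]["ClinicalSignificance"])  # Pathogenic
```

Keeping the values as lists means that several ClinVar records at the same position and allele change are all retained rather than overwritten.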

Get started

Clone the GitHub repository:

git clone https://github.com/MGCarta/oncovi.git

# Create the conda environment for oncovi
conda env create -n oncovi -f /path/to/OncoVIenvFile.yml

# Activate the conda environment
conda activate oncovi

Set up VEP

# Run the installer (v. 111) available in the conda environment
vep_install --NO_HTSLIB -c '/path/to/.vep' -r '/path/to/.vep/Plugins/'

Then:

select homo_sapiens_refseq_vep_111_GRCh38.tar.gz as cache
select homo_sapiens_vep_111_GRCh38.tar.gz as reference genome
install all plugins

dbNSFP plugin

The dbNSFP plugin is used by the functional annotation step. Detailed information on how to set up the dbNSFP plugin for VEP can be found in the plugin documentation. The dbNSFP plugin must be enabled in the script vep.sh according to the plugin instructions.

spliceAI plugin

The spliceAI plugin is used during the functional annotation step. Detailed information on how to set up the spliceAI plugin for VEP can be found in the plugin documentation. The spliceAI plugin must be enabled in the script vep.sh according to the plugin instructions.

Prepare your variants

VEP accepts variants both in text format and in variant call format (VCF). Please refer to the official VEP documentation for a detailed description of the input formats. A test data set is available under:

# /oncovi/testdata/SOP_table_union.txt
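For readers preparing their own text input, the sketch below formats a single-nucleotide substitution as one line of VEP's default whitespace-separated input format (chromosome, start, end, allele as REF/ALT, strand, optional identifier). The helper snv_to_vep_line is hypothetical and handles substitutions only; insertions and deletions follow different coordinate rules described in the VEP documentation. Whether the test data file uses exactly this layout is not stated here, so treat this as an illustration of the VEP format rather than of that file.

```python
def snv_to_vep_line(chrom, pos, ref, alt, identifier=None):
    """Format a single-nucleotide substitution in VEP's default input format."""
    if len(ref) != 1 or len(alt) != 1 or ref not in "ACGT" or alt not in "ACGT":
        raise ValueError("this sketch handles single-nucleotide substitutions only")
    # For a substitution, start and end are the same position; "+" is the forward strand.
    fields = [str(chrom), str(pos), str(pos), f"{ref}/{alt}", "+"]
    if identifier is not None:
        fields.append(identifier)
    return "\t".join(fields)


print(snv_to_vep_line("5", 140532, "T", "C"))
```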

Perform functional annotation via VEP

# Navigate to the directory in which the python script 02_VEP_based_pipeline.py is located

# Run the functional annotation
python 02_VEP_based_pipeline.py -i /path/to/oncovi/testdata/SOP_table_union.txt

Run OncoVI

# Navigate to the directory in which the python script 03_OncoVI_SOP.py is located

# Run OncoVI
python 03_OncoVI_SOP.py
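The two steps above can also be chained from a single Python script. This is a minimal sketch: the helper names build_commands and run_pipeline are hypothetical, both scripts are assumed to sit in the same directory, and the commands are exactly those shown above (only the -i option is used; how the dictionary paths are passed to the scripts is not covered).

```python
import subprocess
from pathlib import Path


def build_commands(variants_file):
    """Return the two command lines, using an absolute path for the input file."""
    variants_path = str(Path(variants_file).resolve())
    return [
        ["python", "02_VEP_based_pipeline.py", "-i", variants_path],
        ["python", "03_OncoVI_SOP.py"],
    ]


def run_pipeline(variants_file, script_dir):
    """Run functional annotation, then OncoVI, stopping at the first failure."""
    for command in build_commands(variants_file):
        # cwd replaces the "navigate to the directory" step; check=True raises on a non-zero exit.
        subprocess.run(command, cwd=script_dir, check=True)


# Example call (assumed paths, to be adapted):
# run_pipeline("/path/to/oncovi/testdata/SOP_table_union.txt", "/path/to/scripts")
```

Resolving the input path first matters because the working directory changes to script_dir, so a relative path would otherwise point to the wrong location.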

OncoVI issues

Please help us improve OncoVI by describing your bug or issue in detail.

License

The License file applies to all files within this repository.

OncoVI is intended for research purposes only. Any use outside of this context is the responsibility of the user, who must also comply with the licences of the resources utilised.

References

If you use OncoVI, please cite our preprint, "Oncogenicity Variant Interpreter (OncoVI): oncogenicity guidelines implementation to support somatic variants interpretation in precision oncology".
