update pip
azzatha committed Mar 10, 2022
1 parent 10e8c56 commit c287f5e
Showing 3 changed files with 83 additions and 32 deletions.
111 changes: 81 additions & 30 deletions deepsvp.egg-info/PKG-INFO
@@ -1,49 +1,89 @@
Metadata-Version: 2.1
Name: deepsvp
Version: 1.0.0
Version: 1.0.3
Summary: DeepSVP: Integration of Genomics and Phenotypes forStructural Variant Prioritization using Deep Learning
Home-page: UNKNOWN
Author: Azza Althagafi
Author-email: [email protected]
License: Apache 2.0
Download-URL: https://github.com/bio-ontology-research-group/deepsvp/archive/v1.0.2.tar.gz
Download-URL: https://github.com/bio-ontology-research-group/deepsvp/archive/v1.0.3.tar.gz
Description: # DeepSVP
DeepSVP is a computational method to prioritize structural variants involved in genetic diseases by combining genomic information with information about gene functions. We incorporate phenotypes linked to genes, functions
of gene products, gene expression in individual celltypes, and
anatomical sites of expression, and systematically relate them to
their phenotypic consequences through ontologies and machine
learning
DeepSVP is a computational method to prioritize structural variants (SVs) involved in genetic diseases by combining genomic information with information about gene functions. We incorporate phenotypes linked to genes, functions of gene products, gene expression in individual cell types, and anatomical sites of expression. DeepSVP systematically relates them to their phenotypic consequences through ontologies and machine learning.

## Dataset
We train and evaluate our method using human genomic Structural Variation collected from [dbvar](https://ftp.ncbi.nlm.nih.gov/pub/dbVar/data/Homo_sapiens/by_assembly/GRCh38/vcf/) dataset.
## Training dataset
We train and evaluate our method using human SVs collected from the [dbVar](https://ftp.ncbi.nlm.nih.gov/pub/dbVar/data/Homo_sapiens/by_assembly/GRCh38/vcf/) dataset.

## Prediction the candidate CNVs workflow
We integrate the annotates from Gene ontology [GO](http://geneontology.org/docs/download-go-annotations/), Uber-anatomy ontology
[UBERON](https://www.ebi.ac.uk/ols/ontologies/uberon), Mammalian Phenotype ontology [MP](http://www.informatics.jax.org/vocab/mp_ontology), and Human Phenotype Ontology [HPO](https://hpo.jax.org/app/download/annotation) using [DL2vec](https://github.com/bio-ontology-research-group/DL2Vec). We convert different types of Description Logic axioms into graph representation, and then generate an embedding for each node and edge type.
We collected genomics features using public tool [AnnotSV (v2.3 or 2.2)](https://lbgi.fr/AnnotSV/annotations).
## Annotation data sources (integrated in the candidate SV prediction workflow)
We integrated the annotations from different sources:
- Gene ontology ([GO](http://geneontology.org/docs/download-go-annotations/))
- Uber-anatomy ontology ([UBERON](https://www.ebi.ac.uk/ols/ontologies/uberon))
- Mammalian Phenotype ontology ([MP](http://www.informatics.jax.org/vocab/mp_ontology))
- Human Phenotype Ontology ([HPO](https://hpo.jax.org/app/download/annotation))

This work is done using [DL2vec](https://github.com/bio-ontology-research-group/DL2Vec). We convert different types of Description Logic axioms into graph representation, and then generate an embedding for each node and edge type.
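
To make the axiom-to-graph step concrete, here is a minimal toy sketch. It is not DL2vec's code: the class names, the relation labels, and the choice of a random walk that interleaves node and edge labels are all assumptions made for illustration.

```python
import random

# Hypothetical axioms as (head, relation, tail) triples. Real input would
# come from GO, UBERON, MP and HPO annotations.
AXIOMS = [
    ("GeneA", "has_function", "FunctionX"),
    ("FunctionX", "subClassOf", "FunctionY"),
    ("GeneA", "has_phenotype", "PhenotypeZ"),
]


def axioms_to_graph(axioms):
    """Build an adjacency list with one labelled edge per axiom."""
    graph = {}
    for head, relation, tail in axioms:
        graph.setdefault(head, []).append((relation, tail))
        graph.setdefault(tail, [])  # make sure leaf nodes exist too
    return graph


def random_walk(graph, start, length, rng):
    """Walk up to `length` edges, recording node and edge labels in order."""
    walk = [start]
    node = start
    for _ in range(length):
        edges = graph[node]
        if not edges:  # dead end: stop early
            break
        relation, node = rng.choice(edges)
        walk.extend([relation, node])
    return walk


graph = axioms_to_graph(AXIOMS)
print(random_walk(graph, "GeneA", 4, random.Random(0)))
```

Sequences like these could then be passed to a word-embedding model so that every node label and edge label receives a vector.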

We collected [genomics features](https://lbgi.fr/AnnotSV/annotations) using the [AnnotSV (v2.2)](https://lbgi.fr/AnnotSV/downloads) public tool.


## Installation
Using pip version 20.3.1:
```
pip install deepsvp
```

## Running the prediction model
- Download all the files in [data](https://bio2vec.cbrc.kaust.edu.sa/data/DeepSVP/) and place them into data folder.
- Download and install the required database [AnnoSV (v2.3 or 2.2)](https://lbgi.fr/AnnotSV/downloads), and then run:
```
bash scripts/annotation.sh -i input.vcf -o annotated_file
```
and place the annotated VCF file into data folder.
Alternatively, create a dedicated Conda environment (e.g. named "deepsvp-py38-pip2031"):
```
conda create -n deepsvp-py38-pip2031 python=3.8 pip=20.3.1
conda activate deepsvp-py38-pip2031
pip3 install deepsvp
pip3 install networkx
pip3 install torch
pip3 list
conda deactivate
```

## Running the DeepSVP prediction model
- Download all the files from [data](https://bio2vec.cbrc.kaust.edu.sa/data/DeepSVP/) and place the uncompressed files/repository in the folder named "data":
```
mkdir DeepSVP/ ;# /path_of_your_DeepSVP_repository/
cd DeepSVP
wget "https://bio2vec.cbrc.kaust.edu.sa/data/DeepSVP/data.zip"
unzip data.zip
cd data ;# /path_of_your_DeepSVP_data_repository/
wget "https://bio2vec.cbrc.kaust.edu.sa/data/DeepSVP/experiments.zip" # can be very long
unzip experiments.zip
```
- Download and install the required [AnnotSV (v2.3)](https://lbgi.fr/AnnotSV/downloads) tool in the "data" folder:
```
cd /path_of_your_DeepSVP_data_repository/
git clone git@github.com:lgmgeo/AnnotSV.git --branch v2.3
cd AnnotSV/
make PREFIX=. install
make DESTDIR= PREFIX=. install-human-annotation
cd ..
```

- Add genomic features to your VCF input file (/path_and_name_of_your_vcf_input_file/) using AnnotSV (v2.3):

e.g. /path_and_name_of_your_vcf_input_file/ = ./input.vcf

e.g. /path_and_name_of_your_annotsv_output_file/ = ./data/output.annotsv.annotated.tsv

```
bash
export ANNOTSV=/path_of_your_DeepSVP_data_repository/AnnotSV
$ANNOTSV/bin/AnnotSV -SVinputFile ./input.vcf -genomeBuild GRCh38 -outputFile ./data/output.annotsv.annotated.tsv
```
The annotated output file (./data/output.annotsv.annotated.tsv) should be placed in the data folder (/path_of_your_DeepSVP_data_repository/).

- Run the command `deepsvp --help` to display help and parameters:
```
Usage: main.py [OPTIONS]
```
Usage: deepsvp [OPTIONS]

DeepSVP: A phenotype-based tool to prioritize caustive CNV using WGS data
and Phenotype/Gene Functional Similarity
DeepSVP: A phenotype-based tool to prioritize caustive CNV using WGS data
and Phenotype/Gene Functional Similarity

Options:
Options:
-d, --data-root TEXT Data root folder [required]
-i, --in-file TEXT Annotated Input file [required]
-p, --hpo TEXT List of phenotype ids separated by commas
@@ -55,13 +95,22 @@ Description: # DeepSVP
-ag, --aggregation TEXT Aggregation method for the genes within CNV (max
or mean) default=max
-o, --outfile TEXT Output result file
--help Show this message and exit.

```
--help Show this message and exit.
```
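
The `-ag` option above chooses how the scores of the genes inside one CNV are combined (max or mean). The sketch below shows only that arithmetic; the function name and the plain list of per-gene scores are assumptions, not DeepSVP's internal API.

```python
from statistics import mean


def aggregate_gene_scores(gene_scores, method="max"):
    """Combine per-gene scores for a single CNV into one value.

    `gene_scores` is a hypothetical list of floats, one per gene
    overlapping the CNV. `method` mirrors the two documented choices.
    """
    if not gene_scores:
        raise ValueError("a CNV needs at least one gene score")
    if method == "max":
        return max(gene_scores)
    if method == "mean":
        return mean(gene_scores)
    raise ValueError(f"unknown aggregation method: {method!r}")


print(aggregate_gene_scores([0.25, 0.75], method="max"))
print(aggregate_gene_scores([0.25, 0.75], method="mean"))
```

With `max`, a single high-scoring gene is enough to rank a CNV highly. With `mean`, large CNVs spanning many low-scoring genes are pulled down.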

### Example:
- Run the example (with your own HPO terms):
```
deepsvp -d data/ -i output.annotsv.annotated.tsv -p HP:0003701,HP:0001324,HP:0010628,HP:0003388,HP:0000774,HP:0002093,HP:0000508,HP:0000218 -m cl -maf 0.01 -ag max -o example_output.txt
```
Or run the example with the deepsvp-py38-pip2031 Conda Environment:
```
conda activate deepsvp-py38-pip2031
deepsvp -d data/ -i $your_annotsv_output.annotated.tsv -p HP:0003701,HP:0001324,HP:0010628,HP:0003388,HP:0000774,HP:0002093,HP:0000508,HP:0000218 -m cl -maf 0.01 -ag max -o example_output.txt
conda deactivate
```
Or by using [cwl-runner](https://github.com/common-workflow-language/cwltool), modify the input file in the input example yaml [deepsvp.yaml](https://github.com/bio-ontology-research-group/DeepSVP/blob/master/deepsvp.yaml) file and then run:

deepsvp -d data/ -i example_annotsv.tsv -p HP:0003701,HP:0001324,HP:0010628,HP:0003388,HP:0000774,HP:0002093,HP:0000508,HP:0000218 -m cl -maf 0.01 -ag max -o example_output.txt
cwl-runner deepsvp.cwl deepsvp.yaml

```
|======== | 25% Reading the input phenotypes...
@@ -70,6 +119,8 @@ Description: # DeepSVP
|================================| 100% DONE! You can find the prediction results in the output file: example_output.txt
```
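
The `-p` argument in the example commands is a comma-separated list of HPO identifiers, each of the form `HP:` followed by seven digits. Before launching a long run it can help to check that list. The helper below is a hypothetical pre-flight check, not part of the `deepsvp` package.

```python
import re

# HPO identifiers are "HP:" followed by exactly seven digits.
HPO_ID = re.compile(r"HP:\d{7}")


def parse_hpo_list(arg):
    """Split a comma-separated -p value and reject malformed identifiers."""
    ids = [token.strip() for token in arg.split(",") if token.strip()]
    bad = [token for token in ids if not HPO_ID.fullmatch(token)]
    if bad:
        raise ValueError(f"malformed HPO identifiers: {bad}")
    return ids


print(parse_hpo_list("HP:0003701,HP:0001324,HP:0010628"))
```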



#### Output:
The script will output a ranking and a score for each candidate causative CNV.

Binary file added dist/deepsvp-1.0.3-py3-none-any.whl
Binary file not shown.
4 changes: 2 additions & 2 deletions setup.py
@@ -27,13 +27,13 @@

setup(
name="deepsvp",
version="1.0.2",
version="1.0.3",
description="DeepSVP: Integration of Genomics and Phenotypes forStructural Variant Prioritization using Deep Learning",
long_description=open(README).read(),
long_description_content_type="text/markdown",
author="Azza Althagafi",
author_email="[email protected]",
download_url="https://github.com/bio-ontology-research-group/deepsvp/archive/v1.0.2.tar.gz",
download_url="https://github.com/bio-ontology-research-group/deepsvp/archive/v1.0.3.tar.gz",
license="Apache 2.0",
packages=["deepsvp",],
package_data={"deepsvp": [],},
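
Before this commit, the generated PKG-INFO paired `Version: 1.0.0` with a `v1.0.2` download URL. A small check along the following lines could catch that kind of drift at release time. The function name and the assumed `/archive/v<version>.tar.gz` URL shape are illustrative and are not part of this repository.

```python
import re


def tag_matches_version(version, download_url):
    """Return True when the archive tag in the URL equals the version."""
    match = re.search(r"/archive/v([^/]+)\.tar\.gz$", download_url)
    return match is not None and match.group(1) == version


url = "https://github.com/bio-ontology-research-group/deepsvp/archive/v1.0.3.tar.gz"
print(tag_matches_version("1.0.3", url))
```

Such a check could run in a release script or test so that `version` and `download_url` are bumped together.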
