From b0ee034c8d5214178f2dc259c323b2390f1c77f2 Mon Sep 17 00:00:00 2001
From: rxu17 <26471741+rxu17@users.noreply.github.com>
Date: Mon, 11 Dec 2023 15:46:39 -0800
Subject: [PATCH 1/4] initial doc update

---
 CONTRIBUTING.md | 17 ++++++++++++++++-
 README.md       | 28 +++++++++++++++++++++++-----
 2 files changed, 39 insertions(+), 6 deletions(-)

diff --git a/CONTRIBUTING.md b/CONTRIBUTING.md
index 96475f19..a08f9740 100644
--- a/CONTRIBUTING.md
+++ b/CONTRIBUTING.md
@@ -94,6 +94,10 @@ This package uses [semantic versioning](https://semver.org/) for releasing new v
 
 ### Testing
 
+#### Running test pipeline
+
+Make sure to run each of the [pipeline steps here](README.md#developing-locally) on the test pipeline and verify that your pipeline runs as expected. This is __not__ automatically run by Github Actions and have to be manually run.
+
 #### Running tests
 
 This package uses [`pytest`](https://pytest.org/en/latest/) to run tests. The test code is located in the [tests](./tests) subdirectory.
@@ -134,6 +138,17 @@ Follow gitflow best practices as linked above.
 1. Merge `main` back into `develop`
 1. Push `develop`
 
-### DockerHub
+### Modifying Docker
+
+Follow this section when modifying the [Dockerfile](https://github.com/Sage-Bionetworks/Genie/blob/develop/Dockerfile)
+
+1. Make sure you have your synapse config setup in your working directory
+1. ```docker build -f Dockerfile -t genie-docker .```
+1. ```docker run --rm -it -e DISABLE_SSL=true -p 4040:4040 -p 18080:18080 -v ~/.synapseConfig:/root/.synapseConfig genie-docker```
+1. Run [test code](README.md#developing-locally) relevant to the dockerfile changes to make sure changes are present and working
+1. Once changes are tested, follow [genie contributing guidelines](#developing) for adding it to the repo
+1. Once deployed to main, make sure docker image was successfully deployed remotely (our docker image gets automatically deployed) [here](https://hub.docker.com/repository/docker/sagebionetworks/genie/builds)
+
+#### Dockerhub
 
 This repository does not use github actions to push docker images. By adding the `sagebiodockerhub` github user as an Admin to this GitHub repository, we can configure an automated build in DockerHub. You can view the builds [here](https://hub.docker.com/repository/docker/sagebionetworks/genie/builds). To get admin access to the DockerHub repository, ask Sage IT to be added to the `genieadmin` DockerHub team.
diff --git a/README.md b/README.md
index 736da68a..bd285e4e 100644
--- a/README.md
+++ b/README.md
@@ -42,7 +42,6 @@ genie validate data_clinical_supp_SAGE.txt SAGE
 ```
 
 
-
 ## Contributing
 
 Please view [contributing guide](CONTRIBUTING.md) to learn how to contribute to the GENIE package.
@@ -65,6 +64,18 @@ These are instructions on how you would develop and test the pipeline locally.
     pip install -r requirements-dev.txt
     ```
 
+If you are having trouble with the above, try installing via `pipenv`
+
+1. Specify a python version that is supported by this repo:
+
+```pipenv --python ```
+
+1. [pipenv install from requirements file](https://docs.pipenv.org/en/latest/advanced.html#importing-from-requirements-txt)
+
+1. Activate your `pipenv`:
+
+```pipenv shell```
+
 1. Configure the Synapse client to authenticate to Synapse.
     1. Create a Synapse [Personal Access token (PAT)](https://help.synapse.org/docs/Managing-Your-Account.2055405596.html#ManagingYourAccount-PersonalAccessTokens).
     1. Add a `~/.synapseConfig` file
@@ -83,33 +94,40 @@ These are instructions on how you would develop and test the pipeline locally.
 
 1. Run the different pipelines on the test project. The `--project_id syn7208886` points to the test project.
 
-    1. Validate all the files.
+    1. Validate all the files **excluding vcf files**:
 
         ```
         python bin/input_to_database.py main --project_id syn7208886 --onlyValidate
         ```
 
+    1. Validate **all** the files:
+
+        ```
+        python bin/input_to_database.py mutation --project_id syn7208886 --onlyValidate --genie_annotation_pkg ../annotation-tools
+        ```
+
     1. Process all the files aside from the mutation (maf, vcf) files. The mutation processing was split because it takes at least 2 days to process all the production mutation data. Ideally, there is a parameter to exclude or include file types to process/validate, but that is not implemented.
 
         ```
         python bin/input_to_database.py main --project_id syn7208886 --deleteOld
         ```
 
-    1. Process the mutation data. Be sure to clone this repo: https://github.com/Sage-Bionetworks/annotation-tools. This repo houses the code that re-annotates the mutation data with genome nexus. The `--createNewMafDatabase` will create a new mutation tables in the test project. This flag is necessary for production data for two main reasons:
+    1. Process the mutation data. Be sure to clone this repo: https://github.com/Sage-Bionetworks/annotation-tools and `git checkout` the version of the repo pinned to the [Dockerfile](https://github.com/Sage-Bionetworks/Genie/blob/main/Dockerfile). This repo houses the code that re-annotates the mutation data with genome nexus. The `--createNewMafDatabase` will create a new mutation tables in the test project. This flag is necessary for production data for two main reasons:
         * During processing of mutation data, the data is appended to the data, so without creating an empty table, there will be duplicated data uploaded.
         * By design, Synapse Tables were meant to be appended to. When a Synapse Tables is updated, it takes time to index the table and return results. This can cause problems for the pipeline when trying to query the mutation table. It is actually faster to create an entire new table than updating or deleting all rows and appending new rows when dealing with millions of rows.
+        * If you run this more than once on the same day, you'll run into an issue with overwriting the narrow maf table as it already exists. Be sure to rename the current narrow maf database under `Tables` in the test synapse project and try again.
 
         ```
         python bin/input_to_database.py mutation --project_id syn7208886 --deleteOld --genie_annotation_pkg ../annotation-tools --createNewMafDatabase
         ```
 
-    1. Create a consortium release. Be sure to add the `--test` parameter. Be sure to clone the cbioportal repo: https://github.com/cBioPortal/cbioportal
+    1. Create a consortium release. Be sure to add the `--test` parameter. Be sure to clone the cbioportal repo: https://github.com/cBioPortal/cbioportal and `git checkout` the version of the repo pinned to the [Dockerfile](https://github.com/Sage-Bionetworks/Genie/blob/main/Dockerfile)
 
         ```
         python bin/database_to_staging.py Jan-2017 ../cbioportal TEST --test
         ```
 
-    1. Create a public release. Be sure to add the `--test` parameter. Be sure to clone the cbioportal repo: https://github.com/cBioPortal/cbioportal
+    1. Create a public release. Be sure to add the `--test` parameter. Be sure to clone the cbioportal repo: https://github.com/cBioPortal/cbioportal and `git checkout` the version of the repo pinned to the [Dockerfile](https://github.com/Sage-Bionetworks/Genie/blob/main/Dockerfile)
 
         ```
         python bin/consortium_to_public.py Jan-2017 ../cbioportal TEST --test

From 4cb7cc1eaa1af2101991398e09bca5c54f5e5446 Mon Sep 17 00:00:00 2001
From: rxu17 <26471741+rxu17@users.noreply.github.com>
Date: Mon, 11 Dec 2023 15:49:27 -0800
Subject: [PATCH 2/4] adjust bullet pts

---
 README.md | 14 ++++++--------
 1 file changed, 6 insertions(+), 8 deletions(-)

diff --git a/README.md b/README.md
index bd285e4e..1002e492 100644
--- a/README.md
+++ b/README.md
@@ -64,17 +64,15 @@ These are instructions on how you would develop and test the pipeline locally.
     pip install -r requirements-dev.txt
     ```
 
-If you are having trouble with the above, try installing via `pipenv`
+    If you are having trouble with the above, try installing via `pipenv`
 
-1. Specify a python version that is supported by this repo:
+    1. Specify a python version that is supported by this repo:
+    ```pipenv --python ```
 
-```pipenv --python ```
+    1. [pipenv install from requirements file](https://docs.pipenv.org/en/latest/advanced.html#importing-from-requirements-txt)
 
-1. [pipenv install from requirements file](https://docs.pipenv.org/en/latest/advanced.html#importing-from-requirements-txt)
-
-1. Activate your `pipenv`:
-
-```pipenv shell```
+    1. Activate your `pipenv`:
+    ```pipenv shell```
 
 1. Configure the Synapse client to authenticate to Synapse.
     1. Create a Synapse [Personal Access token (PAT)](https://help.synapse.org/docs/Managing-Your-Account.2055405596.html#ManagingYourAccount-PersonalAccessTokens).

From 83aaff1f0fd484debab6a8288117f6c70a6bffd8 Mon Sep 17 00:00:00 2001
From: rxu17 <26471741+rxu17@users.noreply.github.com>
Date: Mon, 11 Dec 2023 15:52:04 -0800
Subject: [PATCH 3/4] standardize dockerfile url

---
 CONTRIBUTING.md | 2 +-
 1 file changed, 1 insertion(+), 1 deletion(-)

diff --git a/CONTRIBUTING.md b/CONTRIBUTING.md
index a08f9740..8be9b44d 100644
--- a/CONTRIBUTING.md
+++ b/CONTRIBUTING.md
@@ -140,7 +140,7 @@ Follow gitflow best practices as linked above.
 
 ### Modifying Docker
 
-Follow this section when modifying the [Dockerfile](https://github.com/Sage-Bionetworks/Genie/blob/develop/Dockerfile)
+Follow this section when modifying the [Dockerfile](https://github.com/Sage-Bionetworks/Genie/blob/main/Dockerfile):
 
 1. Make sure you have your synapse config setup in your working directory
 1. ```docker build -f Dockerfile -t genie-docker .```

From de207d65aa4e1220de114716fc02f5bec16cfae6 Mon Sep 17 00:00:00 2001
From: rxu17 <26471741+rxu17@users.noreply.github.com>
Date: Tue, 12 Dec 2023 10:01:09 -0800
Subject: [PATCH 4/4] adjust docker cmd with relevant params

---
 CONTRIBUTING.md | 6 +++---
 1 file changed, 3 insertions(+), 3 deletions(-)

diff --git a/CONTRIBUTING.md b/CONTRIBUTING.md
index 8be9b44d..2520bc40 100644
--- a/CONTRIBUTING.md
+++ b/CONTRIBUTING.md
@@ -142,9 +142,9 @@ Follow gitflow best practices as linked above.
 
 Follow this section when modifying the [Dockerfile](https://github.com/Sage-Bionetworks/Genie/blob/main/Dockerfile):
 
-1. Make sure you have your synapse config setup in your working directory
-1. ```docker build -f Dockerfile -t genie-docker .```
-1. ```docker run --rm -it -e DISABLE_SSL=true -p 4040:4040 -p 18080:18080 -v ~/.synapseConfig:/root/.synapseConfig genie-docker```
+1. Have your synapse authentication token handy
+1. ```docker build -f Dockerfile -t .```
+1. ```docker run --rm -it -e SYNAPSE_AUTH_TOKEN=$YOUR_SYNAPSE_TOKEN ```
 1. Run [test code](README.md#developing-locally) relevant to the dockerfile changes to make sure changes are present and working
 1. Once changes are tested, follow [genie contributing guidelines](#developing) for adding it to the repo
 1. Once deployed to main, make sure docker image was successfully deployed remotely (our docker image gets automatically deployed) [here](https://hub.docker.com/repository/docker/sagebionetworks/genie/builds)
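
Reviewer note: since the README changes in patch 1 say the test pipeline has to be run by hand, it may help to see the six README commands collected in run order. The following is only a minimal sketch, not part of the patch series. It assumes the commands are run from the repository root and that `annotation-tools` and `cbioportal` are cloned as sibling directories, as the README paths suggest. The helper name `pipeline_commands` and the module-level constants are hypothetical; the command lines themselves are copied from the README hunk.

```python
"""Dry-run helper listing the README's local test-pipeline commands in order."""
import shlex

# Values taken from the README hunk; the constant names are hypothetical.
TEST_PROJECT_ID = "syn7208886"
ANNOTATION_PKG = "../annotation-tools"  # assumed sibling clone
CBIOPORTAL_PATH = "../cbioportal"  # assumed sibling clone


def pipeline_commands(release="Jan-2017"):
    """Return each pipeline step as an argv list, in the README's order."""
    input_to_db = ["python", "bin/input_to_database.py"]
    project = ["--project_id", TEST_PROJECT_ID]
    annotation = ["--genie_annotation_pkg", ANNOTATION_PKG]
    return [
        # Validate all files excluding vcf files.
        input_to_db + ["main"] + project + ["--onlyValidate"],
        # Validate all files, mutation files included.
        input_to_db + ["mutation"] + project + ["--onlyValidate"] + annotation,
        # Process everything except the mutation (maf, vcf) files.
        input_to_db + ["main"] + project + ["--deleteOld"],
        # Process the mutation data into a fresh mutation table.
        input_to_db + ["mutation"] + project + ["--deleteOld"] + annotation
        + ["--createNewMafDatabase"],
        # Create a consortium release against the test project.
        ["python", "bin/database_to_staging.py", release, CBIOPORTAL_PATH,
         "TEST", "--test"],
        # Create a public release against the test project.
        ["python", "bin/consortium_to_public.py", release, CBIOPORTAL_PATH,
         "TEST", "--test"],
    ]


if __name__ == "__main__":
    # Print only; nothing is executed, so this is safe to run anywhere.
    for argv in pipeline_commands():
        print(shlex.join(argv))
```

To execute rather than print, each list could be passed to `subprocess.run(argv, check=True)`, so that a failing step stops the sequence before the release steps run.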