WebSTR web browser (front-end)

This repository contains code and instructions for WebSTR, a web browser for human genome-wide variation in Short Tandem Repeats (STRs). Our goal is to make large STR genotype datasets usable by the broader genomics community by facilitating open access to these data.

WebSTR is the result of a collaboration between two scientific groups: Maria Anisimova’s Lab and Melissa Gymrek’s Lab.

Source code for the WebSTR-API can be found here: https://github.com/acg-team/webSTR-API

Contributing

If you would like to make changes to WebSTR, you must:

  1. Create a new branch off of main and make your edits
  2. Submit a pull request to merge your new branch to main
  3. The pull request must be reviewed and approved by a WebSTR developer prior to merging with main.

Instructions for setting up WebSTR for local development (without docker)

  1. Set up python3 and virtualenv on your machine: For Mac, follow instructions here.

  2. Create a new virtual env with python3 and install all the requirements with the following command: pip install -r requirements.txt

  3. Copy data files to data directory

WebSTR looks for certain files at BASEPATH. You will need the following:

  • $BASEPATH/hg19/hg19.fa (hg19 reference genome)
  • $BASEPATH/hg38/hg38.fa (hg38 reference genome)
  • $BASEPATH/dbSTR.db (legacy hg19 version of the database, can be obtained here)
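
Before launching, it can help to confirm that the files listed above are in place. The following is a minimal sketch and not part of WebSTR itself: `REQUIRED_FILES` and `missing_data_files` are hypothetical names introduced here, and the file list simply mirrors the bullets above.

```python
from pathlib import Path

# Paths relative to BASEPATH, copied from the list above.
# (Hypothetical constant for illustration; WebSTR does not define it.)
REQUIRED_FILES = ("hg19/hg19.fa", "hg38/hg38.fa", "dbSTR.db")


def missing_data_files(basepath):
    """Return the required relative paths that are not regular files under basepath."""
    base = Path(basepath)
    return [rel for rel in REQUIRED_FILES if not (base / rel).is_file()]
```

Calling `missing_data_files(os.environ["BASEPATH"])` (after `import os`) returns an empty list when all three files are present, and otherwise the relative paths that still need to be copied.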

The hg38 database is managed by the backend and API. This is described below, along with instructions on how to test WebSTR with a non-production version of the backend database.

  4. To run for testing and development:
git clone https://github.com/gymrek-lab/webstr
cd webstr
# optionally, checkout a specific branch to test
export BASEPATH=/path/to/data # replace with the full path to your data directory
export FLASK_DEBUG=1 # run in debug mode
python ./WebSTR/WebSTR.py --host 0.0.0.0 --port <port>

You can then access the application at localhost:<port> in your web browser, using the port you passed to --port.

Instructions for setting up WebSTR for local development (with docker)

  1. Clone the WebSTR repository
git clone https://github.com/gymrek-lab/webstr
cd webstr
# optionally, checkout a specific branch to test
  2. Build the debug version of the Docker image (requires Docker to be installed)
docker build --target debug -t webstr:debug .
  3. Run the Docker container

You will need to mount files to the container and set the $BASEPATH variable:

docker run --mount type=bind,src=${BASEPATH},dst=/data --env BASEPATH=/data  -it --rm -t webstr:debug

You can then access the application at localhost:5000 in your web browser.

Running WebSTR in production mode with docker

For production mode, we use gunicorn + nginx.

  1. Build the Docker image in production mode
docker build -t webstr .
  2. Run the Docker container
docker run --mount type=bind,src=${BASEPATH},dst=/data --env BASEPATH=/data  -it --rm -t webstr

WebSTR Backend - database and API

WebSTR accesses its database through an API. By default, it uses http://webstr-api.ucsd.edu. To point WebSTR at a different API, for example when testing a non-production version of the database, set the WEBSTR_API_URL environment variable.
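
The override behaviour described above can be pictured with a short sketch. This illustrates the idea only and is not WebSTR's actual code: `DEFAULT_API_URL` and `resolve_api_url` are hypothetical names, and treating an empty value as unset and stripping a trailing slash are assumptions made here.

```python
import os

# Default API location as stated in this README.
DEFAULT_API_URL = "http://webstr-api.ucsd.edu"


def resolve_api_url(environ=None):
    """Return WEBSTR_API_URL if it is set and non-empty, else the default.

    `environ` defaults to os.environ; passing a plain dict makes this easy to test.
    """
    env = os.environ if environ is None else environ
    url = env.get("WEBSTR_API_URL") or DEFAULT_API_URL
    # Assumption: drop a trailing slash so endpoint paths can be appended uniformly.
    return url.rstrip("/")
```

For example, `resolve_api_url({})` falls back to the default, while `resolve_api_url({"WEBSTR_API_URL": "http://0.0.0.0:5000"})` returns the custom location.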

The code for the WebSTR-API backend is maintained at https://github.com/acg-team/webSTR-API. If you have your own version of the database and want to test the backend locally, run something like the following from inside the webSTR-API repo:

# Set the path to your local database
export DATABASE_URL="postgres://webstr:webstr@localhost:5432/strdb"

# Launch the API
uvicorn strAPI.main:app --host=0.0.0.0 --port=${PORT:-5000} --reload

Now the API should be available at localhost:5000. Before running WebSTR you can set:

export WEBSTR_API_URL=http://0.0.0.0:5000
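
The DATABASE_URL set earlier in this section follows the common scheme://user:password@host:port/database layout. If you adapt it to your own database, a small sketch like the one below can confirm that each component is what you expect. `describe_database_url` is a hypothetical helper written for this README, not part of WebSTR or the webSTR-API code.

```python
from urllib.parse import urlsplit


def describe_database_url(url):
    """Split a database URL into its named components using the standard library."""
    parts = urlsplit(url)
    return {
        "scheme": parts.scheme,
        "user": parts.username,
        "password": parts.password,
        "host": parts.hostname,
        "port": parts.port,  # an int, or None if the URL has no port
        "database": parts.path.lstrip("/"),
    }
```

Applied to the example value above, this reports user `webstr`, host `localhost`, port `5432`, and database `strdb`.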