- Your workload is too big to run on your laptop
- You don't want to wait for compute time on your organization's shared, on-premise compute cluster
- You want to use Google Cloud Platform services to run large (or huge) genomic analysis jobs
- The ability to perform analysis (or compute) on files (and other types of data) at dynamic scale
- Preparation for running analysis on public cloud services hosted by GCP
-
Use the best fit GCP Account type:
- Use GCP Free Tier - new users get $300 USD in GCP credits. Free Tier accounts run with minimal service limits set by Google (e.g. the maximum number of CPUs for VMs) - link to free tier
--OR--
- Use Your Organization's GCP Account - service limits set by your company & Google
-
Understand your project scope:
- the size & complexity of your analysis
- your project budget / timeline
-
Determine the best location (GCP data center):
- GCP has regional data centers (regions), which are further divided into zones within each physical data center location
- global GCP data center locations shown below
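To make the region/zone relationship concrete, here is a minimal shell sketch. It assumes the usual Compute Engine naming pattern, where a zone name is its region name plus a letter suffix (for example, `us-east1-b` sits in `us-east1`). The helper names `zone_to_region` and `list_locations` are hypothetical, and `list_locations` only works with an installed, authenticated `gcloud`.

```shell
#!/usr/bin/env bash
# Derive the region from a zone name by dropping the final "-<letter>" suffix.
# Assumption: zone names follow the "<region>-<letter>" pattern (e.g. us-east1-b).
zone_to_region() {
  printf '%s\n' "${1%-*}"
}

# List the regions and zones visible to your account.
# Requires the Google Cloud SDK; defined here but not called automatically.
list_locations() {
  gcloud compute regions list
  gcloud compute zones list
}

zone_to_region "us-east1-b"   # prints: us-east1
```

Call `list_locations` once `gcloud` is set up to see which regions and zones your account can use.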
NOTE: Many bioinformatics workflow tools, libraries & solutions can run on top of core GCP services. Examples include Terra.bio (formerly FireCloud), Cromwell, Nextflow.io and many others.
- REQUEST an account
- a) USE a GCP account from your organization --OR--
- b) SETUP a GCP new (FREE Tier) account
- CREATE a GCP PROJECT
- USE each GCP Project as a container for each of your research projects
- CREATE separate GCP projects, as a best practice (this allows you to more easily manage security & service costs by grant)
- ADD GCP service instances to your GCP Project
- ADD services by data center location and GCP project name, for example...
- ADD a Virtual Machine instance & a Cloud Storage bucket which are located...
- in the Google datacenter in us-east --AND--
- in your GCP Project named my-research-project
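The project, VM, and bucket steps above can be sketched with the `gcloud` CLI. This is a dry-run sketch under stated assumptions, not a definitive setup: the region `us-east1`, the zone suffix `-b`, the VM name `analysis-vm`, and the bucket name are all hypothetical placeholders, and `gcloud storage buckets create` assumes a reasonably recent Google Cloud SDK. By default the script only prints the commands it would run.

```shell
#!/usr/bin/env bash
# Hypothetical names for illustration; replace them with your own.
PROJECT_ID="my-research-project"
REGION="us-east1"                  # assumed region (the steps above only say "us-east")
ZONE="${REGION}-b"                 # assumed zone inside that region
BUCKET="gs://${PROJECT_ID}-data"   # bucket names must be globally unique
VM_NAME="analysis-vm"

# With DRY_RUN=1 (the default) commands are printed instead of executed.
DRY_RUN="${DRY_RUN:-1}"
run() {
  if [ "$DRY_RUN" = "1" ]; then
    printf '%s\n' "$*"
  else
    "$@"
  fi
}

# Create the project, then one VM and one bucket in the same project and location.
create_resources() {
  run gcloud projects create "$PROJECT_ID"
  run gcloud compute instances create "$VM_NAME" --project="$PROJECT_ID" --zone="$ZONE"
  run gcloud storage buckets create "$BUCKET" --project="$PROJECT_ID" --location="$REGION"
}

create_resources
```

Review the printed commands, then rerun with `DRY_RUN=0` to execute them. In practice the project also needs a linked billing account and the Compute Engine API enabled before the VM can be created.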
cd ~
./google-cloud-sdk/install.sh
echo "export GOOGLE_APPLICATION_CREDENTIALS=~/stimson-d82663d96ea4.json" >> ~/.bash_profile
source ~/.bash_profile
./google-cloud-sdk/bin/gcloud components update
gcloud init
gcloud auth list
gcloud config list
open "https://cloud.google.com/docs/authentication/getting-started"
open "https://flaviocopes.com/google-api-authentication/#create-a-new-google-api-project"
open "https://console.cloud.google.com/freetrial/signup/billing/US?project=ds-translation-service"
open "https://console.cloud.google.com/iam-admin/settings?project=ds-custom-search-service"
open "https://console.developers.google.com/apis/library/customsearch.googleapis.com?organizationId=0&project=ds-custom-search-service"
open "https://cse.google.com/cse/all"
*.google.com
open "https://cse.google.com/cse/setup/basic?cx=001854399042890182666:4g4vbypdey4"
# get API key:
open "https://developers.google.com/custom-search/v1/introduction"
# get search engine ID:
open "https://cse.google.com/cse/setup/basic?cx=001854399042890182666%3A4g4vbypdey4"
open "https://console.cloud.google.com/iam-admin/settings?project=ds-translation-service"
open "https://console.developers.google.com/apis/api/translate.googleapis.com/overview?project=ds-translation-service"
open "https://console.developers.google.com/apis/credentials?showWizardSurvey=true&project=stimson&supportedpurview=project&pli=1"
For example: ~/google-translator-########.json
echo 'export GOOGLE_APPLICATION_CREDENTIALS="$HOME/google-translator-########.json"' >> ~/.bash_profile
source ~/.bash_profile
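After sourcing the profile, it can help to confirm that the credentials variable actually points at a readable key file before calling any Google API. The following is a minimal sketch; the function name `check_credentials` is hypothetical, and it only checks that the file exists and is readable, not that the key is valid.

```shell
#!/usr/bin/env bash
# Return 0 if GOOGLE_APPLICATION_CREDENTIALS names a readable file, 1 otherwise.
check_credentials() {
  local key_file="${GOOGLE_APPLICATION_CREDENTIALS:-}"
  if [ -z "$key_file" ]; then
    echo "GOOGLE_APPLICATION_CREDENTIALS is not set" >&2
    return 1
  fi
  if [ ! -r "$key_file" ]; then
    echo "Key file not readable: $key_file" >&2
    return 1
  fi
  echo "Using service account key: $key_file"
}

check_credentials || echo "Fix the export line in ~/.bash_profile, then run: source ~/.bash_profile"
```

If the check fails, the most common causes are a missing `export` keyword or a path that does not match the downloaded JSON key file.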
- LOGIN to GCP console or Web UI
- VIEW your account login (upper right of console)
- VIEW your project name, shown as 'gcp-for-bioinformatics' below
- GCP FREE Tier accounts have built-in service usage limits - link
- GCP service costs are billed to YOU for FREE Tier (after you've spent your $300 credit)
- GCP service costs are billed to YOUR COMPANY (or research group) for Organizational Accounts
- SET UP a GCP Budget to get notified when GCP services exceed a service cost limit you set (this is useful for testing accounts) - link
- REVIEW GCP Billing Accounts & Resources Hierarchy (shown below)
- A billing account can be linked to one or more GCP projects and the billing account specifies how you pay (credit card, invoice...) for GCP services
- A billing account is linked to a payment profile (individual or corporate)
- 📘 Link to Enterprise Onboarding Checklist
- 📘 Link to best practices for Enterprise Organizations
- 📘 Link to how to set up a budget alert
- 📘 Link to best practices for optimizing cloud costs
- 📘 Link to GCP Billing concepts
- 📘 Link to GCP Service Pricing Calculator
- 📘 Link to Article on GCP Billing Accounts
- 📘 Link to Tips for controlling costs article written by team at The Broad
- 📘 Link to Understanding costs by analysis type - using Terra for GCP for work at The Broad
- 📺 Watch understanding Terra (GCP service) costs - 27 minute video from the Broad
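The budget notification mentioned in the billing notes above can also be set up from the command line. The sketch below only prints a candidate `gcloud billing budgets create` command so it can be reviewed first. The billing account ID is a placeholder, the display name `research-budget` and the 50% / 90% alert thresholds are assumptions chosen for illustration, and the flags should be confirmed against `gcloud billing budgets create --help` for your SDK version.

```shell
#!/usr/bin/env bash
# Placeholder values; the billing account ID below is not a real one.
BILLING_ACCOUNT="${BILLING_ACCOUNT:-000000-000000-000000}"
BUDGET_AMOUNT="${BUDGET_AMOUNT:-300USD}"   # mirrors the $300 Free Tier credit

# Print (do not run) a budget-creation command that alerts at 50% and 90% of the amount.
budget_command() {
  printf '%s\n' "gcloud billing budgets create \
--billing-account=${BILLING_ACCOUNT} \
--display-name=research-budget \
--budget-amount=${BUDGET_AMOUNT} \
--threshold-rule=percent=0.5 \
--threshold-rule=percent=0.9"
}

budget_command
```

Substitute your own billing account ID (shown by `gcloud billing accounts list`) before running the printed command.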