Install and initialize Google Cloud CLI
- How to install the Google Cloud CLI: instruction
- How to initialize the Google Cloud CLI: here
- Use a different Google Cloud account (not the AoU email; it should be the UCSD email) to log in to Google Cloud, then create a project or select an existing project
- Switch between multiple accounts and projects. The current project information is `PROJECT_ID:`
* For multiple projects or accounts, create a separate configuration for each project/account; for details, run `gcloud topic configurations`.
* To create a new configuration, use `gcloud init` (tested, don't know how to change the config name) or `gcloud config configurations create <my-config>`.
* To activate a configuration, use `gcloud config configurations activate <my-config>`; to display the path of the active configuration, run `gcloud info --format="get(config.paths.active_config_path)"`.
* To view the current active configuration, use `gcloud config list`; to view all configurations, use `gcloud config configurations list`.
- Other commonly used commands:
* Properties of the active configuration can be changed with `gcloud config set`.
* List available accounts: `gcloud auth list`
* Switch the active account: `gcloud config set account <account-email>`
* Show the current project: `gcloud config list project`; list available projects: `gcloud projects list`
* Switch the active project: `gcloud config set project <project-id>`
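
The configuration commands above can be scripted. The following is a minimal sketch, not a definitive setup: the configuration name, account, and project ID are hypothetical placeholders, and the helper names `configuration_commands` and `run_all` are invented for illustration. It relies on `gcloud config configurations create` activating the new configuration by default, so the later `gcloud config set` calls apply to it.

```python
import shlex
import subprocess


def configuration_commands(config_name, account, project_id):
    """Return the gcloud calls that create a named configuration and fill it in."""
    return [
        # Creating a configuration also activates it by default.
        ["gcloud", "config", "configurations", "create", config_name],
        ["gcloud", "config", "set", "account", account],
        ["gcloud", "config", "set", "project", project_id],
    ]


def run_all(commands, dry_run=True):
    """Print each command; execute it only when dry_run is False."""
    for argv in commands:
        print(shlex.join(argv))
        if not dry_run:
            subprocess.run(argv, check=True)


# Hypothetical values for illustration only.
commands = configuration_commands("ucsd-config", "user@ucsd.edu", "my-project")
run_all(commands)  # dry run: prints the commands without calling gcloud
```

Calling `run_all(commands, dry_run=False)` would execute the commands, which requires the Google Cloud CLI to be installed and on the `PATH`.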
Set up GCR (based on this tutorial)
- Enable the API (may only be needed the first time):
  - Use the Google Cloud Console
  - Use the `gcloud` command: `gcloud services enable containerregistry.googleapis.com`
- To disable API: go to this link, select the project, click Manage, then click Disable API.
- Commands to set up GCR (the first `docker tag` line is the general form, the second is a concrete example):

      gcloud auth login
      gcloud config set project PROJECT_ID
      gcloud auth configure-docker
      docker tag <image-name> <gcr-path>
      docker tag yli091230/hipstr:amd64 gcr.io/ucsd-medicine-cast/hipstr:amd64
      docker push gcr.io/ucsd-medicine-cast/hipstr:amd64
- Role and permissions:
- Using a service account is recommended.
- The first push requires the Storage Admin role, because it creates a storage bucket for the registry.
- After the initial image push:
- Stoc
- Cohort builder --> Concept set selector --> Dataset builder --> Jupyter notebook
- Cohort builder:
- Create review set
- Dataset Builder:
- Cohorts: Participants; Concept set (for each sample): Rows; Values: Columns
- Jupyter notebook build directly
- Docker images must be stored in GCR.
- Example of pushing the Docker image `busybox` to `gcr` (`my-project` is the project ID):

      docker pull busybox
      docker tag busybox gcr.io/my-project/busybox
      docker push gcr.io/my-project/busybox
- The user needs permission to pull and push images.
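
The tag-and-push steps in this section follow one pattern: build the `gcr.io/<project-id>/<image>` path, tag the local image with it, then push. A minimal sketch of that pattern follows. The helper names `gcr_image_path` and `tag_and_push_commands` are hypothetical, and the sketch only prints the `docker` commands rather than running them.

```python
import shlex


def gcr_image_path(project_id, image, tag="latest", host="gcr.io"):
    """Build the registry path that docker tag/push expect for GCR."""
    return f"{host}/{project_id}/{image}:{tag}"


def tag_and_push_commands(local_image, remote_image):
    """Return the docker commands that tag a local image and push it."""
    return [
        ["docker", "tag", local_image, remote_image],
        ["docker", "push", remote_image],
    ]


# Values taken from the example in the notes above.
remote = gcr_image_path("ucsd-medicine-cast", "hipstr", tag="amd64")
for argv in tag_and_push_commands("yli091230/hipstr:amd64", remote):
    print(shlex.join(argv))
```

The push itself still requires `gcloud auth configure-docker` to have been run and an account with permission to write to the registry.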
Check dsub
- Google Cloud bucket and local (workspace bucket)
- Need to check the Google Cloud bucket and how it works on the AoU platform
- Can we output all of the files to the fixed/permanent bucket?
- Do we need to leave the notebook running during a WDL run?

NOTE: for the All of Us project, the `WORKSPACE_BUCKET` cannot be accessed from a local terminal.
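
Since the bucket is only reachable from inside the workspace, paths under it are usually built there from the `WORKSPACE_BUCKET` environment variable. The sketch below assumes that variable is set in the workspace notebook environment. The helper name `workspace_path` and the bucket name in the example are hypothetical.

```python
import os


def workspace_path(*parts, env=os.environ):
    """Join path parts under the bucket named by WORKSPACE_BUCKET."""
    bucket = env.get("WORKSPACE_BUCKET")
    if not bucket:
        raise RuntimeError(
            "WORKSPACE_BUCKET is not set; run this inside the workspace notebook"
        )
    return "/".join([bucket.rstrip("/"), *parts])


# Hypothetical bucket name, passed explicitly so the sketch also runs
# outside the workspace.
example_env = {"WORKSPACE_BUCKET": "gs://example-bucket"}
print(workspace_path("cromwell", "outputs", env=example_env))
```

Inside the notebook, the `env` argument can be omitted so that the real environment is used.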
To transfer multiple large files to the instance, enable parallel composite uploads in the Cromwell configuration file. Example file:

    backend {
      ...
      providers {
        ...
        PapiV2 {
          actor-factory = "cromwell.backend.google.pipelines.v2beta.PipelinesApiLifecycleActorFactory"
          config {
            ...
            genomics {
              ...
              parallel-composite-upload-threshold = 150M
              ...
            }
            ...
          }
        }
      }
    }
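
To get a feel for which files a threshold such as `150M` would affect, the value can be converted to bytes and compared with file sizes. This is a rough sketch under two assumptions that are not confirmed by these notes: the suffix is treated as binary (`M` = 2^20 bytes), and a file qualifies when its size is strictly greater than the threshold. The helper names are hypothetical.

```python
# Assumed binary (1024-based) size suffixes.
_UNITS = {"K": 2**10, "M": 2**20, "G": 2**30, "T": 2**40}


def parse_size(text):
    """Convert a size such as '150M' to bytes; plain integers pass through."""
    text = text.strip().upper()
    if text and text[-1] in _UNITS:
        return int(text[:-1]) * _UNITS[text[-1]]
    return int(text)


def exceeds_threshold(size_bytes, threshold="150M"):
    """True when size_bytes is above the threshold (boundary handling is an assumption)."""
    return size_bytes > parse_size(threshold)


print(parse_size("150M"))                 # threshold in bytes
print(exceeds_threshold(200 * 2**20))     # a 200 MiB file
print(exceeds_threshold(10 * 2**20))      # a 10 MiB file
```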
List of Google pipelines API workflow options.
- Run with a preemptible instance. The job is attempted once on a preemptible instance; if it is preempted, it is rerun on an on-demand instance.

      import json

      # output_bucket must already be defined, e.g. a gs:// path
      options_filename = "options.json"
      options = {
          "jes_gcs_root": output_bucket,
          "default_runtime_attributes": {"preemptible": "1"},
      }
      with open(options_filename, "w") as fp:
          json.dump(options, fp, indent=2)
- Output directory

      {
        "final_workflow_outputs_dir": "/Users/michael_scott/cromwell/outputs",
        "use_relative_output_paths": true,
        "final_workflow_log_dir": "/Users/michael_scott/cromwell/wf_logs",
        "final_call_logs_dir": "/Users/michael_scott/cromwell/call_logs"
      }
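
The two kinds of options above (preemptible attempts and output directories) can live in a single options file. The following sketch combines them, assuming both sets of keys may be given together. The helper name `build_workflow_options` and the `gs://example-bucket` paths are hypothetical placeholders, and the file is written to a temporary directory so the example has no side effects.

```python
import json
import tempfile
from pathlib import Path


def build_workflow_options(jes_gcs_root, preemptible_attempts=1,
                           outputs_dir=None, log_dir=None, call_logs_dir=None):
    """Assemble a workflow options dict from the keys shown in these notes."""
    options = {
        "jes_gcs_root": jes_gcs_root,
        # Kept as a string, matching the original options file.
        "default_runtime_attributes": {"preemptible": str(preemptible_attempts)},
    }
    if outputs_dir:
        options["final_workflow_outputs_dir"] = outputs_dir
        options["use_relative_output_paths"] = True
    if log_dir:
        options["final_workflow_log_dir"] = log_dir
    if call_logs_dir:
        options["final_call_logs_dir"] = call_logs_dir
    return options


# Hypothetical bucket paths for illustration only.
options = build_workflow_options(
    "gs://example-bucket/cromwell-exec",
    outputs_dir="gs://example-bucket/outputs",
)
options_path = Path(tempfile.mkdtemp()) / "options.json"
options_path.write_text(json.dumps(options, indent=2))
print(options_path.read_text())
```

Building the dict first and serializing it with `json.dumps` avoids the escaped braces needed when the JSON is written as an f-string.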
- How to ssh into VM locally
- Transfer files in parallel?
- Using Parallel Composite Uploads
- How to use `enable_fuse` in Cromwell for Google Cloud
- How to customize configuration files