Commit

Render course
github-actions[bot] committed Jan 30, 2025
1 parent 71f5f27 commit f258085
Showing 23 changed files with 334 additions and 315 deletions.
24 changes: 12 additions & 12 deletions docs/01-quick_start.md
@@ -11,7 +11,7 @@ This data comes from [this BioProject](https://www.ncbi.nlm.nih.gov/bioproject/P

The SRA Data corresponding to this project is located [here](https://www.ncbi.nlm.nih.gov/Traces/study/?acc=SRP409181&o=acc_s%3Aa).

<img src="resources/images/01-quick_start_files/figure-html//1l0P0gFpsPkYG7blqJ_5JyYYlztJFZDD39CnIB4svrY8_g208b8f790dc_23_217.png" title="Microbiome diversity has many benefitial properties ranging soil and plant health." alt="Microbiome diversity has many benefitial properties ranging soil and plant health." width="100%" style="display: block; margin: auto;" />
<img src="resources/images/01-quick_start_files/figure-html//1l0P0gFpsPkYG7blqJ_5JyYYlztJFZDD39CnIB4svrY8_g208b8f790dc_23_217.png" alt="Microbiome diversity has many benefitial properties ranging soil and plant health." width="100%" style="display: block; margin: auto;" />

::: {.dictionary}
You might hear new terms for moving data around in the cloud. **Ingress** is when data comes to you, similar to downloading a file or receiving an email with an attachment. **Egress** is sending the data to another resource, similar to uploading or sending an attached file via email. There is no fee for ingressing data to AnVIL from SRA.
@@ -27,57 +27,57 @@ For this demo, we have given the cloned Workspace the name `SRA-data-on-AnVIL-ex

Navigate to the WORKFLOWS Tab and select the SRA_Fetch Workflow.

<img src="resources/images/01-quick_start_files/figure-html//1l0P0gFpsPkYG7blqJ_5JyYYlztJFZDD39CnIB4svrY8_g1f25a933000_0_0.png" title="Workflows tab with SRA_Fetch." alt="Workflows tab with SRA_Fetch." width="100%" style="display: block; margin: auto;" />
<img src="resources/images/01-quick_start_files/figure-html//1l0P0gFpsPkYG7blqJ_5JyYYlztJFZDD39CnIB4svrY8_g1f25a933000_0_0.png" alt="Workflows tab with SRA_Fetch." width="100%" style="display: block; margin: auto;" />

Select "Run workflow(s) with inputs defined by data table".

<img src="resources/images/01-quick_start_files/figure-html//1l0P0gFpsPkYG7blqJ_5JyYYlztJFZDD39CnIB4svrY8_g1f25a933000_0_10.png" title="'Run workflow(s) with inputs defined by data table' has been selected." alt="'Run workflow(s) with inputs defined by data table' has been selected." width="100%" style="display: block; margin: auto;" />
<img src="resources/images/01-quick_start_files/figure-html//1l0P0gFpsPkYG7blqJ_5JyYYlztJFZDD39CnIB4svrY8_g1f25a933000_0_10.png" alt="'Run workflow(s) with inputs defined by data table' has been selected." width="100%" style="display: block; margin: auto;" />

Set the "Select root entity type" to "sample" and click SELECT DATA.

<img src="resources/images/01-quick_start_files/figure-html//1l0P0gFpsPkYG7blqJ_5JyYYlztJFZDD39CnIB4svrY8_g208af248fb0_0_0.png" title="Step 1 and 2 for setting up the Workflow." alt="Step 1 and 2 for setting up the Workflow." width="100%" style="display: block; margin: auto;" />
<img src="resources/images/01-quick_start_files/figure-html//1l0P0gFpsPkYG7blqJ_5JyYYlztJFZDD39CnIB4svrY8_g208af248fb0_0_0.png" alt="Step 1 and 2 for setting up the Workflow." width="100%" style="display: block; margin: auto;" />

On the Select Data popup, select only the first sample, `SRR22375322`, and click OK.

<img src="resources/images/01-quick_start_files/figure-html//1l0P0gFpsPkYG7blqJ_5JyYYlztJFZDD39CnIB4svrY8_g208af248fb0_0_8.png" title="The first sample selected from the data table." alt="The first sample selected from the data table." width="100%" style="display: block; margin: auto;" />
<img src="resources/images/01-quick_start_files/figure-html//1l0P0gFpsPkYG7blqJ_5JyYYlztJFZDD39CnIB4svrY8_g208af248fb0_0_8.png" alt="The first sample selected from the data table." width="100%" style="display: block; margin: auto;" />

## Launch Workflow

Click on the space underneath "Attribute" and select `this.sample_id`.

<img src="resources/images/01-quick_start_files/figure-html//1l0P0gFpsPkYG7blqJ_5JyYYlztJFZDD39CnIB4svrY8_g208af248fb0_0_17.png" title="'this.sample_id' must be selected under the Workflow Attribute" alt="'this.sample_id' must be selected under the Workflow Attribute" width="100%" style="display: block; margin: auto;" />
<img src="resources/images/01-quick_start_files/figure-html//1l0P0gFpsPkYG7blqJ_5JyYYlztJFZDD39CnIB4svrY8_g208af248fb0_0_17.png" alt="'this.sample_id' must be selected under the Workflow Attribute" width="100%" style="display: block; margin: auto;" />

Click SAVE.

<img src="resources/images/01-quick_start_files/figure-html//1l0P0gFpsPkYG7blqJ_5JyYYlztJFZDD39CnIB4svrY8_g208af248fb0_0_26.png" title="The SAVE button is highlighted" alt="The SAVE button is highlighted" width="100%" style="display: block; margin: auto;" />
<img src="resources/images/01-quick_start_files/figure-html//1l0P0gFpsPkYG7blqJ_5JyYYlztJFZDD39CnIB4svrY8_g208af248fb0_0_26.png" alt="The SAVE button is highlighted" width="100%" style="display: block; margin: auto;" />

You are ready to launch the Workflow! Click RUN ANALYSIS.

<img src="resources/images/01-quick_start_files/figure-html//1l0P0gFpsPkYG7blqJ_5JyYYlztJFZDD39CnIB4svrY8_g208af248fb0_0_34.png" title="The RUN ANALYSIS button is highlighted" alt="The RUN ANALYSIS button is highlighted" width="100%" style="display: block; margin: auto;" />
<img src="resources/images/01-quick_start_files/figure-html//1l0P0gFpsPkYG7blqJ_5JyYYlztJFZDD39CnIB4svrY8_g208af248fb0_0_34.png" alt="The RUN ANALYSIS button is highlighted" width="100%" style="display: block; margin: auto;" />

Voilà! Your Workflow is running.

::: {.notice}
Because the Workflow is happening in the cloud, you can close your browser or shut down your computer without interrupting the transfer.
:::

<img src="resources/images/01-quick_start_files/figure-html//1l0P0gFpsPkYG7blqJ_5JyYYlztJFZDD39CnIB4svrY8_g208af248fb0_0_42.png" title="The Workflow status page describes submission statistics and job status" alt="The Workflow status page describes submission statistics and job status" width="100%" style="display: block; margin: auto;" />
<img src="resources/images/01-quick_start_files/figure-html//1l0P0gFpsPkYG7blqJ_5JyYYlztJFZDD39CnIB4svrY8_g208af248fb0_0_42.png" alt="The Workflow status page describes submission statistics and job status" width="100%" style="display: block; margin: auto;" />

## Check Workflow

Click on the JOB HISTORY tab. You should see that the job status is "Done". This might take a few minutes.

<img src="resources/images/01-quick_start_files/figure-html//1l0P0gFpsPkYG7blqJ_5JyYYlztJFZDD39CnIB4svrY8_g208af248fb0_0_50.png" title="The check mark indicates the Workflow has completed successfully" alt="The check mark indicates the Workflow has completed successfully" width="100%" style="display: block; margin: auto;" />
<img src="resources/images/01-quick_start_files/figure-html//1l0P0gFpsPkYG7blqJ_5JyYYlztJFZDD39CnIB4svrY8_g208af248fb0_0_50.png" alt="The check mark indicates the Workflow has completed successfully" width="100%" style="display: block; margin: auto;" />

## Locate Data

Click on the DATA tab and click on the "sample" table on the left.

<img src="resources/images/01-quick_start_files/figure-html//1l0P0gFpsPkYG7blqJ_5JyYYlztJFZDD39CnIB4svrY8_g208b8f790dc_23_31.png" title="Navigate to the Files folder under the DATA tab" alt="Navigate to the Files folder under the DATA tab" width="100%" style="display: block; margin: auto;" />
<img src="resources/images/01-quick_start_files/figure-html//1l0P0gFpsPkYG7blqJ_5JyYYlztJFZDD39CnIB4svrY8_g208b8f790dc_23_31.png" alt="Navigate to the Files folder under the DATA tab" width="100%" style="display: block; margin: auto;" />

You should now see the file associated with the first sample!

<img src="resources/images/01-quick_start_files/figure-html//1l0P0gFpsPkYG7blqJ_5JyYYlztJFZDD39CnIB4svrY8_g208b8f790dc_23_41.png" title="The imported file is now visible in the sample table" alt="The imported file is now visible in the sample table" width="100%" style="display: block; margin: auto;" />
<img src="resources/images/01-quick_start_files/figure-html//1l0P0gFpsPkYG7blqJ_5JyYYlztJFZDD39CnIB4svrY8_g208b8f790dc_23_41.png" alt="The imported file is now visible in the sample table" width="100%" style="display: block; margin: auto;" />

## Summary

18 changes: 9 additions & 9 deletions docs/02-multi_file.md
@@ -7,27 +7,27 @@ More than likely, you will be importing multiple files from SRA. Luckily, this i

Navigate to the WORKFLOWS Tab and select the SRA_Fetch Workflow.

<img src="02-multi_file_files/figure-html//1l0P0gFpsPkYG7blqJ_5JyYYlztJFZDD39CnIB4svrY8_g1f25a933000_0_0.png" title="Workflows tab with SRA_Fetch." alt="Workflows tab with SRA_Fetch." width="100%" style="display: block; margin: auto;" />
<img src="02-multi_file_files/figure-html//1l0P0gFpsPkYG7blqJ_5JyYYlztJFZDD39CnIB4svrY8_g1f25a933000_0_0.png" alt="Workflows tab with SRA_Fetch." width="100%" style="display: block; margin: auto;" />

Select "Run workflow(s) with inputs defined by data table".

<img src="02-multi_file_files/figure-html//1l0P0gFpsPkYG7blqJ_5JyYYlztJFZDD39CnIB4svrY8_g1f25a933000_0_10.png" title="'Run workflow(s) with inputs defined by data table' has been selected." alt="'Run workflow(s) with inputs defined by data table' has been selected." width="100%" style="display: block; margin: auto;" />
<img src="02-multi_file_files/figure-html//1l0P0gFpsPkYG7blqJ_5JyYYlztJFZDD39CnIB4svrY8_g1f25a933000_0_10.png" alt="'Run workflow(s) with inputs defined by data table' has been selected." width="100%" style="display: block; margin: auto;" />

Set the "Select root entity type" to "sample" and click SELECT DATA.

<img src="02-multi_file_files/figure-html//1l0P0gFpsPkYG7blqJ_5JyYYlztJFZDD39CnIB4svrY8_g208af248fb0_0_0.png" title="Step 1 and 2 for setting up the Workflow." alt="Step 1 and 2 for setting up the Workflow." width="100%" style="display: block; margin: auto;" />
<img src="02-multi_file_files/figure-html//1l0P0gFpsPkYG7blqJ_5JyYYlztJFZDD39CnIB4svrY8_g208af248fb0_0_0.png" alt="Step 1 and 2 for setting up the Workflow." width="100%" style="display: block; margin: auto;" />

Select the second through fifth samples and click OK on the bottom right.

<img src="02-multi_file_files/figure-html//1l0P0gFpsPkYG7blqJ_5JyYYlztJFZDD39CnIB4svrY8_g208b8f790dc_23_54.png" title="Select multiple files from the sample table" alt="Select multiple files from the sample table" width="100%" style="display: block; margin: auto;" />
<img src="02-multi_file_files/figure-html//1l0P0gFpsPkYG7blqJ_5JyYYlztJFZDD39CnIB4svrY8_g208b8f790dc_23_54.png" alt="Select multiple files from the sample table" width="100%" style="display: block; margin: auto;" />

Ensure the "Attribute" is set to `this.sample_id` and click RUN ANALYSIS.

<img src="02-multi_file_files/figure-html//1l0P0gFpsPkYG7blqJ_5JyYYlztJFZDD39CnIB4svrY8_g208b8f790dc_23_64.png" title="Confirm `this.sample_id` and click the RUN ANALYSIS button" alt="Confirm `this.sample_id` and click the RUN ANALYSIS button" width="100%" style="display: block; margin: auto;" />
<img src="02-multi_file_files/figure-html//1l0P0gFpsPkYG7blqJ_5JyYYlztJFZDD39CnIB4svrY8_g208b8f790dc_23_64.png" alt="Confirm `this.sample_id` and click the RUN ANALYSIS button" width="100%" style="display: block; margin: auto;" />

Click LAUNCH. You can close your browser or shut down your computer without interrupting the transfer.

<img src="02-multi_file_files/figure-html//1l0P0gFpsPkYG7blqJ_5JyYYlztJFZDD39CnIB4svrY8_g208b8f790dc_23_73.png" title="Click the LAUNCH button; the 4 analyses being run is called out" alt="Click the LAUNCH button; the 4 analyses being run is called out" width="100%" style="display: block; margin: auto;" />
<img src="02-multi_file_files/figure-html//1l0P0gFpsPkYG7blqJ_5JyYYlztJFZDD39CnIB4svrY8_g208b8f790dc_23_73.png" alt="Click the LAUNCH button; the 4 analyses being run is called out" width="100%" style="display: block; margin: auto;" />

::: {.notice}
The Workflow knows that you probably want to parallelize the import of your SRA files. This means that each import is happening at the same time. Notice how this workflow with multiple samples actually launched 4 different jobs/analyses! This means that AnVIL can help you process lots of files much faster than working with them one by one.
@@ -37,18 +37,18 @@ The Workflow knows that you probably want to parallelize the import of your SRA

Click on the JOB HISTORY tab. Different submissions are arranged by newest on the top. You should see that the job status is "Done".

<img src="02-multi_file_files/figure-html//1l0P0gFpsPkYG7blqJ_5JyYYlztJFZDD39CnIB4svrY8_g208b8f790dc_23_83.png" title="An arrow pointing to 'Done' indicates the Workflow has completed successfully" alt="An arrow pointing to 'Done' indicates the Workflow has completed successfully" width="100%" style="display: block; margin: auto;" />
<img src="02-multi_file_files/figure-html//1l0P0gFpsPkYG7blqJ_5JyYYlztJFZDD39CnIB4svrY8_g208b8f790dc_23_83.png" alt="An arrow pointing to 'Done' indicates the Workflow has completed successfully" width="100%" style="display: block; margin: auto;" />


## Locate Data

Click on the DATA tab and click on the "sample" table on the left.

<img src="02-multi_file_files/figure-html//1l0P0gFpsPkYG7blqJ_5JyYYlztJFZDD39CnIB4svrY8_g208b8f790dc_23_31.png" title="Navigate to the Files folder under the DATA tab" alt="Navigate to the Files folder under the DATA tab" width="100%" style="display: block; margin: auto;" />
<img src="02-multi_file_files/figure-html//1l0P0gFpsPkYG7blqJ_5JyYYlztJFZDD39CnIB4svrY8_g208b8f790dc_23_31.png" alt="Navigate to the Files folder under the DATA tab" width="100%" style="display: block; margin: auto;" />

You should now see the files associated with the second through fifth samples!

<img src="02-multi_file_files/figure-html//1l0P0gFpsPkYG7blqJ_5JyYYlztJFZDD39CnIB4svrY8_g208b8f790dc_23_92.png" title="The imported files are now visible in the sample table" alt="The imported files are now visible in the sample table" width="100%" style="display: block; margin: auto;" />
<img src="02-multi_file_files/figure-html//1l0P0gFpsPkYG7blqJ_5JyYYlztJFZDD39CnIB4svrY8_g208b8f790dc_23_92.png" alt="The imported files are now visible in the sample table" width="100%" style="display: block; margin: auto;" />

## Summary

12 changes: 6 additions & 6 deletions docs/03-custom_samples.md
@@ -7,17 +7,17 @@ We've created another workspace, `SRA-data-on-AnVIL-example2`, to demonstrate ho

If you go to the DATA tab, you'll notice the same samples (ending in 22-26). These are here because data tables are copied when you clone a workspace. However, let's add a second set of samples ending in 27-31.

<img src="03-custom_samples_files/figure-html//1l0P0gFpsPkYG7blqJ_5JyYYlztJFZDD39CnIB4svrY8_g208b8f790dc_23_101.png" title="The 'sample' data table has been cloned from the original Workspace, including the sample IDs" alt="The 'sample' data table has been cloned from the original Workspace, including the sample IDs" width="100%" style="display: block; margin: auto;" />
<img src="03-custom_samples_files/figure-html//1l0P0gFpsPkYG7blqJ_5JyYYlztJFZDD39CnIB4svrY8_g208b8f790dc_23_101.png" alt="The 'sample' data table has been cloned from the original Workspace, including the sample IDs" width="100%" style="display: block; margin: auto;" />

## Import Data

Click on IMPORT DATA and select "Upload TSV".

<img src="03-custom_samples_files/figure-html//1l0P0gFpsPkYG7blqJ_5JyYYlztJFZDD39CnIB4svrY8_g208b8f790dc_23_144.png" title="The IMPORT DATA button and 'Upload TSV' option" alt="The IMPORT DATA button and 'Upload TSV' option" width="100%" style="display: block; margin: auto;" />
<img src="03-custom_samples_files/figure-html//1l0P0gFpsPkYG7blqJ_5JyYYlztJFZDD39CnIB4svrY8_g208b8f790dc_23_144.png" alt="The IMPORT DATA button and 'Upload TSV' option" width="100%" style="display: block; margin: auto;" />

This opens a popup that looks like this:

<img src="03-custom_samples_files/figure-html//1l0P0gFpsPkYG7blqJ_5JyYYlztJFZDD39CnIB4svrY8_g208b8f790dc_23_153.png" title="The popup is titled Import Data Table and has the option to click to select a .tsv file" alt="The popup is titled Import Data Table and has the option to click to select a .tsv file" width="100%" style="display: block; margin: auto;" />
<img src="03-custom_samples_files/figure-html//1l0P0gFpsPkYG7blqJ_5JyYYlztJFZDD39CnIB4svrY8_g208b8f790dc_23_153.png" alt="The popup is titled Import Data Table and has the option to click to select a .tsv file" width="100%" style="display: block; margin: auto;" />

However, let's take a moment to get acquainted with the new file we'll be uploading.

@@ -29,7 +29,7 @@ First, download the samples file here. You might have to right-click and "Save a

Next, open the file on your local machine. This is what it might look like in Microsoft Excel:

<img src="03-custom_samples_files/figure-html//1l0P0gFpsPkYG7blqJ_5JyYYlztJFZDD39CnIB4svrY8_g208b8f790dc_23_163.png" title="The samples we want to import from SRA are listed in rows in `samples.tsv`" alt="The samples we want to import from SRA are listed in rows in `samples.tsv`" width="100%" style="display: block; margin: auto;" />
<img src="03-custom_samples_files/figure-html//1l0P0gFpsPkYG7blqJ_5JyYYlztJFZDD39CnIB4svrY8_g208b8f790dc_23_163.png" alt="The samples we want to import from SRA are listed in rows in `samples.tsv`" width="100%" style="display: block; margin: auto;" />

::: {.notice}
The column header `entity:sample_id` is important. The `entity:` prefix is required, and the text between it and `_id` (here, `sample`) becomes the name of the data table. So for example, if our header was `entity:reference_id`, a data table called "reference" would be created in AnVIL. If you didn't want to overwrite anything in the original "sample" table, you could change the column header. As long as none of the IDs are the same, no data will be overwritten.
@@ -39,11 +39,11 @@ The column header `entity:sample_id` is important. `entity:` is required. `sampl

Back on AnVIL, click to select a TSV file. This should be the `samples.tsv` file you just downloaded. You will see a warning about potentially overwriting the existing entries. We know that none of the IDs in the new samples file overlap with the existing ones, so click START IMPORT JOB.
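
If you would rather generate a samples file than edit one by hand, here is a minimal Python sketch of the layout described in the notice above: a single column whose header is `entity:sample_id`, followed by one SRA run ID per row. The function name `build_samples_tsv` is hypothetical, and the run IDs are assumptions inferred from the "ending in 27-31" description, so they may not match the contents of the real `samples.tsv`.

```python
import csv
import io


def build_samples_tsv(sample_ids, table_name="sample"):
    """Return TSV text with a single `entity:<table_name>_id` column.

    `table_name` is the part of the header that names the data table,
    e.g. "sample" gives the header `entity:sample_id`.
    """
    buffer = io.StringIO()
    writer = csv.writer(buffer, delimiter="\t", lineterminator="\n")
    writer.writerow([f"entity:{table_name}_id"])  # required header row
    for sample_id in sample_ids:
        writer.writerow([sample_id])  # one ID per row
    return buffer.getvalue()


# Assumed IDs: the run accessions "ending in 27-31" mentioned in the text.
assumed_ids = [f"SRR223753{n}" for n in range(27, 32)]

tsv_text = build_samples_tsv(assumed_ids)
print(tsv_text)
```

To try it, you could write `tsv_text` to a file named `samples.tsv` and upload that file instead. Passing a different `table_name`, such as `"reference"`, would produce the `entity:reference_id` header discussed in the notice.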

<img src="03-custom_samples_files/figure-html//1l0P0gFpsPkYG7blqJ_5JyYYlztJFZDD39CnIB4svrY8_g208b8f790dc_23_171.png" title="The warning is now visible on the popup and the START IMPORT JOB button is highlighted" alt="The warning is now visible on the popup and the START IMPORT JOB button is highlighted" width="100%" style="display: block; margin: auto;" />
<img src="03-custom_samples_files/figure-html//1l0P0gFpsPkYG7blqJ_5JyYYlztJFZDD39CnIB4svrY8_g208b8f790dc_23_171.png" alt="The warning is now visible on the popup and the START IMPORT JOB button is highlighted" width="100%" style="display: block; margin: auto;" />

New samples have been added!

<img src="03-custom_samples_files/figure-html//1l0P0gFpsPkYG7blqJ_5JyYYlztJFZDD39CnIB4svrY8_g208b8f790dc_23_179.png" title="The new samples have been appended to the end of the samples data table" alt="The new samples have been appended to the end of the samples data table" width="100%" style="display: block; margin: auto;" />
<img src="03-custom_samples_files/figure-html//1l0P0gFpsPkYG7blqJ_5JyYYlztJFZDD39CnIB4svrY8_g208b8f790dc_23_179.png" alt="The new samples have been appended to the end of the samples data table" width="100%" style="display: block; margin: auto;" />

::: {.notice}
You can now proceed with running the Workflow as you did in the [Quick Start](#quick-start) and [Multiple Files](#multiple-sra-files) sections.
