# (PART\*) SRA ON AnVIL {-}
```{r, include = FALSE}
ottrpal::set_knitr_image_path()
```
# Quick Start {#quick-start}
In this module, we'll bring some metagenomic data into AnVIL.
This data comes from [this BioProject](https://www.ncbi.nlm.nih.gov/bioproject/PRJNA904247), which collected soil samples to study bacterial communities in tallgrass prairie. Bacteria play an important role in this ecosystem, but bacterial communities can be altered by disturbance, management, and the presence of herbivores.
The SRA data for this project are listed in the [SRA Run Selector](https://www.ncbi.nlm.nih.gov/Traces/study/?acc=SRP409181&o=acc_s%3Aa).
```{r, fig.align='center', echo = FALSE, fig.alt= "Microbiome diversity has many beneficial properties, ranging from soil health to plant health.", out.width = '100%'}
ottrpal::include_slide("https://docs.google.com/presentation/d/1l0P0gFpsPkYG7blqJ_5JyYYlztJFZDD39CnIB4svrY8/edit#slide=id.g208b8f790dc_23_217")
```
::: {.dictionary}
You might hear new terms for moving data around in the cloud. **Ingress** is when data comes to you, similar to downloading a file or receiving an email with an attachment. **Egress** is sending the data to another resource, similar to uploading or sending an attached file via email. There is no fee for ingressing data to AnVIL from SRA.
:::
## Clone Workspace
Clone the Workspace `https://anvil.terra.bio/#workspaces/anvil-outreach/SRA-data-on-AnVIL`.
For this demo, we have given the cloned Workspace the name `SRA-data-on-AnVIL-example`.
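
If you later want to refer to a Workspace from a script, it helps to know how the URL above is put together: the part after `#workspaces/` is the Workspace namespace (the billing project) followed by the Workspace name. The snippet below is a minimal sketch of splitting the URL into those two pieces. The function name `parse_workspace_url` is hypothetical and is not part of any AnVIL or Terra library.

```python
from urllib.parse import unquote, urlsplit


def parse_workspace_url(url):
    """Split a Terra Workspace URL into (namespace, workspace name).

    Hypothetical helper for illustration only; it just reads the
    '#workspaces/<namespace>/<name>' part of the URL.
    """
    fragment = urlsplit(url).fragment  # text after the '#'
    parts = [unquote(part) for part in fragment.split("/") if part]
    if len(parts) < 3 or parts[0] != "workspaces":
        raise ValueError(f"not a Workspace URL: {url!r}")
    return parts[1], parts[2]


namespace, name = parse_workspace_url(
    "https://anvil.terra.bio/#workspaces/anvil-outreach/SRA-data-on-AnVIL"
)
print(namespace)  # anvil-outreach
print(name)       # SRA-data-on-AnVIL
```

Your cloned copy would have your own billing project as the namespace and the name you chose when cloning.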
## Set Up Samples
Navigate to the WORKFLOWS Tab and select the SRA_Fetch Workflow.
```{r, fig.align='center', echo = FALSE, fig.alt= "Workflows tab with SRA_Fetch.", out.width = '100%'}
ottrpal::include_slide("https://docs.google.com/presentation/d/1l0P0gFpsPkYG7blqJ_5JyYYlztJFZDD39CnIB4svrY8/edit#slide=id.g1f25a933000_0_0")
```
Select "Run workflow(s) with inputs defined by data table".
```{r, fig.align='center', echo = FALSE, fig.alt= "'Run workflow(s) with inputs defined by data table' has been selected.", out.width = '100%'}
ottrpal::include_slide("https://docs.google.com/presentation/d/1l0P0gFpsPkYG7blqJ_5JyYYlztJFZDD39CnIB4svrY8/edit#slide=id.g1f25a933000_0_10")
```
Set the "Select root entity type" to "sample" and click SELECT DATA.
```{r, fig.align='center', echo = FALSE, fig.alt= "Step 1 and 2 for setting up the Workflow.", out.width = '100%'}
ottrpal::include_slide("https://docs.google.com/presentation/d/1l0P0gFpsPkYG7blqJ_5JyYYlztJFZDD39CnIB4svrY8/edit#slide=id.g208af248fb0_0_0")
```
On the Select Data popup, select only the first sample, `SRR22375322`, and click OK.
```{r, fig.align='center', echo = FALSE, fig.alt= "The first sample selected from the data table.", out.width = '100%'}
ottrpal::include_slide("https://docs.google.com/presentation/d/1l0P0gFpsPkYG7blqJ_5JyYYlztJFZDD39CnIB4svrY8/edit#slide=id.g208af248fb0_0_8")
```
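
The "sample" table you just selected from is a regular data table whose ID column holds SRA run accessions. If you want to fetch other runs later, one option is to upload your own table as a tab-separated file. The snippet below is a minimal sketch of building such a file. It assumes Terra's load-file convention, in which the first column header is `entity:sample_id`; the function name `build_sample_tsv` is hypothetical.

```python
import csv
import io
import re

# Run accessions start with SRR (SRA), ERR (ENA), or DRR (DDBJ), followed by digits.
RUN_ACCESSION = re.compile(r"^[SED]RR\d+$")


def build_sample_tsv(accessions):
    """Return TSV text for a 'sample' table with one row per run accession.

    Hypothetical helper for illustration; duplicates are skipped and
    anything that does not look like a run accession raises ValueError.
    """
    seen = set()
    buffer = io.StringIO()
    writer = csv.writer(buffer, delimiter="\t", lineterminator="\n")
    writer.writerow(["entity:sample_id"])  # assumed Terra load-file header
    for accession in accessions:
        accession = accession.strip()
        if not RUN_ACCESSION.match(accession):
            raise ValueError(f"not a run accession: {accession!r}")
        if accession in seen:
            continue
        seen.add(accession)
        writer.writerow([accession])
    return buffer.getvalue()


sample_tsv = build_sample_tsv(["SRR22375322"])
print(sample_tsv, end="")
```

Checking the accession format before uploading catches a common mistake, such as pasting a BioProject or study ID where a run ID is expected.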
## Launch Workflow
Click on the space underneath "Attribute" and select `this.sample_id`.
```{r, fig.align='center', echo = FALSE, fig.alt= "'this.sample_id' must be selected under the Workflow Attribute", out.width = '100%'}
ottrpal::include_slide("https://docs.google.com/presentation/d/1l0P0gFpsPkYG7blqJ_5JyYYlztJFZDD39CnIB4svrY8/edit#slide=id.g208af248fb0_0_17")
```
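
The value `this.sample_id` tells the Workflow to read its input from the `sample_id` column of whichever row you selected; `this` stands for that row. The snippet below is a simplified mental model of that lookup, not Terra's actual implementation. The function name `resolve_attribute` and the dictionary representation of a row are assumptions made for illustration.

```python
def resolve_attribute(expression, row):
    """Look up a 'this.<column>' expression in one data table row.

    Simplified illustration only: 'row' is a plain dict standing in
    for the selected row of the 'sample' table.
    """
    prefix = "this."
    if not expression.startswith(prefix):
        raise ValueError(f"expected an expression starting with 'this.': {expression!r}")
    column = expression[len(prefix):]
    if column not in row:
        raise KeyError(f"the selected row has no column named {column!r}")
    return row[column]


# The single row selected in the previous section.
selected_row = {"sample_id": "SRR22375322"}

workflow_input = resolve_attribute("this.sample_id", selected_row)
print(workflow_input)  # SRR22375322
```

If you had selected several rows, the same expression would be evaluated once per row, giving one Workflow run per sample.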
Click SAVE.
```{r, fig.align='center', echo = FALSE, fig.alt= "The SAVE button is highlighted", out.width = '100%'}
ottrpal::include_slide("https://docs.google.com/presentation/d/1l0P0gFpsPkYG7blqJ_5JyYYlztJFZDD39CnIB4svrY8/edit#slide=id.g208af248fb0_0_26")
```
You are ready to launch the Workflow! Click RUN ANALYSIS.
```{r, fig.align='center', echo = FALSE, fig.alt= "The RUN ANALYSIS button is highlighted", out.width = '100%'}
ottrpal::include_slide("https://docs.google.com/presentation/d/1l0P0gFpsPkYG7blqJ_5JyYYlztJFZDD39CnIB4svrY8/edit#slide=id.g208af248fb0_0_34")
```
Voilà! Your Workflow is running.
::: {.notice}
Because the Workflow runs in the cloud, you can close your browser or shut down your computer without interrupting the transfer.
:::
```{r, fig.align='center', echo = FALSE, fig.alt= "The Workflow status page describes submission statistics and job status", out.width = '100%'}
ottrpal::include_slide("https://docs.google.com/presentation/d/1l0P0gFpsPkYG7blqJ_5JyYYlztJFZDD39CnIB4svrY8/edit#slide=id.g208af248fb0_0_42")
```
## Check Workflow
Click on the JOB HISTORY tab. The job might take a few minutes to finish. Once it has, you should see that the job status is "Done".
```{r, fig.align='center', echo = FALSE, fig.alt= "The check mark indicates the Workflow has completed successfully", out.width = '100%'}
ottrpal::include_slide("https://docs.google.com/presentation/d/1l0P0gFpsPkYG7blqJ_5JyYYlztJFZDD39CnIB4svrY8/edit#slide=id.g208af248fb0_0_50")
```
## Locate Data
Click on the DATA tab and click on the "sample" table on the left.
```{r, fig.align='center', echo = FALSE, fig.alt= "Navigate to the Files folder under the DATA tab", out.width = '100%'}
ottrpal::include_slide("https://docs.google.com/presentation/d/1l0P0gFpsPkYG7blqJ_5JyYYlztJFZDD39CnIB4svrY8/edit#slide=id.g208b8f790dc_23_31")
```
You should now see the file associated with the first sample!
```{r, fig.align='center', echo = FALSE, fig.alt= "The imported file is now visible in the sample table", out.width = '100%'}
ottrpal::include_slide("https://docs.google.com/presentation/d/1l0P0gFpsPkYG7blqJ_5JyYYlztJFZDD39CnIB4svrY8/edit#slide=id.g208b8f790dc_23_41")
```
## Summary
- Clone [Workspace](https://anvil.terra.bio/#workspaces/anvil-outreach/SRA-data-on-AnVIL)
- Go to the WORKFLOWS tab and select the SRA_Fetch Workflow
- Select the sample via the data table ("Run workflow(s) with inputs defined by data table")
- Set the Attribute to `this.sample_id`
- SAVE and RUN ANALYSIS
- Check the JOB HISTORY tab until the job status is "Done"
- Go to the DATA tab and click the "sample" table to see the imported file