generated from jhudsl/AnVIL_Template
-
Notifications
You must be signed in to change notification settings - Fork 0
/
Copy path03-custom_samples.Rmd
68 lines (43 loc) · 3.69 KB
/
03-custom_samples.Rmd
1
2
3
4
5
6
7
8
9
10
11
12
13
14
15
16
17
18
19
20
21
22
23
24
25
26
27
28
29
30
31
32
33
34
35
36
37
38
39
40
41
42
43
44
45
46
47
48
49
50
51
52
53
54
55
56
57
58
59
60
61
62
63
64
65
66
67
68
# Customize Samples
You will probably need to select different samples than the ones in this demo.
We've created another workspace, `SRA-data-on-AnVIL-example2`, to demonstrate how to upload your own sample IDs.
If you go to the DATA tab, you'll notice the same samples (ending in 22-26). These are here because data tables are copied when you clone a workspace. However, let's add a second set of samples ending in 27-31.
```{r, fig.align='center', echo = FALSE, fig.alt= "The 'sample' data table has been cloned from the original Workspace, including the sample IDs", out.width = '100%'}
ottrpal::include_slide("https://docs.google.com/presentation/d/1l0P0gFpsPkYG7blqJ_5JyYYlztJFZDD39CnIB4svrY8/edit#slide=id.g208b8f790dc_23_101")
```
## Import Data
Click on IMPORT DATA and select "Upload TSV".
```{r, fig.align='center', echo = FALSE, fig.alt= "The IMPORT DATA button and 'Upload TSV' option", out.width = '100%'}
ottrpal::include_slide("https://docs.google.com/presentation/d/1l0P0gFpsPkYG7blqJ_5JyYYlztJFZDD39CnIB4svrY8/edit#slide=id.g208b8f790dc_23_144")
```
This opens a popup that looks like this:
```{r, fig.align='center', echo = FALSE, fig.alt= "The popup is titled Import Data Table and has the option to click to select a .tsv file", out.width = '100%'}
ottrpal::include_slide("https://docs.google.com/presentation/d/1l0P0gFpsPkYG7blqJ_5JyYYlztJFZDD39CnIB4svrY8/edit#slide=id.g208b8f790dc_23_153")
```
However, let's take a moment to get acquainted with the new file we'll be uploading.
## The `samples.tsv` File
First, download the samples file here. You might have to right-click and "Save as".
[Download `samples.tsv`](https://raw.githubusercontent.com/fhdsl/AnVIL_SRA_Data/main/samples.tsv)
Next, open the file on your local machine. This is what it might look like in Microsoft Excel:
```{r, fig.align='center', echo = FALSE, fig.alt= "The samples we want to import from SRA are listed in rows in `samples.tsv`", out.width = '100%'}
ottrpal::include_slide("https://docs.google.com/presentation/d/1l0P0gFpsPkYG7blqJ_5JyYYlztJFZDD39CnIB4svrY8/edit#slide=id.g208b8f790dc_23_163")
```
::: {.notice}
The column header `entity:sample_id` is important. `entity:` is required. `samples` becomes the name of the data table. So for example, if our header was `entity:reference_id`, a data table called "reference" would be created in AnVIL. If you didn't want to overwrite anything in the original "samples" table, you could change the column header. As long as none of the IDs are the same, no data will be overwritten.
:::
## Upload the TSV
Back on AnVIL, Click to select a TSV file. This file should be the one you just downloaded above called `samples.tsv`. You will see a warning about potentially overwriting the existing entries. We know that none of the IDs in the new samples file overlap, so click START IMPORT JOB.
```{r, fig.align='center', echo = FALSE, fig.alt= "The warning is now visible on the popup and the START IMPORT JOB button is highlighted", out.width = '100%'}
ottrpal::include_slide("https://docs.google.com/presentation/d/1l0P0gFpsPkYG7blqJ_5JyYYlztJFZDD39CnIB4svrY8/edit#slide=id.g208b8f790dc_23_171")
```
New samples have been added!
```{r, fig.align='center', echo = FALSE, fig.alt= "The new samples have been appended to the end of the samples data table", out.width = '100%'}
ottrpal::include_slide("https://docs.google.com/presentation/d/1l0P0gFpsPkYG7blqJ_5JyYYlztJFZDD39CnIB4svrY8/edit#slide=id.g208b8f790dc_23_179")
```
::: {.notice}
You can now proceed with running the Workflow as you did in the [Quick Start](#quick-start) and [Multiple Files](#multiple-sra-files) sections.
:::
## Summary
- Go to the DATA tab
- Select IMPORT DATA and "Upload TSV"
- Add your custom file and click START IMPORT JOB