Note

To view the material below as a presentation, open lecture.html.

Lecture 12 – Visualize data using R / ggplot2

Arvind R. Subramaniam

Assistant Member

Basic Sciences Division and Computational Biology Program

Fred Hutchinson Cancer Research Center

Contents

What you will learn over the next 3 lectures

Loading, Transforming, Visualizing Tabular Data using Tidyverse packages

Principles of Data Visualization (see book)

Example Datasets

Plate Reader Assay

img/plate_reader.jpg

Flow Cytometry

img/flow_cytometer.jpg

Raw Flow Cytometry Data

FSC.A   SSC.A   FITC.A  PE.Texas.Red.A  Time
 79033   69338    9173           18690  3.02
101336   87574   13184           29886  3.04
 51737   56161    3083           18324  3.06
 79904   45085    9957           18099  3.08
124491   97305   15739           28730  3.09
 54359   45015    6175           11918  3.11
 64615   88989   11907           32413  3.13
109592   64132   12561           18824  3.15
 58503  116384   11591           27629  3.19
 38634   51511    7200           21930  3.21

5 cols × 2,720,000 rows

Flow Cytometry Analysis Using Tidyverse

img/example_flow_cytometry_analysis.png

Tidyverse Functions for Working with Tabular Data

Import/Export  Visualize   Transform
read_tsv       geom_point  select
write_tsv      geom_line   filter
               facet_grid  arrange
                           mutate
                           join
                           group_by
                           summarize
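
The Transform functions are usually chained with the pipe operator. The following is a minimal sketch, assuming the tidyverse is installed; it uses four rows copied from the example dataset in this lecture, and the cutoff of 1 and the names `filtered`, `summary_by_insert`, and `avg_ratio` are arbitrary choices for illustration.

```r
library(tidyverse)

# four rows copied from the example dataset in this lecture
data <- tribble(
  ~strain,   ~mean_ratio, ~insert_sequence, ~kozak_region,
  "schp688", 0.755,       "10×AGA",         "A",
  "schp684", 1.437,       "10×AGA",         "B",
  "schp679", 1.149,       "10×AAG",         "A",
  "schp675", 1.621,       "10×AAG",         "B"
)

# keep rows above an arbitrary cutoff, sort them from highest
# to lowest ratio, and keep only two columns
filtered <- data %>%
  filter(mean_ratio > 1) %>%
  arrange(desc(mean_ratio)) %>%
  select(strain, mean_ratio)

# average the ratio within each insert sequence
summary_by_insert <- data %>%
  group_by(insert_sequence) %>%
  summarize(avg_ratio = mean(mean_ratio))

print(filtered)
print(summary_by_insert)
```

Each step takes a tibble and returns a tibble, so steps can be added or removed one line at a time.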

Use TSV and CSV file formats for tabular data

Tab-Separated Values:

strain   mean_yfp  mean_rfp  mean_ratio  se_ratio  insert_sequence  kozak_region 
schp674      1270     20316       0.561     0.004  10×AAG           CAAA         
schp675      3687     20438       1.621     0.036  10×AAG           CCGC         
schp676      2657     20223       1.177     0.048  10×AAG           CCAA         
schp677      3967     20604       1.728      0.03  10×AAG           CCAC         

Comma-Separated Values:

strain,mean_yfp,mean_rfp,mean_ratio,se_ratio,insert_sequence,kozak_region
schp674,1270,20316,0.561,0.004,10×AAG,CAAA
schp675,3687,20438,1.621,0.036,10×AAG,CCGC
schp676,2657,20223,1.177,0.048,10×AAG,CCAA
schp677,3967,20604,1.728,0.03,10×AAG,CCAC
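
The two formats differ only in the delimiter, so the same table can be read with `read_tsv` or `read_csv`. The sketch below illustrates this with two rows from the table above passed as literal text instead of a file. It assumes readr 2.0 or later, where wrapping a string in `I()` marks it as literal data; the variable names are made up for this example.

```r
library(tidyverse)

# the same two rows, once comma-separated and once tab-separated
csv_text <- "strain,mean_yfp,mean_rfp\nschp674,1270,20316\nschp675,3687,20438\n"
tsv_text <- "strain\tmean_yfp\tmean_rfp\nschp674\t1270\t20316\nschp675\t3687\t20438\n"

# I() tells readr that the string is the data itself, not a file path
from_csv <- read_csv(I(csv_text), show_col_types = FALSE)
from_tsv <- read_tsv(I(tsv_text), show_col_types = FALSE)

print(from_csv)
print(from_tsv)
```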

Reading tabular data into R

library(tidyverse)

data <- read_tsv("data/example_dataset_1.tsv")

Read tabular data into a data frame (tibble)

library(tidyverse)

data <- read_tsv("data/example_dataset_1.tsv")

print(data, n = 5)

Comment your code

# library to work with tabular data
library(tidyverse)

# read the tsv file into a tibble and 
# assign it to the 'data' variable
data <- read_tsv("data/example_dataset_1.tsv")

# display the contents of 'data' 
print(data, n = 5)
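
The same commented style works for writing data back out with `write_tsv`. This is a minimal sketch rather than part of the original analysis: the tibble is built inline from four rows of the TSV example in this lecture, and it is written to a temporary file so that nothing in `data/` is overwritten.

```r
# library to work with tabular data
library(tidyverse)

# build a small tibble by hand (rows copied from the TSV example)
data <- tribble(
  ~strain,   ~mean_yfp, ~mean_rfp, ~mean_ratio,
  "schp674", 1270,      20316,     0.561,
  "schp675", 3687,      20438,     1.621,
  "schp676", 2657,      20223,     1.177,
  "schp677", 3967,      20604,     1.728
)

# write the tibble to a temporary tsv file
out_file <- tempfile(fileext = ".tsv")
write_tsv(data, out_file)

# read the file back in and display its contents
data_again <- read_tsv(out_file, show_col_types = FALSE)
print(data_again, n = 5)
```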

Plotting a point graph

ggplot(data, aes(x = kozak_region,
                 y = mean_ratio)) +
  geom_point()

img/ggplot2_point_example_no_color.png

How do we show multiple experimental parameters?

strain   mean_ratio  insert_sequence  kozak_region
schp688       0.755  10×AGA           A
schp684       1.437  10×AGA           B
schp690       1.541  10×AGA           C
schp687       2.004  10×AGA           D
schp686       2.121  10×AGA           E
schp685       2.893  10×AGA           F
schp683       3.522  10×AGA           G
schp689       3.424  10×AGA           H
schp679       1.149  10×AAG           A
schp675       1.621  10×AAG           B
schp681       1.645  10×AAG           C
schp678       1.906  10×AAG           D
schp677       1.728  10×AAG           E
schp676       1.177  10×AAG           F
schp674       0.561  10×AAG           G
schp680       0.519  10×AAG           H

Plotting a point graph with color

ggplot(data, aes(x = kozak_region,
                 y = mean_ratio,
                 color = insert_sequence)) +
  geom_point()
  

img/ggplot2_point_example.png

Plotting a line graph

ggplot(data, aes(x = kozak_region,
                 y = mean_ratio,
                 color = insert_sequence,
                 group = insert_sequence)) +
  geom_line()

img/ggplot2_line_example.png

Plotting point and line graphs

ggplot(data, aes(x = kozak_region,
                 y = mean_ratio,
                 color = insert_sequence,
                 group = insert_sequence)) +
  geom_line() +
  geom_point()

img/ggplot2_line_point_example.png

‘Faceting’ – Plotting in multiple panels

ggplot(data, aes(x = kozak_region,
                 y = mean_ratio,
                 group = insert_sequence)) +
  geom_line() +
  geom_point() +
  facet_grid(~ insert_sequence)

img/ggplot2_line_point_facet_example.png
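
The formula passed to `facet_grid` controls the panel layout: a variable on the right of `~` gives one column per value, and a variable on the left gives one row per value. The sketch below is a self-contained variant of the plot above that stacks the panels as rows. It assumes the tidyverse is installed and uses eight rows copied from the example dataset instead of reading the file.

```r
library(tidyverse)

# eight rows copied from the example dataset in this lecture
data <- tribble(
  ~strain,   ~mean_ratio, ~insert_sequence, ~kozak_region,
  "schp688", 0.755,       "10×AGA",         "A",
  "schp684", 1.437,       "10×AGA",         "B",
  "schp690", 1.541,       "10×AGA",         "C",
  "schp687", 2.004,       "10×AGA",         "D",
  "schp679", 1.149,       "10×AAG",         "A",
  "schp675", 1.621,       "10×AAG",         "B",
  "schp681", 1.645,       "10×AAG",         "C",
  "schp678", 1.906,       "10×AAG",         "D"
)

# one panel per insert sequence, arranged as rows instead of columns
p <- ggplot(data, aes(x = kozak_region,
                      y = mean_ratio,
                      group = insert_sequence)) +
  geom_line() +
  geom_point() +
  facet_grid(insert_sequence ~ .)

print(p)
```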

Play time!

  • Get used to the RStudio interface.
  • Plot data and customize appearance.
  • Learn how to “Knit” RMarkdown files.
  • Learn more at https://ggplot2.tidyverse.org.
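
As a possible starting point for customizing appearance, the sketch below adds axis and legend titles with `labs()` and switches to a built-in theme. It assumes the tidyverse is installed; the label text, the point size, and the choice of `theme_classic()` are arbitrary examples, and the data are four rows copied from the example dataset in this lecture.

```r
library(tidyverse)

# four rows copied from the example dataset in this lecture
data <- tribble(
  ~strain,   ~mean_ratio, ~insert_sequence, ~kozak_region,
  "schp688", 0.755,       "10×AGA",         "A",
  "schp684", 1.437,       "10×AGA",         "B",
  "schp679", 1.149,       "10×AAG",         "A",
  "schp675", 1.621,       "10×AAG",         "B"
)

p <- ggplot(data, aes(x = kozak_region,
                      y = mean_ratio,
                      color = insert_sequence,
                      group = insert_sequence)) +
  geom_line() +
  # larger points than the default (size chosen arbitrarily)
  geom_point(size = 2) +
  # replace the column names with readable titles
  labs(x = "Kozak region",
       y = "Mean ratio",
       color = "Insert sequence") +
  # white background with axis lines and no grid
  theme_classic()

print(p)
```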