Introduction to R

Moffitt Cancer Center


📆 Mondays June 7 - July 19, 2021

⏰ 3:00 pm

🏢 via Zoom (see links below)

💻 Moffitt


Setup

The course will be hosted on Zoom, with video recordings posted on Panopto afterward. Links to the recorded lectures will be emailed after each class.

Slides

  • Introduction to R and Rmarkdown
  • Data Manipulation I and II
  • Visualization with ggplot
  • Statistical testing
  • Best practices

Code

Our live-coded notes are in notes/ and are added as we commit them.

Questions?

Feel free to ask questions in GitHub Issues.

Overview

This is a six-week course designed to introduce future users to R and RStudio. We will cover data cleaning using the tidyverse, creating visuals with ggplot, basic statistical analysis, and writing documents with Rmarkdown. By the end, you should be able to:

  • import data and reshape it into different formats
  • create new variables and recode existing ones
  • plot data using the appropriate figure type
  • perform basic statistics
  • write Rmarkdown reports

Who is this course designed for?

Have you never written any code in R or any other programming language? Are you familiar with R, but hoping to bulk up your basic skills? Have you used R but are new to the tidyverse framework?

Learning objectives

Introduction to R/RStudio and Rmarkdown

  • Get familiar with the R and RStudio environment
  • Understand basic R code structure (packages, data structures)
  • Create and render Rmarkdown documents in multiple formats

Data import and manipulation - Part I

  • Import various data types
  • Explain the fundamentals of "tidy data" and the tidyverse
  • Clean data sets using dplyr verbs

Data manipulation and cleaning - Part II

  • Merge data sets
  • Transform data

Visualization

  • Understand the principles of ggplot2
  • Determine appropriate visualizations
  • Use geom_smooth to overlay a fitted model on the data

Basic statistics

  • Run t-tests, correlation tests, and chi-squared tests
  • Model data
  • Report findings from R output

Final coding project

  • Organize projects and name objects in a uniform manner
  • Create a reproducible workflow
  • Set up your own R installation

Materials/RStudio Cloud

Materials will be made available on GitHub. If you are using an organization-issued laptop, you may want to verify before the first class that you can access GitHub.

In this course, we will be using a version of RStudio available online. The class workspace can be accessed here. Please create a user account before the first day of class (or sign in with a GitHub or Google account).

Schedule and Links

Instructors

This course is taught by members of Moffitt staff, including Jordan Creed, a data analyst in the Department of Health Informatics, and Zachary Thompson and Ram Thapa, biostatisticians in the Bioinformatics and Biostatistics Core.