diff --git a/help.html b/help.html
index 6beb95800..b73edd105 100644
--- a/help.html
+++ b/help.html
@@ -347,14 +347,14 @@

Why are my changes not taking effect? It’s making my results look

Here we are creating a new object from an existing one:

new_rivers <- sample(rivers, 5)
 new_rivers
-
## [1]  310 1100 2348  380  500
+
## [1] 360 255 600 377 720

Running just this line only prints the result; it does not change new_rivers:

new_rivers + 1
-
## [1]  311 1101 2349  381  501
+
## [1] 361 256 601 378 721

If we want to modify new_rivers and save that modified version, then we need to reassign new_rivers like so:

new_rivers <- new_rivers + 1
 new_rivers
-
## [1]  311 1101 2349  381  501
+
## [1] 361 256 601 378 721

If we forget to reassign, subsequent steps may not work as expected because we will still be working with the unmodified data.
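The difference between printing and reassigning can be checked directly. The following is a minimal sketch using the built-in `rivers` dataset; the `set.seed()` call and the `original` copy are additions for illustration only and do not appear in the help page.

```r
# Assumption: set.seed() is added only so the sample is reproducible.
set.seed(123)
new_rivers <- sample(rivers, 5)
original <- new_rivers               # keep a copy for comparison

new_rivers + 1                       # prints the result; new_rivers is unchanged
identical(new_rivers, original)      # TRUE

new_rivers <- new_rivers + 1         # reassigning saves the modified version
identical(new_rivers, original + 1)  # TRUE
```

Comparing against a saved copy is a quick way to confirm whether a step actually modified the object.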


@@ -403,7 +403,7 @@

Error: object ‘X’ not found

Make sure you run something like this, with the <- operator:

rivers2 <- new_rivers + 1
 rivers2
-
## [1]  312 1102 2350  382  502
+
## [1] 362 257 602 379 722

diff --git a/modules/Esquisse_Data_Visualization/Esquisse_Data_Visualization.html b/modules/Esquisse_Data_Visualization/Esquisse_Data_Visualization.html
index 6c4799b3e..0d5bdf2cd 100644
--- a/modules/Esquisse_Data_Visualization/Esquisse_Data_Visualization.html
+++ b/modules/Esquisse_Data_Visualization/Esquisse_Data_Visualization.html
@@ -184,9 +184,9 @@

It’s super nifty! starting a plot

-

First, get some data..

+

First, get some data..

-

We can use the CO heat-related ER visits dataset. This dataset contains information about the number and rate of visits for heat-related illness to ERs in Colorado from 2011-2022, adjusted for age.

+

We can use the CO heat-related ER visits dataset. This dataset contains information about the number and rate of visits for heat-related illness to emergency rooms in Colorado from 2011-2022, adjusted for age.

er <-
   read_csv("https://jhudatascience.org/intro_to_r/data/CO_ER_heat_visits.csv")
@@ -299,7 +299,7 @@ 

Wide Data

-

As a comparison, let’s also load a wide version of this dataset.

+

As a comparison, let’s also load a wide version of this dataset. {.codesmall}

wide_er <- read_csv(file =
     "https://jhudatascience.org/intro_to_r/data/CO_heat_er_visits_DenverBoulder_wide.csv")
@@ -313,7 +313,7 @@

## ℹ Use `spec()` to retrieve the full column specification for this data. ## ℹ Specify the column types or set `show_col_types = FALSE` to quiet this message. -

Wide vs Long Data

+

Wide vs Long Data: Which is better for plotting?

head(long_er)
@@ -362,7 +362,12 @@

Summary

    -
  • Use the esquisser() function on a dataset
  • +
  • Use Esquisse: + +
      +
    • library(esquisse)
    • +
    • esquisser() function on a dataset
    • +
  • Use the viewer = "browser" argument to launch in your browser.
  • Code from Esquisse can be copied into code chunks to be generated in the “Plots” pane
  • It’s easier if your code is in “long” form!
  • @@ -376,7 +381,7 @@

    📃 Day 6 Cheatsheet

    -

    The End

    +

    The End

    Image by Gerd Altmann from Pixabay

diff --git a/modules/Esquisse_Data_Visualization/lab/Esquisse_Data_Visualization_Lab.Rmd b/modules/Esquisse_Data_Visualization/lab/Esquisse_Data_Visualization_Lab.Rmd
index bd204c939..6df18ff66 100644
--- a/modules/Esquisse_Data_Visualization/lab/Esquisse_Data_Visualization_Lab.Rmd
+++ b/modules/Esquisse_Data_Visualization/lab/Esquisse_Data_Visualization_Lab.Rmd
@@ -12,7 +12,7 @@ install.packages("ggplot2")
 ```{r, comment = FALSE}
 library(esquisse)
-library(ggplot2)
+library(tidyverse)
 ```
 ### 1.1
diff --git a/modules/Esquisse_Data_Visualization/lab/Esquisse_Data_Visualization_Lab_Key.html b/modules/Esquisse_Data_Visualization/lab/Esquisse_Data_Visualization_Lab_Key.html
index 48e6fe660..ce1cd3fe5 100644
--- a/modules/Esquisse_Data_Visualization/lab/Esquisse_Data_Visualization_Lab_Key.html
+++ b/modules/Esquisse_Data_Visualization/lab/Esquisse_Data_Visualization_Lab_Key.html
@@ -169,7 +169,17 @@

Esquisse Data Visualization Lab - Key

install.packages("esquisse")
 install.packages("ggplot2")
library(esquisse)
-library(ggplot2)
+library(tidyverse) +
FALSE ── Attaching core tidyverse packages ──────────────────────── tidyverse 2.0.0 ──
+FALSE ✔ dplyr     1.1.4     ✔ readr     2.1.5
+FALSE ✔ forcats   1.0.0     ✔ stringr   1.5.1
+FALSE ✔ ggplot2   3.5.1     ✔ tibble    3.2.1
+FALSE ✔ lubridate 1.9.3     ✔ tidyr     1.3.1
+FALSE ✔ purrr     1.0.2     
+FALSE ── Conflicts ────────────────────────────────────────── tidyverse_conflicts() ──
+FALSE ✖ dplyr::filter() masks stats::filter()
+FALSE ✖ dplyr::lag()    masks stats::lag()
+FALSE ℹ Use the conflicted package (<http://conflicted.r-lib.org/>) to force all conflicts to become errors

1.1

Try creating a plot using the Orange data that automatically comes with R using the esquisse package.

@@ -186,13 +196,13 @@

1.1

geom_point(shape = "circle", size = 1.5, colour = "#112446") + theme_minimal() + facet_wrap(vars(Tree)) -

+

ggplot(Orange) +
   aes(x = age, y = circumference, colour = Tree) +
   geom_point(shape = "circle", size = 1.5) +
   scale_color_hue(direction = 1) +
   theme_minimal()
-

+

1.2

@@ -207,7 +217,7 @@

1.2

## This warning is displayed once every 8 hours. ## Call `lifecycle::last_lifecycle_warnings()` to see where this warning was ## generated. -

+

Practice on Your Own!

diff --git a/modules/Manipulating_Data_in_R/Manipulating_Data_in_R.html b/modules/Manipulating_Data_in_R/Manipulating_Data_in_R.html
index 898298076..de58eb618 100644
--- a/modules/Manipulating_Data_in_R/Manipulating_Data_in_R.html
+++ b/modules/Manipulating_Data_in_R/Manipulating_Data_in_R.html
@@ -200,7 +200,7 @@

-

📃Cheatsheet

+

📃Day 5 Cheatsheet

Manipulating Data

@@ -291,7 +291,7 @@

5 Alaska May_vacc_rate 0.626 6 Alaska April_vacc_rate 0.623 -

Pivoting using tidyr package

+

Pivoting using tidyr package (part of tidyverse)

tidyr allows you to “tidy” your data. We will be talking about:

@@ -339,7 +339,7 @@

C. Reshape data

-

Reshaping wide to long: Better column names

+

Reshaping wide to long: Better column names

pivot_longer() - puts column data into rows (tidyr package)

@@ -354,7 +354,7 @@

names_to = {name for old columns}, values_to = {name for cell values})
-

Reshaping wide to long: Better column names

+

Reshaping wide to long: Better column names

Newly created column names (“Month” and “Rate”) are enclosed in quotation marks. It helps us be more specific than “name” and “value”.
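As a rough illustration of how `names_to` and `values_to` name the new columns, here is a minimal sketch. The `wide_vacc` tibble and its values are hypothetical toy data, not the dataset used in the slides; only `pivot_longer()` and its `cols`, `names_to`, and `values_to` arguments come from tidyr.

```r
library(tidyr)
library(tibble)

# Hypothetical toy data in wide form: one column per month.
wide_vacc <- tibble(
  State = c("Alabama", "Alaska"),
  June_vacc_rate = c(0.516, 0.627),
  May_vacc_rate = c(0.514, 0.626)
)

# Every column except State is stacked into rows.
# The old column names go into "Month" and the cell values into "Rate".
long_vacc <- pivot_longer(
  wide_vacc,
  cols = !State,
  names_to = "Month",
  values_to = "Rate"
)
long_vacc
```

Without `names_to` and `values_to`, the new columns would get the default names "name" and "value".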

@@ -373,9 +373,10 @@

5 Alaska May_vacc_rate 0.626 6 Alaska April_vacc_rate 0.623 -

Data used: Charm City Circulator

+

Data used: Charm City Circulator

-
circ <- read_csv("http://jhudatascience.org/intro_to_r/data/Charm_City_Circulator_Ridership.csv")
+
circ <- 
+  read_csv("http://jhudatascience.org/intro_to_r/data/Charm_City_Circulator_Ridership.csv")
 head(circ, 5)
# A tibble: 5 × 15
@@ -428,7 +429,23 @@ 

Filter by Boardings only..

-
long <- long %>% filter(str_detect(name, "Boardings"))
+
long <- long %>% filter(str_detect(name, "Boardings"))
+long
+ +
# A tibble: 4,584 × 5
+   day       date       daily name            value
+   <chr>     <chr>      <dbl> <chr>           <dbl>
+ 1 Monday    01/11/2010  952  orangeBoardings   877
+ 2 Monday    01/11/2010  952  purpleBoardings    NA
+ 3 Monday    01/11/2010  952  greenBoardings     NA
+ 4 Monday    01/11/2010  952  bannerBoardings    NA
+ 5 Tuesday   01/12/2010  796  orangeBoardings   777
+ 6 Tuesday   01/12/2010  796  purpleBoardings    NA
+ 7 Tuesday   01/12/2010  796  greenBoardings     NA
+ 8 Tuesday   01/12/2010  796  bannerBoardings    NA
+ 9 Wednesday 01/13/2010 1212. orangeBoardings  1203
+10 Wednesday 01/13/2010 1212. purpleBoardings    NA
+# ℹ 4,574 more rows
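The filtering step can be tried on a small example. This is a sketch under stated assumptions: `toy_long` is a hypothetical tibble shaped like the long table above, with made-up values, and is not the Circulator data itself.

```r
library(dplyr)
library(stringr)
library(tibble)

# Hypothetical toy data shaped like the long Circulator table.
toy_long <- tibble(
  day = "Monday",
  name = c("orangeBoardings", "orangeAlightings",
           "orangeAverage", "purpleBoardings"),
  value = c(877, 1027, 952, NA)
)

# Keep only rows whose name contains the text "Boardings".
toy_boardings <- toy_long %>% filter(str_detect(name, "Boardings"))
toy_boardings
```

Printing the object after filtering, as the updated slide does, makes it easy to confirm that only the intended rows remain.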

Mission: Taking the average boardings by line

@@ -539,7 +556,7 @@

Summary

    -
  • tidyr package helps us convert between wide and long data
  • +
  • tidyr package (part of tidyverse) helps us convert between wide and long data
  • pivot_longer() goes from wide -> long
      @@ -897,13 +914,17 @@

      💻 Lab

      +

      ~

      +

      📃 Day 6 Cheatsheet

      📃 Posit’s tidyr Cheatsheet

      📃 Posit’s dplyr Cheatsheet

      -

      The End

      +

      🔎️ Joining Open Case Study

      + +

      The End

      Image by Gerd Altmann from Pixabay

diff --git a/modules/Manipulating_Data_in_R/lab/Manipulating_Data_in_R_Lab.Rmd b/modules/Manipulating_Data_in_R/lab/Manipulating_Data_in_R_Lab.Rmd
index 77b28a003..6c349a110 100644
--- a/modules/Manipulating_Data_in_R/lab/Manipulating_Data_in_R_Lab.Rmd
+++ b/modules/Manipulating_Data_in_R/lab/Manipulating_Data_in_R_Lab.Rmd
@@ -12,9 +12,7 @@ knitr::opts_chunk$set(echo = TRUE)
 Data in this lab comes from the CDC (https://covid.cdc.gov/covid-data-tracker/#vaccinations_vacc-total-admin-rate-total - snapshot from January 12, 2022) and the Bureau of Economic Analysis (https://www.bea.gov/data/income-saving/personal-income-by-state).
 ```{r message=FALSE}
-library(readr)
-library(dplyr)
-library(tidyr)
+library(tidyverse)
 ```
 # Part 1
diff --git a/modules/Manipulating_Data_in_R/lab/Manipulating_Data_in_R_Lab_Key.html b/modules/Manipulating_Data_in_R/lab/Manipulating_Data_in_R_Lab_Key.html
index 6d58a081c..01513f1cc 100644
--- a/modules/Manipulating_Data_in_R/lab/Manipulating_Data_in_R_Lab_Key.html
+++ b/modules/Manipulating_Data_in_R/lab/Manipulating_Data_in_R_Lab_Key.html
@@ -166,9 +166,7 @@

      Manipulating Data in R Lab - Key

      Data in this lab comes from the CDC (https://covid.cdc.gov/covid-data-tracker/#vaccinations_vacc-total-admin-rate-total - snapshot from January 12, 2022) and the Bureau of Economic Analysis (https://www.bea.gov/data/income-saving/personal-income-by-state).

      -
      library(readr)
      -library(dplyr)
      -library(tidyr)
      +
      library(tidyverse)

      Part 1