From 8da4618fd3c3c4d0a235fbc9898b1cfedb226d16 Mon Sep 17 00:00:00 2001 From: njlyon0 Date: Mon, 22 Jan 2024 20:10:55 -0500 Subject: [PATCH] First pass at adding weekly homework information to each week's landing page --- materials/home_week2.qmd | 35 +++++++++++++++++++++++++++++++++++ materials/home_week3.qmd | 30 ++++++++++++++++++++++++++++++ materials/home_week4.qmd | 26 ++++++++++++++++++++++++++ materials/home_week5.qmd | 22 ++++++++++++++++++++++ materials/home_week6.qmd | 24 ++++++++++++++++++++++++ materials/home_week7.qmd | 19 +++++++++++++++++++ materials/home_week8.qmd | 23 +++++++++++++++++++++++ 7 files changed, 179 insertions(+) diff --git a/materials/home_week2.qmd b/materials/home_week2.qmd index 8860f64..f90a1c3 100644 --- a/materials/home_week2.qmd +++ b/materials/home_week2.qmd @@ -16,4 +16,39 @@ title: "Week 2 Slide Decks" ## Homework 2 +### Learning Objective(s) +Upon completion of this assignment, students will be able to: + +- Explain R package installation and use +- Manipulate a vector using concatenation and bracket notation +- Write conditional statements to subset a dataframe +- Define fundamental Git vocabulary +- Provide the URL for your GitHub profile + +### Assignment Due Date(s) + +Each homework is due at midnight the day before each lecture (i.e., Monday night). Late work will be accepted but will be subject to the late assignments policy outlined in this course's syllabus. + +### Assignment Description + +This homework should be submitted as an R script with your last name and week 2 as the file name (e.g., "Lyon_week2.R"). Remember to specifically load any necessary packages using the `library` function and include comments explaining what line(s) correspond to each of the following prompts. + +1. As a comment, explain what steps are needed to install a new R package from CRAN and then use functions in that package + +2. 
As a comment, explain (A) what is meant by "namespacing" a function and (B) what–in your opinion–the advantage(s) of namespacing are + +3. As a comment, define what is meant by an object's "class" + +4. Assign the base R constant `letters` to an object called "my_vec". Using bracket notation and concatenation, subsample 'my_vec' to spell out your surname (i.e., your last name) + +5. This is a multi-part question. Please be sure to answer every part of this question and include a comment above each answer explaining (briefly) what the line of code is doing in your own words + - Load the `palmerpenguins` library + - Assign the "penguins_raw" dataframe embedded in that library to an object called "peng_df2" + - Check the structure of 'peng_df2'. How many rows are there? How many columns are there? As a comment, specify how you figured out the number of rows / columns + - Subset 'peng_df2' to only penguins that were found on the island named "Torgersen". How many rows are in that subset (i.e., how many penguins were found on Torgersen Island)? + - Using as many subsets as needed, identify how many penguins had their sex recorded. I.e., for how many rows of 'peng_df2' was the "Sex" column either "MALE" or "FEMALE"? As a comment, explain your thought process for how you figured this out. + +6. As a comment, define what a "pull" means in the context of Git / GitHub + +7. 
As a comment, provide the link to your GitHub profile (e.g., "github.com/njlyon0") diff --git a/materials/home_week3.qmd b/materials/home_week3.qmd index ae9005b..a3c0086 100644 --- a/materials/home_week3.qmd +++ b/materials/home_week3.qmd @@ -16,4 +16,34 @@ title: "Week 3 Slide Decks" ## Homework 3 +### Learning Objective(s) +Upon completion of this assignment, students will be able to: + +- Explain fundamental aspects of R Markdown files +- Use data wrangling functions from the `dplyr` and `tidyr` packages +- Rearrange wrangling code using the pipe operator (`%>%`) + +### Assignment Due Date(s) + +Each homework is due at midnight the day before each lecture (i.e., Monday night). Late work will be accepted but will be subject to the late assignments policy outlined in this course’s syllabus. + +### Assignment Description + +This homework should be submitted as an R Markdown with your last name and the week number as the file name (e.g., "Lyon_week3.Rmd"). Remember to specifically load any necessary packages using the `library` function and include comments explaining what line(s) correspond to each of the following prompts. + +1. In your own words, explain the purpose of the YAML of an R Markdown file. What does it do / what is it for? + +2. Imagine you’re working on an R Markdown file. You set one code chunk to `include = FALSE`. What will happen to the contents of that code chunk when you knit the file? What parts–if any–of the code chunk will be displayed in the knit file? + +3. You are studying native bee populations and need to wrangle your dataset to be ready for someone else to take on analysis. This question has multiple parts; be sure to answer all components! Each sub-question should be in its own code chunk. + - Load the `dplyr` and `magrittr` packages, read the example bee data ("bees.csv") into R, and check its structure. How many rows / columns are there? 
+ - Your advisor reminds you that 2021 was a tough year for field research and those data are likely not reliable. Use the `filter` function to remove all data on bees identified in 2021. + - You also realize that your methods are not well-suited to capturing kleptoparasitic bee abundance so those values are not reliable enough to pass on to analysis. Use the `select` function to remove this column from the subset you just created. + - You feel that the data you have now are clean enough to continue. However, your collaborator wants a column for the total bee abundance in each year. Use the `mutate` function to create the column your collaborator has requested from the data version created by the above step. + - Your collaborator loves the final data product you created! But they looked at your code and they think it can be streamlined. Copy the code you just wrote (`filter`, `select`, and `mutate`) and use the pipe operator (`%>%`) to write a version that does all three steps without creating intermediary objects. Check the structure of the final object. + +4. Your data wrangling skills are so impressive to your collaborator that they ask for your help with a lichen project they have been working on. This question has multiple parts; be sure to answer all components! Each sub-question should be in its own code chunk. + - Load the `tidyr` package, read the example lichen data ("tree_lichen.csv") into R, and check its structure. What columns are included? + - Your collaborator tells you that they have already cleaned the data but the data shape isn’t exactly what they need. They want your help reshaping the data into long format. Use the `pivot_longer` function to reshape their data so that they have the following three columns: "tree" unchanged from the wide format version; "lichen_type" which includes whether the lichen is crustose, foliose, or fruticose; "percent_cover" which includes the percent cover values your collaborator measured in the field. 
+ - Check the structure of the long format object you created. How many rows are there? diff --git a/materials/home_week4.qmd b/materials/home_week4.qmd index 833f080..e7b97d0 100644 --- a/materials/home_week4.qmd +++ b/materials/home_week4.qmd @@ -16,4 +16,30 @@ title: "Week 4 Slide Decks" ## Homework 4 +### Learning Objective(s) +Upon completion of this assignment, students will be able to: + +- Judge the appropriate join to use depending on the question being asked +- Distinguish between continuous and discrete data +- Identify the correct statistical test given response and explanatory variable types +- Perform analysis in R correctly + +### Assignment Due Date(s) + +Each homework is due at midnight the day before each lecture (i.e., Monday night). Late work will be accepted but will be subject to the late assignments policy outlined in this course’s syllabus. + +### Assignment Description + +This homework should be submitted as an R Markdown with your last name and the week number as the file name (e.g., “Lyon_week4.Rmd”). Remember to specifically load any necessary packages using the `library` function and include comments explaining what line(s) correspond to each of the following prompts. + +1. A friend asks for your help wrangling some data for a summer project they did on tomatoes. Your friend tells you the "tomato.csv" file includes data on the number of buds, flowers, and fruits from 10 tomato plants they measured in a greenhouse. These plants either received no treatment or had Nitrogen added to their soil. Treatment information is listed in "tomato_treatment.csv". Finally, some plants had some issues that your friend recorded in a third file just in case they needed to account for that in the statistics later on. This question has multiple parts; be sure to answer all components! Each sub-question should be in its own code chunk. 
+ - Load the `dplyr` package and read in the three data files they shared with you: (1) "tomato.csv", (2) "tomato_treatment.csv", and (3) "tomato_issues.csv". Check the structure of all three files. + - Your friend first wants you to attach the treatment information to the main data file. Do this with a join (by plant) so that no rows are lost and check the structure of the resulting object. + - Your friend has decided they do want to drop some of the plants that had issues after all. They want to remove the problem plants if they had either herbivore damage or a fungal infection (they are okay including over-watered plants in the final data). Using `filter` and an appropriate `join` of your choice, remove the plants your friend wants excluded from the data object you created in the previous part of this question. Check the structure of this object. + +2. Let's revisit the penguin data from the `palmerpenguins` package to practice some statistics! This question has multiple parts; be sure to answer all components! Each sub-question should be in its own code chunk. + - Load the `palmerpenguins` package and check the structure of the `penguins` data object. + - For all columns, identify whether each column is "discrete" (aka "categorical") or "continuous". + - Now that you've identified the data type of each column, let's do some statistics! Your hypothesis is that flipper length differs between male and female penguins. Fit the correct test and run `summary` on the model fit object. Was your hypothesis supported? How do you know? (Remember you can check the “roadmap” we covered in class!) + - You hypothesize that penguins with a higher body mass have longer flippers. Fit the correct test and run `summary` on the model fit object. Was your hypothesis supported? How do you know? 
diff --git a/materials/home_week5.qmd b/materials/home_week5.qmd index 47d9628..aadf9f9 100644 --- a/materials/home_week5.qmd +++ b/materials/home_week5.qmd @@ -16,4 +16,26 @@ title: "Week 5 Slide Decks" ## Homework 5 +### Learning Objective(s) +Upon completion of this assignment, students will be able to: + +- Identify the most appropriate statistical test for a given hypothesis +- Use multi-model inference tools in R + +### Assignment Due Date(s) + +Each homework is due at midnight the day before each lecture (i.e., Monday night). Late work will be accepted but will be subject to the late assignments policy outlined in this course's syllabus. + +### Assignment Description + +This homework should be submitted as an R Markdown with your last name and the week number as the file name (e.g., "Lyon_week5.Rmd"). Remember to specifically load any necessary packages using the `library` function and include comments explaining what line(s) correspond to each of the following prompts. + +1. You hypothesize that penguin flipper length differs between sexes and that the degree of difference between sexes is related to the species of penguin (i.e., sex and species are interacting). Test your hypothesis using the "penguins" dataset in the `palmerpenguins` package + - What is the name of the most appropriate statistical test? Why is it the correct test for this question? + - In a code chunk, fit the model you have chosen and generate a summary table of the relevant information. Remember to remove any penguins that don't have a recorded sex! + - Based on the summary table you just generated, is your hypothesis supported? What information in the summary table allows you to draw this conclusion? + +2. You present on your work at a professional society's annual meeting and people are really excited about this finding! 
A colleague approaches you afterwards and asks you whether you've used multi-model inference to check to see if including the home island of the penguin improves your model's explanatory power. You realize this is a great idea and decide to pursue it immediately + - Fit a model with sex and species interacting and add a term for island (no interaction term). Fit another model where sex, species, and island are all included as explanatory variables and they are all allowed to interact. + - Using the `AIC` function, compare these two new models with your original model. Which model has the most explanatory power? How do you know? diff --git a/materials/home_week6.qmd b/materials/home_week6.qmd index a58a32d..3b9aaf1 100644 --- a/materials/home_week6.qmd +++ b/materials/home_week6.qmd @@ -16,4 +16,28 @@ title: "Week 6 Slide Decks" ## Homework 6 +### Learning Objective(s) +Upon completion of this assignment, students will be able to: + +- Create publication-quality graphs using the `ggplot2` package +- Customize background theme elements of those graphs +- Make multi-panel graphs both using `ggplot2` and using the `cowplot` package + +### Assignment Due Date(s) + +Each homework is due at midnight the day before each lecture (i.e., Monday night). Late work will be accepted but will be subject to the late assignments policy outlined in this course’s syllabus. + +### Assignment Description + +This homework should be submitted as an R Markdown with your last name and the week number as the file name (e.g., "Lyon_week6.Rmd"). Remember to specifically load any necessary packages using the `library` function and include comments explaining what line(s) correspond to each of the following prompts. + +1. Recalling your exciting statistical discoveries on how penguin flipper length relates to sex, species, and island, you decide to make some publication-quality figures to demonstrate your findings. 
Use the `ggplot2` package to make the graphs and the `palmerpenguins` package for the "penguins" data. + - Begin by creating a violin plot with sex on the x axis and flipper length on the y axis. The violin blobs’ fill color should differ between the two sexes (default colors are fine!). + - That graph was a fine first pass but you realize you’ve neglected to include penguin species information in the graph! Facet the graph to divide the plot into three panels (either rows or columns are fine)--one panel per species of penguin. Also, remove the gray background color (but keep the black axis lines!) and add custom axis titles that remove the underscores ("_") in the raw column names. + +2. Remembering your colleague’s suggestion of evaluating the effect of island you decide to make a second graph where you visualize penguin flipper length at the three islands. + - Create a violin plot where island is on the y axis and flipper length is on the x axis. Make the fill color of each violin different between islands. Remove the gray background color, keep black axis lines, and customize the axis titles so this graph visually matches your first graph. + - After making this graph you realize you are accidentally re-using the default `ggplot2` red and blue colors. This could be very confusing to your audience so you definitely need to address it. Visit the Coolors website (coolors.co/palettes/trending) and pick a custom color for each of the three islands. Manually set the violin fill to give each island one of these three colors. + +3. Your two graphs are looking great! Now you just need to combine them into one file for ease of sharing to your peers. Use the `cowplot` package to put both graphs (the species by sex graph and the island graph) into one plot. These can be either side-by-side or one on top of the other depending on your preference. 
diff --git a/materials/home_week7.qmd b/materials/home_week7.qmd index 2f63a8c..31e1ee0 100644 --- a/materials/home_week7.qmd +++ b/materials/home_week7.qmd @@ -16,5 +16,24 @@ This space was held empty to allow for in-class presentations of the functional ## Homework 7 +### Learning Objective(s) +Upon completion of this assignment, students will be able to: +- Write a for loop to perform a repeated operation +- Modify a for loop to handle multiple conditionals +- Create informative messages within a loop using the `print` and `paste` functions together + +### Assignment Due Date(s) + +Each homework is due at midnight the day before each lecture (i.e., Monday night). Late work will be accepted but will be subject to the late assignments policy outlined in this course’s syllabus. + +### Assignment Description + +This homework should be submitted as an R Markdown with your last name and the week number as the file name (e.g., "Lyon_week7.Rmd"). Remember to specifically load any necessary packages using the `library` function and include comments explaining what line(s) correspond to each of the following prompts. + +1. At an international conference you meet a French colleague also interested in penguin research. They’re very excited after your recent string of findings (and beautiful graphs) and want to plan a visit to the states to visit several researchers in your field. However, they intuitively think in Celsius (C) and would love your help with an R script that converts Fahrenheit (F) to their preferred temperature units + - Create an object containing the number 35. Using that object, convert it to Celsius and print that converted value. The conversion formula is as follows: C = (F - 32) x 5/9 + - Your colleague is very appreciative of your quick calculation but realizes it might be easier to convert a set of temperatures from F to C so that they can keep that range in mind. 
Write a for loop that converts 35, 45, 55, 65, 75, 85, and 95°F into their equivalents in C and prints the result + - You’re happy with this loop but you don’t think the raw temperatures are going to be very useful to your colleague’s planning. Add a conditional to the loop where if the temperature in Celsius is less than or equal to 18 a message is printed that includes (1) the temp in Fahrenheit and (2) a note to your colleague that they should plan on packing a jacket. If the temperature is greater than 18°C, the message should still include the temp in F and can tell them not to worry about bringing a jacket + - Whether or not to bring a jacket is a rather coarse assessment of temperature though! Add more conditionals such that (A) if temperature is less than 18°C the message includes the note about bringing a jacket, (B) if temperature is greater than or equal to 23°C the message is that they should pack shorts, or (C) if temperature is between 18 and 23°C that neither shorts nor a jacket is required. All three message options should also include the temperature in Fahrenheit for that iteration of the loop (same as question 1c) diff --git a/materials/home_week8.qmd b/materials/home_week8.qmd index a5fb1ef..c91ba8c 100644 --- a/materials/home_week8.qmd +++ b/materials/home_week8.qmd @@ -22,4 +22,27 @@ I may (and likely will) design new 'bonus content' modules and I'll list those b ## Homework 8 +### Learning Objective(s) + +Upon completion of this assignment, students will be able to: + +- Create a custom function in R +- Demonstrate your process for writing a new function + +### Assignment Due Date(s) + +Each homework is due at midnight the day before each lecture (i.e., Monday night). Late work will be accepted but will be subject to the late assignments policy outlined in this course’s syllabus. + +### Assignment Description + +This homework should be submitted as an R Markdown with your last name and the week number as the file name (e.g., "Lyon_week8.Rmd"). 
Remember to specifically load any necessary packages using the `library` function and include comments explaining what line(s) correspond to each of the following prompts. + +1. Your French colleague (for whom you calculated a number of temperature conversions last week) reaches out to you again with one final request: they would like you to write a function for them in R that converts temperature from Celsius to Fahrenheit so that they can stop bothering you to do the conversion for them. Note that last week you converted F to C and this week is the opposite! + - In a code chunk, convert 22°C into Fahrenheit. Feel free to Google the formula. Also, note that you may want to be careful to use parentheses to make sure the order of operations is correct. + - Identify which part of that formula should be replaced with an object (i.e., what part of the formula will become an argument). In another code chunk, copy your conversion formula and replace the relevant part(s) with objects. + - Use R’s function syntax to make your scripted formula into a working function. Define the function (i.e., assign it to an object) in a code chunk. + - Use your new custom function to convert 5°C and 35°C into their equivalents in Fahrenheit in a final code chunk. + + +