change_exercises_and_precourse
jeitziner committed Dec 19, 2023
1 parent 75f0aee commit 93adac3
Showing 4 changed files with 107 additions and 11 deletions.
Binary file modified docs/assets/exercises/Exercises_IS.zip
Binary file not shown.
98 changes: 98 additions & 0 deletions docs/assets/exercises/easy_R_script.R
@@ -0,0 +1,98 @@
## Starting point in R

## R is a calculator

2*3
log(4)
exp(3)
pi - 3

## R can store variables

x <- 2

# equivalent to

x = 2

# then we can calculate
1/x
# and store it as a variable
y <- 1/x

# a vector is created with the command c(), with its elements separated by commas

x <- c(1,10,3,0.4)

# a vector with NA values

x2 <- c(NA,10,NA,0.4)

# one can use a vector variable to create a new one

y <- c(x,x,1,3)

# or do operations on variables
y <- 2*x

# generating a sequence of numbers (both lines give the same values)
z <- 1:10
z <- c(1,2,3,4,5,6,7,8,9,10)

## let's see our first functions in R

# To access the help of a function use help() or ?

?mean
mean(z)
mean(x2) ## returns NA because x2 contains NA values
mean(x2, na.rm=TRUE) ## na.rm is a logical argument, i.e. either TRUE or FALSE,
## that says whether to remove the NAs before computing
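
# A small sketch (not part of the original exercises): counting and skipping NAs.
# The name v is made up here; it holds the same values as x2 above.
v <- c(NA, 10, NA, 0.4)
is.na(v)                   # TRUE where a value is missing
sum(is.na(v))              # number of missing values
mean(v, na.rm = TRUE)      # mean of the non-missing values
mean(v[!is.na(v)])         # same result, removing the NAs by hand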


?seq
x <- seq(from=1,to=5,length.out=17)

#if we want to know which values of x are smaller than 4 we can use "<", which
#is a comparison operator and returns a logical (TRUE/FALSE) vector

x < 4
sum(x<4)

# subselecting some vector entries

x[1]
x[2:5]
x[c(2,5,8)]
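
# A small sketch (not part of the original exercises): selecting with a
# logical vector or with negative indices. The name w is made up here.
w <- c(1, 10, 3, 0.4)
w[w < 4]        # keeps the entries for which the comparison is TRUE
which(w < 4)    # positions of those entries
w[-1]           # everything except the first entry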

## working directory
getwd()

## always set the working directory to your project folder
## (the path below is only a placeholder)
# setwd("path/to/your/folder")

## open files

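# each function needs a file path as its first argument (placeholder names below)
# read.delim("my_file.txt")   # tab-separated text
# read.csv("my_file.csv")     # comma-separated values
# read.table("my_file.txt")   # general delimited text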
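
# A small sketch (not part of the original exercises): write a tiny table to
# a temporary file and read it back. The names tmp, small and d are made up.
tmp <- tempfile(fileext = ".csv")
small <- data.frame(id = 1:3, group = c("A", "B", "A"))
write.csv(small, tmp, row.names = FALSE)
d <- read.csv(tmp)   # header = TRUE and sep = "," are the defaults
str(d)               # check the column types after reading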

#if you have a matrix
data <- matrix(c(1,2,3,5,6,7),nrow=2)

# you can access the element in row 2 and column 3
data[2,3]
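
# A small sketch (not part of the original exercises): matrices are filled
# column by column, and a blank index selects a whole row or column.
# The name m is made up here; it holds the same values as data above.
m <- matrix(c(1, 2, 3, 5, 6, 7), nrow = 2)
dim(m)     # number of rows, then number of columns
m[2, ]     # the whole second row
m[, 3]     # the whole third column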

# you can also have data frames (like a matrix, but the columns can have different types)

x <- 1:10
y <- seq(from=5,to=10,length.out=10)
z <- c("A","B","B","A","A","A","B","A","B","B")
df <- data.frame(d1=x, d2=y, fact=z)

summary(df)
df$fact
df$fact == "B"
!(df$fact == "B") ## ! negates: TRUE becomes FALSE and FALSE becomes TRUE
df[ df$fact == "B", "d2" ]
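
# A small sketch (not part of the original exercises): combining conditions
# and summarising by group. The name df2 is made up; it mirrors df above.
df2 <- data.frame(d1 = 1:10,
                  d2 = seq(from = 5, to = 10, length.out = 10),
                  fact = c("A","B","B","A","A","A","B","A","B","B"))
df2[df2$fact == "B" & df2$d1 > 5, ]    # rows where both conditions hold
table(df2$fact)                        # number of rows per group
tapply(df2$d2, df2$fact, mean)         # mean of d2 within each group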

11 changes: 0 additions & 11 deletions docs/exam.md
@@ -30,14 +30,3 @@ c) Get the summary statistics of "IS_23_exam".
6. Test your hypothesis using the tests and modeling techniques from the course, based on the type of variables you have. Include tests of the
assumptions where appropriate.


## PCA and clustering

1. Perform a PCA using all the variables in the dataset except age and gender.
2. Make a PCA plot, using different colors for the data points of males and females.
3. How much variance is explained by each principal component?
4. Which variables have the strongest influence on each of the first two principal components?
5. Create a new data frame called PCA_coord with the coordinates of the data points on PC1 and PC2.
6. Compute the Euclidean distances between the data points.
7. Generate a heatmap of the distance matrix.
8. Identify clusters of data points using a method of your choice that was shown during the course.
9 changes: 9 additions & 0 deletions docs/precourse.md
@@ -14,3 +14,12 @@ Participants do not need any experience in R before the course
To do the exercises, you are required to have your own computer with at least 4 GB of RAM and with an internet connection, as well as the latest version of [R](https://cran.r-project.org/)
and the free version of [RStudio](https://www.rstudio.com/products/rstudio/download/) installed.

### Installation in R

To ensure that the course runs smoothly, check that you are **allowed to install packages into R**. Some companies prevent users from downloading packages for security reasons, in which case an error asking you to contact your administrator might appear. If possible, discuss this with your company's IT service; if it cannot be resolved, contact us so that you can still follow the course easily.

It might also happen that your antivirus blocks R from downloading packages.

To check that everything works, try to install your first package. In RStudio, go to the menu Tools -> Install Packages... and choose the package you need (for example "ISwR"). In the RGui under Windows, go to the menu Packages -> Install package(s). In the console, you can use the install.packages command, for example install.packages("ISwR"). Once the package is installed, loading it with library("ISwR") should not give an error.
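
The same check can also be scripted from the console. The following is a minimal sketch, assuming a CRAN mirror is reachable from your machine; the helper name `ensure_package` is made up for illustration and is not part of the course material.

```r
# Hypothetical helper: install a package only if it is not already available,
# then report whether its namespace can be loaded.
ensure_package <- function(pkg) {
  if (!requireNamespace(pkg, quietly = TRUE)) {
    install.packages(pkg)   # may fail if downloads are blocked
  }
  requireNamespace(pkg, quietly = TRUE)
}

ensure_package("stats")   # ships with R, so nothing is downloaded
# ensure_package("ISwR")  # uncomment to test a real download from CRAN
```

If the call for "ISwR" returns FALSE or prints an error, package installation is probably blocked on your computer.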

If there is any problem, contact us.
