fhdsl · caalo · Aug 27, 2024
diff --git a/01-intro-to-computing.Rmd b/01-intro-to-computing.Rmd
@@ -195,7 +195,9 @@ Some types, such as ints, are able to use a more efficient algorithm when
 invoked using the three argument form.
 ```
 
-This shows the function takes in three input arguments: `base`, `exp`, and `mod=None`. When an argument has an assigned value of `mod=None`, that means the input argument already has a value, and you don't need to specify anything, unless you want to.
+We can also find a similar help document in a [nicer rendered form online](https://docs.python.org/3/library/functions.html#pow). We will practice reading function documentation throughout the course, because it is a fundamental skill for learning new functions on your own.
+
+The documentation shows that the function takes three input arguments: `base`, `exp`, and `mod=None`. When an argument is written with an assigned value, such as `mod=None`, it has a default value, so you don't need to specify it unless you want to.
 
 The following are equivalent ways of using the `pow()` function:
 
@@ -219,11 +221,11 @@ And there is an operational equivalent:
 
 We will mostly look at functions with input arguments and return types in this course, but not all functions need to have input arguments or return a value. Let's look at some examples of functions that don't always have an input or output:
 
-| Function call | What it takes in         | What it does                                                  | Returns |
-|---------------|---------------|----------------------------|---------------|
-| `pow(a, b)`   | integer `a`, integer `b` | Raises `a` to the `b`th power.                                | Integer |
-| `print(x)`    | any data type `x`        | Prints out the value of `x` to the console.                   | None    |
-| `dir()`       | Nothing                  | Gives a list of all the variables defined in the environment. | List    |
+| Function call                                                        | What it takes in         | What it does                                                  | Returns |
+|----------------|----------------|-------------------------|----------------|
+| [`pow(a, b)`](https://docs.python.org/3/library/functions.html#pow)  | integer `a`, integer `b` | Raises `a` to the `b`th power.                                | Integer |
+| [`print(x)`](https://docs.python.org/3/library/functions.html#print) | any data type `x`        | Prints out the value of `x` to the console.                   | None    |
+| [`dir()`](https://docs.python.org/3/library/functions.html#dir)      | Nothing                  | Gives a list of all the variables defined in the environment. | List    |
 
 ## Tips on writing your first code
 

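A minimal sketch of the default-argument behavior described in the `pow()` hunk above: because `mod` defaults to `None`, it can be left out, and the arguments can be given by position or by name. The variable names are only illustrative, and passing `base` and `exp` by keyword assumes Python 3.8 or newer.

```python
# Positional arguments: base first, then exp; mod keeps its default of None.
by_position = pow(2, 3)

# Keyword arguments (Python 3.8+): the order no longer matters.
by_keyword = pow(exp=3, base=2)

# The equivalent operator form.
by_operator = 2 ** 3

# Supplying the optional third argument computes (base ** exp) % mod.
with_mod = pow(2, 3, 5)

print(by_position, by_keyword, by_operator, with_mod)
```

The first three calls give the same result, which is why `mod` only needs to be written out when modular arithmetic is actually wanted.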
diff --git a/02-data-structures.Rmd b/02-data-structures.Rmd
@@ -105,20 +105,20 @@ Object methods are functions that does something with the object you are using i
 
 Here are some more examples of methods with lists:
 
-| Function method    | What it takes in             | What it does                                                          | Returns                          |
-|----------------|----------------|-------------------------------------|------------------|
-| `chrNum.count(x)`  | list `chrNum`, data type `x` | Counts the number of instances `x` appears as an element of `chrNum`. | Integer                          |
-| `chrNum.append(x)` | list `chrNum`, data type `x` | Appends `x` to the end of the `chrNum`.                               | None (but `chrNum` is modified!) |
-| `chrNum.sort()`    | list `chrNum`                | Sorts `chrNum` by ascending order.                                    | None (but `chrNum` is modified!) |
-| `chrNum.reverse()` | list `chrNum`                | Reverses the order of `chrNum`.                                       | None (but `chrNum` is modified!) |
+| Function method                                                              | What it takes in             | What it does                                                          | Returns                          |
+|---------------|---------------|---------------------------|---------------|
+| [`chrNum.count(x)`](https://docs.python.org/3/tutorial/datastructures.html)  | list `chrNum`, data type `x` | Counts the number of times `x` appears as an element of `chrNum`.     | Integer                          |
+| [`chrNum.append(x)`](https://docs.python.org/3/tutorial/datastructures.html) | list `chrNum`, data type `x` | Appends `x` to the end of `chrNum`.                                   | None (but `chrNum` is modified!) |
+| [`chrNum.sort()`](https://docs.python.org/3/tutorial/datastructures.html)    | list `chrNum`                | Sorts `chrNum` in ascending order.                                    | None (but `chrNum` is modified!) |
+| [`chrNum.reverse()`](https://docs.python.org/3/tutorial/datastructures.html) | list `chrNum`                | Reverses the order of `chrNum`.                                       | None (but `chrNum` is modified!) |
 
 ## Dataframes
 
 A Dataframe is a two-dimensional data structure that stores data like a spreadsheet does.
 
 The Dataframe data structure is found within a Python module called "Pandas". A Python module is an organized collection of functions and data structures. The `import` statement below gives us permission to access the "Pandas" module via the variable `pd`.
 
-To load in a Dataframe from existing spreadsheet data, we use the function `pd.read_csv()`:
+To load in a Dataframe from existing spreadsheet data, we use the function [`pd.read_csv()`](https://pandas.pydata.org/docs/reference/api/pandas.read_csv.html):
 
 ```{python}
 import pandas as pd
@@ -127,7 +127,7 @@ metadata = pd.read_csv("classroom_data/metadata.csv")
 type(metadata)
 ```
 
-There is a similar function `pd.read_excel()` for loading in Excel spreadsheets.
+There is a similar function [`pd.read_excel()`](https://pandas.pydata.org/docs/reference/api/pandas.read_excel.html) for loading in Excel spreadsheets.
 
 Let's investigate the Dataframe as an object:
 
@@ -166,7 +166,7 @@ metadata.shape
 
 ### What can a Dataframe do (in terms of operations and functions)?
 
-We can use the `head()` and `tail()` functions to look at the first few rows and last few rows of `metadata`, respectively:
+We can use the [`.head()`](https://pandas.pydata.org/pandas-docs/stable/reference/api/pandas.DataFrame.head.html) and [`.tail()`](https://pandas.pydata.org/pandas-docs/stable/reference/api/pandas.DataFrame.tail.html) methods to look at the first few rows and last few rows of `metadata`, respectively:
 
 ```{python}
 metadata.head()
@@ -179,7 +179,7 @@ Both of these functions (without input arguments) are considered as **methods**:
 
 Perhaps the most important operation you can do with Dataframes is subsetting them. There are two ways to do it. The first way is to subset by numerical indices, exactly like how we did for lists.
 
-You will use the `iloc` and bracket operations, and you give two slices: one for the row, and one for the column.
+You will use the [`iloc`](https://pandas.pydata.org/pandas-docs/stable/reference/api/pandas.DataFrame.iloc.html) and bracket operations, and you give two slices: one for the row, and one for the column.
 
 Let's start with a small dataframe to see how it works before returning to `metadata`:
 

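The "None (but `chrNum` is modified!)" entries in the list-method table of the `02-data-structures.Rmd` diff above can be checked with a short sketch. The list contents here are made up for illustration; only the method names come from the table.

```python
# Hypothetical example list; the name chrNum follows the table in the diff.
chrNum = [2, 3, 1, 2, 2]

n_twos = chrNum.count(2)      # returns an integer; the list is unchanged
result = chrNum.append(3)     # returns None, but chrNum now ends with 3
chrNum.sort()                 # sorts the list in place, ascending
sorted_copy = list(chrNum)    # copy the sorted state before reversing
chrNum.reverse()              # reverses the list in place

print(n_twos, result, sorted_copy, chrNum)
```

Because `.append()`, `.sort()`, and `.reverse()` return `None`, writing `chrNum = chrNum.sort()` would replace the list with `None`; calling the method on its own line is enough.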
diff --git a/03-data-wrangling1.Rmd b/03-data-wrangling1.Rmd
@@ -65,7 +65,7 @@ expression.head()
 ```
 
 | Dataframe  | The observation is | Some variables are            | Some values are             |
-|------------------|------------------|-------------------|------------------|
+|-----------------|-----------------|--------------------|------------------|
 | metadata   | Cell line          | ModelID, Age, OncotreeLineage | "ACH-000001", 60, "Myeloid" |
 | expression | Cell line          | KRAS_Exp                      | 2.4, .3                     |
 | mutation   | Cell line          | KRAS_Mut                      | TRUE, FALSE                 |
@@ -94,7 +94,7 @@ To subset for rows implicitly, we will use the conditional operators on Datafram
 metadata['OncotreeLineage'] == "Lung"
 ```
 
-Then, we will use the `.loc` operation (which is different than `.iloc` operation!) and subsetting brackets to subset rows and columns Age and Sex at the same time:
+Then, we will use the [`.loc`](https://pandas.pydata.org/docs/reference/api/pandas.DataFrame.loc.html) operation (which is different from the [`.iloc`](https://pandas.pydata.org/docs/reference/api/pandas.DataFrame.iloc.html) operation!) and subsetting brackets to subset the rows and the columns `Age` and `Sex` at the same time:
 
 ```{python}
 metadata.loc[metadata['OncotreeLineage'] == "Lung", ["Age", "Sex"]]
@@ -126,12 +126,12 @@ Now that your Dataframe has be transformed based on your scientific question, yo
 
 If we look at the data structure of a Dataframe's column, it is actually not a List, but an object called Series. It has methods that can compute summary statistics for us. Let's take a look at a few popular examples:
 
-| Function method                           | What it takes in                              | What it does                                                                  | Returns       |
-|----------------|----------------|-------------------------|----------------|
-| `metadata.Age.mean()`                     | `metadata.Age` as a numeric Series            | Computes the mean value of the `Age` column.                                  | Float (NumPy) |
-| `metadata['Age'].median()`                | `metadata['Age']` as a numeric Series         | Computes the median value of the `Age` column.                                | Float (NumPy) |
-| `metadata.Age.max()`                      | `metadata.Age` as a numeric Series            | Computes the max value of the `Age` column.                                   | Float (NumPy) |
-| `metadata.OncotreeSubtype.value_counts()` | `metadata.OncotreeSubtype` as a string Series | Creates a frequency table of all unique elements in `OncotreeSubtype` column. | Series        |
+| Function method                                                                                                           | What it takes in                              | What it does                                                                  | Returns       |
+|----------------|----------------|------------------------|----------------|
+| [`metadata.Age.mean()`](https://pandas.pydata.org/docs/reference/api/pandas.Series.mean.html)                             | `metadata.Age` as a numeric Series            | Computes the mean value of the `Age` column.                                  | Float (NumPy) |
+| [`metadata['Age'].median()`](https://pandas.pydata.org/docs/reference/api/pandas.Series.median.html)                      | `metadata['Age']` as a numeric Series         | Computes the median value of the `Age` column.                                | Float (NumPy) |
+| [`metadata.Age.max()`](https://pandas.pydata.org/docs/reference/api/pandas.Series.max.html)                               | `metadata.Age` as a numeric Series            | Computes the max value of the `Age` column.                                   | Float (NumPy) |
+| [`metadata.OncotreeSubtype.value_counts()`](https://pandas.pydata.org/docs/reference/api/pandas.Series.value_counts.html) | `metadata.OncotreeSubtype` as a string Series | Creates a frequency table of all unique elements in `OncotreeSubtype` column. | Series        |
 
 Let's try it out, with some nice print formatting:
 
@@ -144,10 +144,10 @@ Notice that the output of some of these methods are Float (NumPy). This refers t
 
 ## Simple data visualization
 
-We will dedicate extensive time later this course to talk about data visualization, but the Dataframe's column, Series, has a method called `.plot()` that can help us make simple plots for one variable. The `.plot()` method will by default make a line plot, but it is not necessary the plot style we want, so we can give the optional argument `kind` a String value to specify the plot style. We use it for making a histogram or bar plot.
+We will dedicate extensive time later in this course to data visualization, but the Dataframe's column, Series, has a method called [`.plot()`](https://pandas.pydata.org/docs/reference/api/pandas.Series.plot.html) that can help us make simple plots for one variable. The `.plot()` method makes a line plot by default, but that is not necessarily the plot style we want, so we can give the optional argument `kind` a String value to specify the plot style. We use it for making a histogram or bar plot.
 
 | Plot style | Useful for | kind = | Code                                                         |
-|-----------|-----------|-----------|--------------------------------------|
+|-------------|-------------|-------------|---------------------------------|
 | Histogram  | Numerics   | "hist" | `metadata.Age.plot(kind = "hist")`                           |
 | Bar plot   | Strings    | "bar"  | `metadata.OncotreeSubtype.value_counts().plot(kind = "bar")` |
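
The difference between `.iloc` and `.loc` touched on in the diffs above, together with the Series summary methods, can be sketched on a small stand-in table. The `toy` Dataframe and its values are hypothetical and are not the course's `metadata.csv`; only the column names are borrowed from the text.

```python
import pandas as pd

# Hypothetical stand-in for the course's metadata table; values are made up.
toy = pd.DataFrame({
    "ModelID": ["A", "B", "C", "D"],
    "OncotreeLineage": ["Lung", "Myeloid", "Lung", "Skin"],
    "Age": [60, 36, 72, 45],
    "Sex": ["Female", "Male", "Male", "Female"],
})

# .iloc selects by integer position: first two rows, first two columns.
by_position = toy.iloc[0:2, 0:2]

# .loc selects by label or boolean mask: Lung rows, columns Age and Sex.
is_lung = toy["OncotreeLineage"] == "Lung"
lung_subset = toy.loc[is_lung, ["Age", "Sex"]]

# Series methods summarize a single column.
mean_lung_age = lung_subset["Age"].mean()
lineage_counts = toy["OncotreeLineage"].value_counts()

print(by_position.shape, lung_subset.shape, mean_lung_age)
print(lineage_counts)
```

Building the boolean mask on its own line first makes it easier to inspect which rows will be kept before subsetting.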