Incorporate Ben's feedback (#6)
* Fix setup

* Add feedback

* Start to address todos

* Additional changes and add readme outline

* Test on aws machine

---------

Co-authored-by: Benjamin Gallusser <[email protected]>
msschwartz21 and bentaculum authored Aug 20, 2023
1 parent faad95b commit 904eecf
Showing 5 changed files with 800 additions and 3,039 deletions.
31 changes: 29 additions & 2 deletions README.md
@@ -6,7 +6,7 @@ Make sure that you are inside of the `intro_ml` folder by using the `cd` command

Run the setup script to create the environment for this exercise and download the dataset.
```bash
-sh setup.sh
+source setup.sh
```

Launch a jupyter environment
@@ -15,4 +15,31 @@ Launch a jupyter environment
jupyter lab
```

-...and continue with the instructions in the notebook.
+...and continue with the instructions in the notebook `exercise.ipynb`.

## Exercise

### Part A: The Linear Classifier
We will implement a basic linear classifier from scratch and train it to predict the cell cycle on a flow cytometry dataset.

You will learn
- How to prepare a dataset for training, including
  - Checking for class imbalance (Task 1.1)
  - Correcting class imbalance (Task 1.2)
  - Converting categorical data to one-hot encoding (Task 1.3)
- The basic math behind a linear classifier
- How to evaluate model performance (Tasks 2.1 - 2.3)
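
The steps listed above can be sketched on toy data. This is a minimal illustration, not the notebook's implementation: the label values, the downsampling strategy for correcting imbalance, the feature count, and the softmax output are all assumptions made for the example.

```python
import numpy as np

# Hypothetical toy labels: three classes encoded as the integers 0, 1, 2.
labels = np.array([0, 0, 0, 0, 1, 1, 2])

# Check for class imbalance by counting the samples in each class.
classes, counts = np.unique(labels, return_counts=True)

# One simple correction (an assumption, other strategies exist):
# downsample every class to the size of the rarest class.
rng = np.random.default_rng(0)
n_min = counts.min()
balanced_idx = np.concatenate(
    [rng.choice(np.flatnonzero(labels == c), size=n_min, replace=False) for c in classes]
)
balanced_labels = labels[balanced_idx]

# One-hot encoding: row i has a single 1 in the column of sample i's class.
one_hot = np.eye(len(classes))[balanced_labels]

# Linear classifier: scores = X @ W + b, converted to probabilities with a softmax.
n_features = 4  # hypothetical number of input features
X = rng.normal(size=(len(balanced_labels), n_features))
W = np.zeros((n_features, len(classes)))  # untrained weights
b = np.zeros(len(classes))                # untrained biases
scores = X @ W + b
exp_scores = np.exp(scores - scores.max(axis=1, keepdims=True))  # subtract max for stability
probs = exp_scores / exp_scores.sum(axis=1, keepdims=True)

# Evaluate: accuracy is the fraction of samples whose most likely class is correct.
predicted = probs.argmax(axis=1)
accuracy = (predicted == balanced_labels).mean()
```

With untrained (all-zero) weights every class gets the same probability, so the accuracy here only reflects chance; training would update `W` and `b`.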

### Part B: Random Forest Classifier
We will learn about Random Forest Classifiers and use `scikit-learn` to train one on our dataset.

You will learn
- How to use `scikit-learn`'s model objects
- How to perform a hyperparameter search to optimize model performance (Tasks 3.1 - 4.1)
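
A hyperparameter search with `scikit-learn` might look roughly like the sketch below. The synthetic data from `make_classification` and the parameter grid are placeholders chosen for the example; the notebook's dataset and search settings may differ.

```python
from sklearn.datasets import make_classification
from sklearn.ensemble import RandomForestClassifier
from sklearn.model_selection import GridSearchCV, train_test_split

# Synthetic stand-in for the exercise dataset (an assumption for this sketch).
X, y = make_classification(
    n_samples=200, n_features=8, n_informative=4, n_classes=3, random_state=0
)
X_train, X_test, y_train, y_test = train_test_split(
    X, y, test_size=0.25, random_state=0, stratify=y
)

# Hypothetical grid: every combination is scored with 3-fold cross-validation.
param_grid = {"n_estimators": [10, 50], "max_depth": [3, None]}
search = GridSearchCV(RandomForestClassifier(random_state=0), param_grid, cv=3)
search.fit(X_train, y_train)

best_params = search.best_params_            # best combination found on the training folds
test_accuracy = search.score(X_test, y_test)  # accuracy of the refit best model on held-out data
```

Scoring on a held-out test set that the search never saw gives a less optimistic performance estimate than the cross-validation score used to pick the parameters.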

### Part C: Feature Engineering
We will explore image filters and see if they can improve the performance of either our linear or random forest classifier.

You will learn
- How to use `scikit-image`'s filter modules (Task 5.1)
- How to reuse what we have done so far to build a new dataset on filtered images and train your own model on the new dataset (Task 5.2)
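
As a rough sketch of building a dataset from filtered images, the example below applies a Gaussian filter from `scikit-image` to a batch of random images and flattens each result into a feature vector. The image size, the choice of filter, and `sigma` are assumptions for illustration only.

```python
import numpy as np
from skimage import filters

# Hypothetical batch of 5 grayscale images, 32 x 32 pixels each, values in [0, 1).
rng = np.random.default_rng(0)
images = rng.random((5, 32, 32))

# Apply the same filter to every image in the batch.
filtered = np.stack([filters.gaussian(img, sigma=1) for img in images])

# Flatten each filtered image into one row so it can be fed to a classifier.
X_filtered = filtered.reshape(len(filtered), -1)
```

The resulting `X_filtered` array has one row per image and can be passed to either classifier in place of the unfiltered features.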