
Update exercise for 2024 with feedback from 2023 #15

Merged (53 commits) on Aug 19, 2024
Conversation

@afoix (Contributor) commented Jul 25, 2024

This pull request includes:

@afoix afoix requested review from cmalinmayor and adjavon July 25, 2024 09:18
@cmalinmayor (Contributor)

@afoix How long did it take to train the denoising model for you? On my M3 Mac (no GPU, but usually quite speedy with PyTorch) it's over half an hour. We should test on the actual hardware, because if it is that slow, we need to make it faster.

@cmalinmayor (Contributor) left a comment

I still want to test the denoising part on the hardware to see how long it takes. I don't remember it being slow last year? Perhaps @adjavon has a better memory of this than I do. @adjavon, I'm also curious to get your opinion on the new denoising tasks; at the very least we should explain them a bit more.


# ### Train the denoiser on both MNIST and FashionMNIST
#
# In this section, we will perform the denoiser training once again, but this time on both MNIST and FashionMNIST datasets, and then try to apply the newly trained denoiser to a set of noisy test images.
@cmalinmayor (Contributor)

I think we need to think a bit more about what the point of this part is. Do we actually want them to train sequentially and then shuffled? I think it would be fine to just train shuffled and show that it works and is the correct way to get good(ish?) performance on both. If we want to include a sequential example, I think it would make more sense as essentially a fine-tuning procedure, where you fully train on MNIST and then on FashionMNIST, which would show how fine-tuned networks forget their initial task. The way it is implemented now, both datasets are included in both epochs, so it is not really sequential or shuffled, and I find it confusing what exactly we are trying to teach them. Also, if the denoiser takes longer than 10 minutes to train, we should definitely skip the sequential part.
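For reference, the shuffled joint-training setup being suggested boils down to the following. This is a minimal sketch: random tensors stand in for the real MNIST/FashionMNIST datasets (which would come from torchvision with a ToTensor transform), and all sizes are illustrative.

```python
import torch
from torch.utils.data import ConcatDataset, DataLoader, TensorDataset

# Stand-ins for the two denoising datasets; the real ones would be
# torchvision.datasets.MNIST / FashionMNIST (noisy input, clean target).
mnist = TensorDataset(torch.randn(100, 1, 28, 28), torch.randn(100, 1, 28, 28))
fashion = TensorDataset(torch.randn(100, 1, 28, 28), torch.randn(100, 1, 28, 28))

# ConcatDataset joins them end to end; shuffle=True then interleaves samples
# from both domains within every epoch, so the denoiser sees a mixed stream
# rather than one dataset after the other.
combined = ConcatDataset([mnist, fashion])
loader = DataLoader(combined, batch_size=32, shuffle=True)

print(len(combined))  # 200
```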

@afoix (Contributor, Author)

Hey @cmalinmayor, the denoiser definitely takes less time to train, at least on my machine with a GPU 😄 However, I am happy to skip the sequential part if you both prefer it to be removed. Let me know! :)

@adjavon (Contributor)

I added a comment below, but long story short: on the AWS hardware it is just fast enough to not be too annoying. I like having both because I like the lesson I learn from it (networks forget). I can have a go at making that a bit more on-the-nose in the text during my typos-and-things PR this afternoon :)
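The sequential variant being kept is essentially two back-to-back training runs on the same model, which is what produces the "networks forget" effect. A minimal sketch with toy tensors standing in for the noisy/clean image pairs (model size, noise level, and loader shapes are all illustrative):

```python
import torch
from torch import nn, optim
from torch.utils.data import DataLoader, TensorDataset

def toy_denoising_loader(seed):
    # Toy (noisy, clean) pairs; real data would be MNIST/FashionMNIST images.
    g = torch.Generator().manual_seed(seed)
    clean = torch.randn(64, 784, generator=g)
    noisy = clean + 0.3 * torch.randn(64, 784, generator=g)
    return DataLoader(TensorDataset(noisy, clean), batch_size=16)

def train(model, loader, epochs=1):
    # A plain training loop; calling it once per dataset is what makes the
    # procedure sequential (fine-tuning) rather than shuffled joint training.
    opt = optim.Adam(model.parameters(), lr=1e-3)
    loss_fn = nn.MSELoss()
    for _ in range(epochs):
        for noisy, clean in loader:
            opt.zero_grad()
            loss_fn(model(noisy), clean).backward()
            opt.step()

model = nn.Sequential(nn.Linear(784, 128), nn.ReLU(), nn.Linear(128, 784))
train(model, toy_denoising_loader(0))  # "pre-train" on task A (MNIST stand-in)
train(model, toy_denoising_loader(1))  # then fine-tune on task B; task-A
                                       # performance typically degrades
```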

@afoix (Contributor, Author) commented Aug 19, 2024

@afoix How long did it take to train the denoising model for you? On my M3 Mac (no GPU, but usually quite speedy with PyTorch) it's over half an hour. We should test on the actual hardware, because if it is that slow, we need to make it faster.

Hello @cmalinmayor, on my machine, using a GPU, each epoch takes only 25 seconds, and it's 5 epochs, so it's actually quite fast 😄

@adjavon (Contributor) commented Aug 19, 2024

I still want to test the denoising part on the hardware to see how long it takes. I don't remember it being slow last year? Perhaps @adjavon has a better memory of this than I do. @adjavon, I'm also curious to get your opinion on the new denoising tasks; at the very least we should explain them a bit more.

I'm testing the whole exercise today on the TA machine so we can see how long it takes.

@adjavon (Contributor) commented Aug 19, 2024

Notes while running the exercise on the TA machine:

  1. Something seems wrong with the normalization of the confusion matrices when I run this; it tops out at 30, so the colors don't really match the numbers. For instance, in the example below, the cells on row 4 with values 45, 17, and 34 are colored the same way as the diagonal cells with values >90.
     [Screenshot 2024-08-19 at 12 08 35 PM]
  2. Generally, the tqdm progress bars disappear after each epoch is done and turn into an odd recap (see first line). Not sure if this is on purpose, as I've never seen it before, but ignore it if it is!
     [Screenshot 2024-08-19 at 12 16 50 PM]
  3. The first denoising network takes about 40 s per epoch to train on the TA machine, just the right amount of time for me to go make a coffee; perfectly reasonable :)
  4. The second denoising network (on the concatenated dataset) takes 1 min 30 s per epoch on the TA machine; on the longer side, but still generally reasonable.
  5. Same for the third denoising network (shuffled combined). I am not against keeping it, as it does provide an important learning point (i.e. networks forget).
  6. A suggestion: it would be nice to have a little "Conclusion" at the final checkpoint about key takeaways.

I noticed a couple of typos and trivial things that I can fix, so I'll make a PR this afternoon for those :)
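On the disappearing progress bars: this looks like tqdm's leave=False behavior, under which each finished bar is cleared instead of staying on screen. A quick sketch to check (assuming tqdm is installed; leave=True is the default and keeps every finished bar visible):

```python
from tqdm.auto import tqdm  # tqdm.auto picks a notebook or console bar

done = 0
for epoch in range(2):
    # leave=False clears each epoch's bar once it completes, which can look
    # like the bar "disappears" into a one-line recap; leave=True keeps it.
    for _ in tqdm(range(100), desc=f"epoch {epoch}", leave=False):
        done += 1
```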

@cmalinmayor (Contributor)

Something seems wrong with the normalization of the confusion matrices when I run this; it tops out at 30, so the colors don't really match the numbers. For instance, in the example below, the cells on row 4 with values 45, 17, and 34 are colored the same way as the diagonal cells with values >90.

It seems like it is coloring based on absolute numbers, not percentages. Weird!! We should fix it.

@adjavon (Contributor) commented Aug 19, 2024

Something seems wrong with the normalization of the confusion matrices when I run this; it tops out at 30, so the colors don't really match the numbers. For instance, in the example below, the cells on row 4 with values 45, 17, and 34 are colored the same way as the diagonal cells with values >90.

It seems like it is coloring based on absolute numbers, not percentages. Weird!! We should fix it.

Will try to fix in my PR. Done.
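For the fix, one way to make the colormap match the printed percentages is to row-normalize the counts before plotting. A sketch in plain NumPy (scikit-learn's confusion_matrix also accepts normalize='true' to do the same per-row normalization):

```python
import numpy as np

def row_normalize(cm):
    # Convert raw counts to per-class percentages so the colormap reflects
    # per-row recall rather than absolute counts (which depend on class size).
    row_sums = cm.sum(axis=1, keepdims=True)
    return 100.0 * cm / np.maximum(row_sums, 1)

# Counts like the ones in the screenshot: rows with different totals make
# absolute-count coloring misleading.
cm = np.array([[90, 10,   0],
               [45, 17,  34],
               [ 5,  5, 190]])
norm = row_normalize(cm)
print(norm.sum(axis=1))  # each row now sums to 100
```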

@afoix (Contributor, Author) commented Aug 19, 2024

@adjavon @cmalinmayor thank you for your help! I am merging this now into main, ready for the course 🥳

@afoix afoix merged commit c4f083b into main Aug 19, 2024