Update exercise for 2024 with feedback from 2023 #15
Conversation
…(shuffle off, then on)
@afoix How long did it take to train the denoising model for you? On my M3 mac (no GPU, but usually quite speedy with pytorch) it's over half an hour. We should test on the actual hardware, because if it is that slow we need to make it faster.
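For context, a minimal sketch of picking the fastest available PyTorch backend (CUDA, Apple MPS, or CPU); the variable name `denoiser` is illustrative, not from the exercise code:

```python
import torch

# Pick the fastest available backend: CUDA GPU, Apple MPS, or CPU fallback.
if torch.cuda.is_available():
    device = torch.device("cuda")
elif torch.backends.mps.is_available():
    device = torch.device("mps")
else:
    device = torch.device("cpu")

# Move the model (hypothetical name) to the chosen device before training.
denoiser = denoiser.to(device)
```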
# ### Train the denoiser on both MNIST and FashionMNIST
#
# In this section, we will perform the denoiser training once again, but this time on both MNIST and FashionMNIST datasets, and then try to apply the newly trained denoiser to a set of noisy test images.
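One common way to train on both datasets at once (a sketch under standard torchvision assumptions, not the exercise's actual code) is to concatenate them and let the DataLoader shuffle, so batches interleave both distributions:

```python
from torch.utils.data import ConcatDataset, DataLoader
from torchvision import datasets, transforms

transform = transforms.ToTensor()

mnist = datasets.MNIST("data", train=True, download=True, transform=transform)
fashion = datasets.FashionMNIST("data", train=True, download=True, transform=transform)

# Concatenate both datasets; shuffle=True interleaves MNIST and FashionMNIST
# samples so the denoiser sees both distributions throughout training.
combined = ConcatDataset([mnist, fashion])
loader = DataLoader(combined, batch_size=64, shuffle=True)
```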
I think we need to think a bit more about what the point of this part is. Do we actually want them to train sequentially and then shuffled? I think it would be fine to just train shuffled and show that it works and is the correct way to get good(ish?) performance on both. If we want to include a sequential example, it would make more sense as essentially a fine-tuning procedure, where you fully train on MNIST and then on FashionMNIST, which would show how fine-tuned networks forget their initial task. The way it is implemented now, both datasets are included in both epochs, so it is not really sequential or shuffled, and I find it confusing what exactly we are trying to teach them. Also, if the denoiser takes longer than 10 minutes to train, we should definitely skip the sequential version.
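A sketch of the fine-tuning framing described above, assuming a typical Gaussian-noise denoising loop; the function name, loaders, and hyperparameters are illustrative, not taken from the exercise:

```python
import torch
import torch.nn.functional as F

def train_denoiser(model, loader, device, epochs=5, noise_std=0.5):
    """One training stage: add Gaussian noise, regress back to the clean image."""
    opt = torch.optim.Adam(model.parameters(), lr=1e-3)
    model.train()
    for _ in range(epochs):
        for x, _ in loader:
            x = x.to(device)
            noisy = x + noise_std * torch.randn_like(x)
            loss = F.mse_loss(model(noisy), x)
            opt.zero_grad()
            loss.backward()
            opt.step()

# Stage 1: train_denoiser(model, mnist_loader, device)   -> record MNIST test loss
# Stage 2: train_denoiser(model, fashion_loader, device) -> re-measure MNIST test loss
# The MNIST loss after stage 2 is typically much worse: the fine-tuned
# network has "forgotten" its initial task.
```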
Hey @cmalinmayor the denoiser definitely takes less time to train, at least on my machine with a GPU 😄 However, I am happy to skip the sequential part if you both prefer it to be removed. Let me know! :)
I added a comment below, but long story short: on the AWS hardware it is just fast enough not to be too annoying. I like having both because I like the lesson I learn from it (networks forget). I can have a go at making that a bit more on-the-nose in the text during my afternoon typos-and-things PR :)
Hello @cmalinmayor on my machine, using a GPU, each epoch is only 25 seconds, and it's 5 epochs, so it's actually quite fast 😄
I'm testing the whole exercise today on the TA machine so we can see how long it takes.
It seems like it is coloring based on absolute numbers, not on percentages. Weird!! We should fix that.
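If the plot in question is something like a confusion matrix (an assumption; the comment doesn't name the figure), one way to fix this is to normalize rows to percentages before plotting, so the color scale reflects per-class rates rather than raw counts:

```python
import numpy as np
import matplotlib.pyplot as plt

# Toy matrix of raw counts (rows = true class, columns = predicted class).
counts = np.array([[50.0, 10.0], [5.0, 100.0]])

# Normalize each row to percentages so the color scale reflects
# per-class rates rather than absolute counts.
percent = 100.0 * counts / counts.sum(axis=1, keepdims=True)

plt.imshow(percent, vmin=0, vmax=100, cmap="viridis")
plt.colorbar(label="% of true class")
plt.show()
```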
Remaining fixes after review of updates
@adjavon @cmalinmayor thank you for your help! I am merging this now onto main, ready for the course 🥳
This pull request includes: