Major BiC fix/reimplementation #1550
Pull Request Test Coverage Report for Build 7652723163
Warning: This coverage report may be inaccurate. We've detected an issue with your CI configuration that might affect the accuracy of this pull request's coverage report.
💛 - Coveralls
Thanks, the code looks ok. I see that we are missing a BiC reproducibility example in continual-learning-baselines. If you have time maybe you can add something there too.
I'm trying to reproduce a CIFAR experiment. Alas, I also found a bug in the
The results are far better than the ones obtained by using the previous version. Once tests are finished, I recommend merging.
@AntonioCarta In the code I used
The
I switched to Apart from the Python 3.11 tests (which, if I'm correct, are still being fixed), everything works fine.
Everything is ok. Just a minor comment about the test change.
This PR is a major re-implementation/global fix for BiC (paper)
The implementation found in Avalanche (and in other CL libraries as well) has various issues, mainly stemming from a misunderstanding of how the algorithm works. It's an honest mistake that I made too when reading the paper.
The official implementation of BiC can be found here (TF 1.x :/ ): https://github.com/wuyuebupt/LargeScaleIncrementalLearning/tree/master
BiC works like this:
The main issue with the Avalanche implementation is that BiC does not actually keep a separate bias correction layer for each experience. It keeps only a single pair of alpha and beta parameters (a single bias correction layer), which is then applied only to the "last" experience.
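As a rough illustration of the point above, here is a minimal pure-Python sketch of a single bias correction layer applied only to the classes of one experience. The names `BiasLayer`, `apply`, and `last_experience_classes` are hypothetical and are not the Avalanche API. The linear form `alpha * logit + beta` is the correction described in the BiC paper.

```python
from dataclasses import dataclass
from typing import List, Sequence


@dataclass
class BiasLayer:
    """Hypothetical stand-in for the single BiC bias correction layer."""

    alpha: float = 1.0  # scale; 1.0 and 0.0 together give the identity
    beta: float = 0.0   # shift

    def apply(self, logits: Sequence[float], classes: Sequence[int]) -> List[float]:
        """Correct only the logits whose index is in `classes`."""
        corrected = list(logits)
        for c in classes:
            corrected[c] = self.alpha * logits[c] + self.beta
        return corrected


# One layer for the whole model, not one per experience.
layer = BiasLayer(alpha=0.5, beta=-1.0)

logits = [2.0, 4.0, 6.0, 8.0]
last_experience_classes = [2, 3]  # assumed class split, for illustration only

corrected = layer.apply(logits, last_experience_classes)
print(corrected)  # [2.0, 4.0, 2.0, 3.0]
```

Logits of classes outside `last_experience_classes` pass through unchanged, which is the behavioural difference from keeping a permanent layer per experience.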
In practice:
current_experience-1. Bias correction is not applied to other classes (such as the ones in current_experience-t, t>=2, nor current_experience).
Bias correction is not applied to the activations from the current model. The usual classification loss is applied on the logits of the current model without bias correction.
current_experience
current_experience
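The training-phase behaviour described above can be sketched roughly as follows, in pure Python. This is only a sketch under stated assumptions: it assumes the corrected old-model logits serve as the distillation target, and that the two losses are mixed with a weight `lam = n_old / (n_old + n_new)`. All function names, class splits, and numeric values are hypothetical.

```python
import math
from typing import List, Sequence


def softmax(logits: Sequence[float], temperature: float = 1.0) -> List[float]:
    """Numerically stable softmax with an optional temperature."""
    scaled = [x / temperature for x in logits]
    m = max(scaled)
    exps = [math.exp(x - m) for x in scaled]
    total = sum(exps)
    return [e / total for e in exps]


def correct(logits: Sequence[float], classes: Sequence[int],
            alpha: float, beta: float) -> List[float]:
    """Apply alpha * logit + beta to the given classes only."""
    out = list(logits)
    for c in classes:
        out[c] = alpha * logits[c] + beta
    return out


def cross_entropy(logits: Sequence[float], target: int) -> float:
    """Plain classification loss on uncorrected logits."""
    return -math.log(softmax(logits)[target])


def distillation(old_logits: Sequence[float], new_logits: Sequence[float],
                 temperature: float = 2.0) -> float:
    """Soft-target cross-entropy over the classes known to the old model."""
    n_old = len(old_logits)
    p_old = softmax(old_logits, temperature)
    p_new = softmax(new_logits[:n_old], temperature)
    return -sum(po * math.log(pn) for po, pn in zip(p_old, p_new))


# Assumed toy setup: the old model knows classes 0-3, the current one 0-5.
prev_experience_classes = [2, 3]   # classes of current_experience-1
alpha, beta = 0.8, -0.5            # the single (alpha, beta) pair
old_logits = [1.0, 2.0, 3.0, 5.0]  # output of the frozen old model
new_logits = [0.5, 1.5, 2.5, 4.0, 6.0, 1.0]  # output of the current model
target = 4

# Old-model logits: corrected only for the classes of current_experience-1.
old_corrected = correct(old_logits, prev_experience_classes, alpha, beta)

# Current-model logits: used as-is, without bias correction.
loss_distill = distillation(old_corrected, new_logits)
loss_cls = cross_entropy(new_logits, target)

lam = len(old_logits) / len(new_logits)  # assumed weighting n_old / (n_old + n_new)
loss = lam * loss_distill + (1.0 - lam) * loss_cls
```

Note that `correct` is never called on `new_logits` during training in this sketch, matching the statement that the classification loss uses the current model's logits without bias correction.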
What Avalanche was doing (and I suspect FACIL too) was keeping a separate (permanent) bias correction layer for each experience. Those layers were then used in all phases (computing the distillation loss, classification loss, bias correction for new classes, test phase).
I noticed these problems when fixing other accidental bugs related to the freezing of the old model and bias layers, which are of course fixed in this version :).
To recap:
current_experience-1
, ...)