
Very long training time #16

Open
david-az opened this issue Dec 7, 2021 · 1 comment

Comments


david-az commented Dec 7, 2021

Hi, thank you for your great work.

The number of FLOPs and the number of parameters are lower than Swin Transformer's, yet the training time of HRFormer is at least 2× longer than Swin's and 3× longer than HRNet's.
I suspect the gradient computation is slow because of the many reshape operations. Is there a way to optimize that?

Thank you

@PkuRainBow
Collaborator

@david-az Good question!

Currently, we do not have any plans or solutions for reducing the training time cost of HRFormer.
Note that HRFormer already uses a much smaller network depth than the original HRNet.
Any suggestions on optimizing the training time are welcome!
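To illustrate why reshape-heavy attention can be slow despite low FLOP counts, here is a minimal, hypothetical sketch (not HRFormer code, and the shapes are made up): the window-style reshape/permute forces a full memory copy that performs zero FLOPs, so it never shows up in a FLOP count but still consumes wall-clock time in both the forward and backward pass.

```python
# Hypothetical sketch: FLOP counts ignore memory-bound ops.
# A window-attention-style reshape+permute copies the whole activation
# tensor through memory while doing zero FLOPs, so wall-clock time can
# exceed what the FLOP count suggests.
import time
import numpy as np

# Example activation: (batch, H, W, channels). Shapes are illustrative only.
x = np.random.rand(8, 56, 56, 96).astype(np.float32)

# Split H and W into 7x7 windows, then move the window axes forward.
t0 = time.perf_counter()
for _ in range(10):
    w = x.reshape(8, 8, 7, 8, 7, 96).transpose(0, 1, 3, 2, 4, 5)
    w = np.ascontiguousarray(w)  # forces the actual memory copy
copy_time = time.perf_counter() - t0

# A compute-bound op for comparison.
a = np.random.rand(1024, 1024).astype(np.float32)
t0 = time.perf_counter()
for _ in range(10):
    b = a @ a
matmul_time = time.perf_counter() - t0

print(f"windowing copies: {copy_time:.4f}s  matmuls: {matmul_time:.4f}s")
```

In PyTorch one would see the same effect from `.permute(...).contiguous()` chains; profiling a training step (e.g. with `torch.profiler`) would show how much time such copy kernels take relative to the matmuls, which is one way to test the reshape hypothesis.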
