Hi, thank you for your great work.
The number of FLOPs and the number of parameters are lower than those of Swin Transformer; however, the training time of HRFormer is at least 2× longer than Swin's and 3× longer than HRNet's.
I suspect the gradient computation is slow because of the many reshape operations. Is there a way to optimize that?
Thank you
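For reference, here is a minimal sketch of how one might check whether reshape/permute/copy kernels actually dominate the backward pass. This assumes PyTorch's `torch.profiler` and a CUDA device; the `WindowReshapeBlock` below is a hypothetical stand-in for a window-attention-style block, not code from the HRFormer repo:

```python
import torch
import torch.nn as nn
from torch.profiler import profile, ProfilerActivity

class WindowReshapeBlock(nn.Module):
    """Toy stand-in for a local-window block: the repeated
    view/permute/contiguous pattern is what we want to measure."""
    def __init__(self, dim=96, window=7):
        super().__init__()
        self.window = window
        self.proj = nn.Linear(dim, dim)

    def forward(self, x):  # x: (B, H, W, C)
        B, H, W, C = x.shape
        w = self.window
        # Partition into (B * num_windows, w*w, C): several views/copies.
        x = x.view(B, H // w, w, W // w, w, C)
        x = x.permute(0, 1, 3, 2, 4, 5).contiguous()
        x = x.view(-1, w * w, C)
        x = self.proj(x)
        # Reverse the window partition.
        x = x.view(B, H // w, W // w, w, w, C)
        x = x.permute(0, 1, 3, 2, 4, 5).contiguous()
        return x.view(B, H, W, C)

block = WindowReshapeBlock().cuda()
inp = torch.randn(8, 56, 56, 96, device="cuda", requires_grad=True)

with profile(activities=[ProfilerActivity.CPU, ProfilerActivity.CUDA]) as prof:
    block(inp).sum().backward()

# Sort by CUDA time: if copy/permute kernels outweigh the matmuls,
# the reshapes are indeed the bottleneck.
print(prof.key_averages().table(sort_by="cuda_time_total", row_limit=10))
```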
Currently, we do not have any plans or a solution for reducing the training time of HRFormer.
Note that HRFormer already uses a much smaller network depth than the original HRNet.
Any suggestions for optimizing the training time are WELCOME!
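As one general suggestion (not an HRFormer-specific fix for the reshape overhead), mixed-precision training usually cuts wall-clock time per step. A minimal sketch assuming PyTorch's `torch.cuda.amp`; the tiny convolutional model, dummy batch, and hyperparameters are placeholders so the loop runs end to end:

```python
import torch
import torch.nn as nn

device = "cuda"
# Placeholder model; in practice this would be the HRFormer network.
model = nn.Sequential(
    nn.Conv2d(3, 64, 3, padding=1), nn.ReLU(),
    nn.AdaptiveAvgPool2d(1), nn.Flatten(),
    nn.Linear(64, 10),
).to(device)
criterion = nn.CrossEntropyLoss()
optimizer = torch.optim.AdamW(model.parameters(), lr=1e-4)
scaler = torch.cuda.amp.GradScaler()

images = torch.randn(16, 3, 224, 224, device=device)  # dummy batch
targets = torch.randint(0, 10, (16,), device=device)

for step in range(10):
    optimizer.zero_grad(set_to_none=True)
    # Run forward + loss in float16 where safe to speed up matmuls/convs.
    with torch.cuda.amp.autocast():
        loss = criterion(model(images), targets)
    scaler.scale(loss).backward()  # scale loss to avoid fp16 underflow
    scaler.step(optimizer)
    scaler.update()
```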