
Very long training time #16

Open
david-az opened this issue Dec 7, 2021 · 1 comment

Comments


david-az commented Dec 7, 2021

Hi, thank you for your great work.

The number of FLOPs and the number of parameters are lower than Swin Transformer's, yet the training time of HRFormer is at least 2× longer than Swin's and 3× longer than HRNet's.
I suspect the gradient computation is slow because of the many reshape operations. Is there a way to optimize that?

Thank you

@PkuRainBow
Collaborator

@david-az Good question!

Currently, we do not have any plans or solutions for reducing the training time cost of HRFormer.
Note that HRFormer already uses a much smaller network depth than the original HRNet.
Any suggestions on optimizing the training time are welcome!
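To illustrate why reshape-heavy attention can be slow despite low FLOP counts, here is a minimal, hypothetical sketch (not HRFormer code, and the shapes are made up): the window-style reshape/permute forces a full memory copy that performs zero FLOPs, so it never shows up in a FLOP count but still consumes wall-clock time in both the forward and backward pass.

```python
# Hypothetical sketch: FLOP counts ignore memory-bound ops.
# A window-attention-style reshape+permute copies the whole activation
# tensor through memory while doing zero FLOPs, so wall-clock time can
# exceed what the FLOP count suggests.
import time
import numpy as np

# Example activation: (batch, H, W, channels). Shapes are illustrative only.
x = np.random.rand(8, 56, 56, 96).astype(np.float32)

# Split H and W into 7x7 windows, then move the window axes forward.
t0 = time.perf_counter()
for _ in range(10):
    w = x.reshape(8, 8, 7, 8, 7, 96).transpose(0, 1, 3, 2, 4, 5)
    w = np.ascontiguousarray(w)  # forces the actual memory copy
copy_time = time.perf_counter() - t0

# A compute-bound op for comparison.
a = np.random.rand(1024, 1024).astype(np.float32)
t0 = time.perf_counter()
for _ in range(10):
    b = a @ a
matmul_time = time.perf_counter() - t0

print(f"windowing copies: {copy_time:.4f}s  matmuls: {matmul_time:.4f}s")
```

In PyTorch one would see the same effect from `.permute(...).contiguous()` chains; profiling a training step (e.g. with `torch.profiler`) would show how much time such copy kernels take relative to the matmuls, which is one way to test the reshape hypothesis.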
