Describe the bug
The lr_scheduler only takes effect after the optimizer steps. This is a logical error, since the learning rate used for the first optimization step is not produced by the lr_scheduler. Concretely, if I initialize DeepSpeed with an optimizer whose lr = x and a WarmupDecayLR scheduler with warmup_min_lr = y, the first step uses learning rate x (instead of y), and only the second step uses y.
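A minimal reproduction sketch (not part of the original report): it assumes a single-GPU run launched with the `deepspeed` launcher, an Adam optimizer, and illustrative config values. Printing the active learning rate before each `engine.step()` shows the first step running with the optimizer's lr (x) rather than the scheduler's warmup_min_lr (y).

```python
# Minimal sketch with illustrative values; exact output depends on the DeepSpeed version.
import torch
import deepspeed

model = torch.nn.Linear(10, 1)

ds_config = {
    "train_batch_size": 1,
    "optimizer": {"type": "Adam", "params": {"lr": 0.1}},  # lr = x
    "scheduler": {
        "type": "WarmupDecayLR",
        "params": {
            "warmup_min_lr": 0.001,                        # warmup_min_lr = y
            "warmup_max_lr": 0.1,
            "warmup_num_steps": 100,
            "total_num_steps": 1000,
        },
    },
}

engine, optimizer, _, lr_scheduler = deepspeed.initialize(
    model=model, model_parameters=model.parameters(), config=ds_config
)

for step in range(2):
    # Learning rate that the upcoming optimizer step will actually use.
    print(f"step {step}: lr = {optimizer.param_groups[0]['lr']}")
    loss = engine(torch.randn(1, 10, device=engine.device)).sum()
    engine.backward(loss)
    # Inside engine.step(), optimizer.step() runs before lr_scheduler.step(),
    # so the scheduler's learning rate only applies from the second step on.
    engine.step()
```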
Perm link:
https://github.com/microsoft/DeepSpeed/blob/3d347276ce80e1a29e777c839d1d7fabe8e5f034/deepspeed/runtime/engine.py#L2109C28-L2109C64