Hello! I ran into a problem while reading your code, specifically the implementation of the 'lr_mul' trick.
Your implementation is as follows:

```python
param_groups = [
    {'params': base_params, 'lr_mult': 0.0},
    {'params': new_params, 'lr_mult': 1.0},
]
```
And in the open-source code of your MS loss, the implementation is:

```python
def build_optimizer(cfg, model):
    params = []
    for key, value in model.named_parameters():
        if not value.requires_grad:
            continue
        lr_mul = 1.0
        if "backbone" in key:
            lr_mul = 0.1
        params += [{"params": [value], "lr_mul": lr_mul}]
    optimizer = getattr(torch.optim, cfg.SOLVER.OPTIMIZER_NAME)(
        params, lr=cfg.SOLVER.BASE_LR, weight_decay=cfg.SOLVER.WEIGHT_DECAY
    )
    return optimizer
```
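My guess is that 'lr_mul' is just a custom key: PyTorch stores unknown keys in each param group but never reads them during `step()`, so the training code would have to apply the multiplier itself. Something like this minimal sketch (the `apply_lr_mul` helper is my own illustration, not from either repo):

```python
import torch

# Hypothetical helper (not from either repo): PyTorch keeps extra keys such
# as "lr_mul" in optimizer.param_groups but ignores them, so the training
# loop or scheduler has to fold the multiplier into "lr" itself.
def apply_lr_mul(optimizer, base_lr):
    for group in optimizer.param_groups:
        # Groups without the key fall back to a multiplier of 1.0.
        group["lr"] = base_lr * group.get("lr_mul", 1.0)

model = torch.nn.Linear(10, 2)
optimizer = torch.optim.SGD(
    [{"params": model.parameters(), "lr_mul": 0.1}], lr=1e-2
)
apply_lr_mul(optimizer, base_lr=1e-2)
print(optimizer.param_groups[0]["lr"])  # 1e-2 * 0.1 -> 0.001
```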
But in https://pytorch.org/docs/master/optim.html there is no 'lr_mul'; per-parameter-group learning rates are set with the 'lr' key instead:

```python
optim.SGD([
    {'params': model.base.parameters()},
    {'params': model.classifier.parameters(), 'lr': 1e-3}
], lr=1e-2, momentum=0.9)
```
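If I understand correctly, with a base lr of 1e-2 an 'lr_mul' of 0.1 would give the same effective rate as writing 'lr': 1e-3 directly, so the two styles differ only in whether PyTorch or the repo's own code does the scaling. A small sketch of the equivalent built-in form (the model here is my own example):

```python
import torch
import torch.nn as nn

# My own sketch: an "lr_mul" of 0.1 on a backbone-like group with base lr
# 1e-2 is equivalent to setting that group's "lr" to 1e-3 directly, which
# is the mechanism the PyTorch docs show.
model = nn.Sequential(nn.Linear(8, 8), nn.Linear(8, 2))
optimizer = torch.optim.SGD(
    [
        {"params": model[0].parameters(), "lr": 1e-2 * 0.1},  # "backbone"
        {"params": model[1].parameters()},  # uses the base lr below
    ],
    lr=1e-2,
    momentum=0.9,
)
```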
I want to know: is there any difference between 'lr_mul' and 'lr_mult'? I can't find any information about that, and the PyTorch documentation doesn't explain either 'lr_mul' or 'lr_mult'. (I did find that 'lr_mult' is used in the Caffe framework.) I'm confused about this and would appreciate your help.
Thank you!