state_dict keys don't match load_state_dict #865
Comments
I remember @younesbelkada had to work on that to make the models work in …
It looks like you are trying to load from a PEFT model state dict? In that case you only need to load the `v_head`, as all other parameters are kept untouched, right? `model.load_state_dict(model.state_dict(), strict=False)` should do the trick.
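A minimal sketch of that suggestion, using plain `nn.Linear` modules as stand-ins for the value-head model (the modules and variable names here are illustrative, not trl's API):

```python
import torch
import torch.nn as nn

# Stand-ins for the real models; the issue concerns trl's
# AutoModelForCausalLMWithValueHead, but any nn.Module behaves the same.
model = nn.Linear(4, 2)
target = nn.Linear(4, 2)

# strict=False skips keys that don't line up instead of raising,
# and returns a report of what was missing or unexpected.
result = target.load_state_dict(model.state_dict(), strict=False)
print(result.missing_keys, result.unexpected_keys)  # -> [] [] (names match here)
```

With matching architectures nothing is skipped; when keys genuinely differ, `strict=False` silently leaves the unmatched parameters untouched, which is why it only helps if the mismatched parameters really don't need to be loaded.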
Actually, I think all parameters are trained, not only the `v_head`, since I'm using the language model as the policy (I'm not doing RLHF following the tutorials; I'm doing research on RL with LLMs and have a custom training loop).
I'm now wondering what was the reason to change it in the first place.
This issue has been automatically marked as stale because it has not had recent activity. If you think this still needs to be addressed please comment on this thread.
I'm using `AutoModelForCausalLMWithValueHead` with a custom training loop. I was trying to copy the weights of the model to a target model with the same architecture when I noticed that the keys returned by `state_dict()` don't match the keys expected by `load_state_dict()`.
Apparently it's just a key mismatch, and modifying the `state_dict` function solves it.
I don't know if this was done to make some high-level API work, but I feel like the basic PyTorch API should still remain compatible.
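To illustrate the kind of key mismatch and the "modify the state dict" workaround, here is a hypothetical toy reproduction — the `Wrapper` class, its prefix-stripping `state_dict`, and all names are invented for illustration and are not trl's actual implementation:

```python
import torch
import torch.nn as nn

class Wrapper(nn.Module):
    """Toy analogue of a value-head wrapper (hypothetical; not trl's code).

    Its state_dict() strips the inner-model prefix, reproducing the
    kind of key mismatch described in the issue.
    """
    def __init__(self):
        super().__init__()
        self.inner = nn.Linear(4, 2)   # stand-in for the language model
        self.v_head = nn.Linear(2, 1)  # stand-in for the value head

    def state_dict(self, *args, **kwargs):
        sd = super().state_dict(*args, **kwargs)
        # Emulate the mismatch: keys come back without the "inner." prefix.
        return {k.removeprefix("inner."): v for k, v in sd.items()}

src = Wrapper()
dst = Wrapper()

# load_state_dict still expects the original parameter names, so re-add
# the prefix for everything except the v_head before loading.
fixed = {(k if k.startswith("v_head.") else "inner." + k): v
         for k, v in src.state_dict().items()}
dst.load_state_dict(fixed)  # strict load now succeeds
```

Remapping the keys back to what `load_state_dict` expects lets a strict load go through without touching either module's code, which is handy when copying weights into a target network of the same architecture.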