Could you provide example code for AdaLoRA finetuning decoder-only model? #2262
Comments
As there has been no response yet, here is how I would approach this:
My idea would be to work with the special tokens in the prompt. Depending on the model you are using (the prompt template varies across different model architectures), I would look for the …
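The suggestion above can be sketched as follows. This is a minimal, hedged example of locating where the assistant response begins by searching the tokenized sequence for the token-id subsequence that a model's chat template emits before the assistant turn; the marker ids below are made up, and with a real model you would obtain them from its tokenizer and chat template.

```python
def find_response_start(input_ids, marker_ids):
    """Return the index just after the first occurrence of marker_ids
    in input_ids, or -1 if the marker is not present."""
    n, m = len(input_ids), len(marker_ids)
    for i in range(n - m + 1):
        if input_ids[i:i + m] == marker_ids:
            return i + m
    return -1

# Example with hypothetical ids: [32001, 32002] marks the assistant turn
ids = [5, 6, 32001, 32002, 9, 10]
print(find_response_start(ids, [32001, 32002]))  # 4
```

Everything before the returned index belongs to the system/instruction part and can be excluded from the loss.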
Thank you for your response. Yes, I have successfully written custom code to set the labels of tokens before …
I'm glad to hear it worked, thanks for your suggestion @d-kleine. If there are no further questions, feel free to close the issue @SpeeeedLee.
This issue has been automatically marked as stale because it has not had recent activity. If you think this still needs to be addressed please comment on this thread.
Feature request
The current example of AdaLoRA uses facebook/bart-base. Since AdaLoRA requires hand-crafted loss calculations, would it be possible to provide some hints on how this can be done for a decoder-only LM (e.g., Llama-Instruct)?
Specifically, I would like to mask out the loss calculation on the instruction part or system prompt, focusing only on the assistant response.
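One common way to achieve this, sketched below under the assumption that the model's loss uses PyTorch's default `ignore_index` of -100 (as Hugging Face causal-LM models do), is to copy the input ids into labels and overwrite the prompt portion with -100 so those positions contribute no loss. The index where the response starts is an input here; how to find it depends on the model's chat template.

```python
IGNORE_INDEX = -100  # positions with this label are skipped by CrossEntropyLoss

def mask_prompt_labels(input_ids, response_start_idx):
    """Return labels where every token before the assistant response
    is set to IGNORE_INDEX so only the response contributes to the loss."""
    labels = list(input_ids)
    for i in range(min(response_start_idx, len(labels))):
        labels[i] = IGNORE_INDEX
    return labels

# Example: tokens 0..4 are system/instruction, response starts at index 5
input_ids = [101, 7, 8, 9, 42, 55, 56, 57]
print(mask_prompt_labels(input_ids, 5))
# [-100, -100, -100, -100, -100, 55, 56, 57]
```

The same masked labels can then be passed to the model during AdaLoRA fine-tuning; the loss (including AdaLoRA's orthogonality regularization added on top) is computed only over the assistant tokens.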
Motivation
AdaLoRA requires hand-crafted loss calculations, which becomes complex when one wants to mask out system/instruction tokens.
Your contribution
N.A.