Hello Authors,

Thanks for your work and this codebase. I had some questions regarding your implementation and the hyperparameters of baselines like LoRA.
Specifically:
- What rank do you use for LoRA?
- Which layers get a LoRA branch (attention, MLP, or both)?
- Does adding LoRA to all linear layers improve performance, as reported in recent works?
- Did you run any ablations on plain LoRA for visual tasks?
- Since LoRA is completely re-parameterizable post-tuning, it should arguably be listed in the "no extra structure" section (see the sketch after this list).
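For concreteness, here is a minimal LoRA sketch in PyTorch. It is my own illustration, not your implementation: `LoRALinear`, `r`, `alpha`, and `merge` are names I made up. It shows what the rank hyperparameter controls and why the low-rank update can be folded back into the frozen weight after tuning, leaving a plain `nn.Linear` with no extra structure at inference.

```python
# Minimal LoRA sketch (illustration only, not the repo's implementation).
import torch
import torch.nn as nn


class LoRALinear(nn.Module):
    """A frozen nn.Linear plus a trainable low-rank branch B @ A."""

    def __init__(self, base: nn.Linear, r: int = 4, alpha: float = 8.0):
        super().__init__()
        self.base = base
        self.base.weight.requires_grad_(False)
        if self.base.bias is not None:
            self.base.bias.requires_grad_(False)
        # Low-rank factors: A projects down to rank r, B projects back up.
        # B starts at zero so training begins from the base model's function.
        self.lora_A = nn.Parameter(torch.randn(r, base.in_features) * 0.01)
        self.lora_B = nn.Parameter(torch.zeros(base.out_features, r))
        self.scaling = alpha / r

    def forward(self, x):
        return self.base(x) + (x @ self.lora_A.T @ self.lora_B.T) * self.scaling

    @torch.no_grad()
    def merge(self) -> nn.Linear:
        """Fold the update into the base weight: W' = W + scaling * B @ A."""
        merged = nn.Linear(self.base.in_features, self.base.out_features,
                           bias=self.base.bias is not None)
        merged.weight.copy_(self.base.weight + self.scaling * (self.lora_B @ self.lora_A))
        if self.base.bias is not None:
            merged.bias.copy_(self.base.bias)
        return merged


# Sanity check: after (pretend) tuning, the merged plain Linear reproduces
# the LoRA-augmented forward pass exactly, i.e. no extra inference structure.
layer = LoRALinear(nn.Linear(64, 64), r=4)
with torch.no_grad():
    layer.lora_B.normal_(std=0.02)  # stand-in for a tuned B
x = torch.randn(2, 64)
assert torch.allclose(layer(x), layer.merge()(x), atol=1e-5)
```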
I am planning to extend your work to much smaller models, so any insights would be really appreciated.
Regards,
Arnav