
Regarding the baseline performance and hyperparameters #4

Open
Arnav0400 opened this issue Mar 4, 2024 · 0 comments

Hello Authors,

Thanks for your work and this codebase. I have some questions regarding your implementation and the hyperparameters of baselines such as LoRA.
Specifically:

  1. What rank is used for LoRA, and which layers have a LoRA branch (attention or MLP)? Does adding LoRA to all linear layers improve performance (as shown in recent works)? Were any ablations done on basic LoRA for visual tasks?
  2. Since LoRA can be fully re-parameterized into the base weights after tuning, it should arguably be included in the "no extra structure" section (see the merge sketch after this list).

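For reference on point 2, here is a minimal sketch of what I mean, assuming a standard PyTorch-style LoRA formulation (the class name, rank `r`, and scaling `alpha` are hypothetical choices, not taken from this repo): the low-rank update `B @ A` can be folded back into the frozen weight after tuning, so inference sees a plain linear layer with no extra structure.

```python
import torch
import torch.nn as nn

class LoRALinear(nn.Module):
    """Frozen base linear layer plus a trainable low-rank branch (illustrative sketch)."""
    def __init__(self, base: nn.Linear, r: int = 4, alpha: float = 8.0):
        super().__init__()
        self.base = base
        self.base.weight.requires_grad_(False)   # base weight stays frozen during tuning
        # Low-rank factors: delta_W = B @ A, scaled by alpha / r
        self.A = nn.Parameter(torch.randn(r, base.in_features) * 0.01)
        self.B = nn.Parameter(torch.zeros(base.out_features, r))
        self.scale = alpha / r

    def forward(self, x):
        # Base projection plus the scaled low-rank update
        return self.base(x) + (x @ self.A.T @ self.B.T) * self.scale

    @torch.no_grad()
    def merge(self) -> nn.Linear:
        # Fold the low-rank update into the base weight: W' = W + scale * (B @ A).
        # After merging, the wrapper can be replaced by the returned nn.Linear,
        # so the tuned model has exactly the baseline architecture at inference.
        self.base.weight += self.scale * (self.B @ self.A)
        return self.base
```
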
I am planning to extend your work to much smaller models, so any insights would be greatly appreciated.

Regards,
Arnav
