
Regarding the baseline performance and hyperparameters #4

Open
Arnav0400 opened this issue Mar 4, 2024 · 0 comments

Hello Authors,

Thanks for your work and this codebase. I have some questions regarding your implementation and the hyperparameters of baselines such as LoRA.
Specifically:

  1. What rank is used for LoRA, and which layers have a LoRA branch (attention or MLP)? Does adding LoRA to all linear layers improve performance (as shown in recent works)? Were any ablations done on basic LoRA for visual tasks?
  2. Since LoRA can be fully re-parameterized into the base weights after tuning, it should arguably be included in the "no extra structure" section (see the merge sketch after this list).

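For reference on point 2, here is a minimal sketch of what I mean, assuming a standard PyTorch-style LoRA formulation (the class name, rank `r`, and scaling `alpha` are hypothetical choices, not taken from this repo): the low-rank update `B @ A` can be folded back into the frozen weight after tuning, so inference sees a plain linear layer with no extra structure.

```python
import torch
import torch.nn as nn

class LoRALinear(nn.Module):
    """Frozen base linear layer plus a trainable low-rank branch (illustrative sketch)."""
    def __init__(self, base: nn.Linear, r: int = 4, alpha: float = 8.0):
        super().__init__()
        self.base = base
        self.base.weight.requires_grad_(False)   # base weight stays frozen during tuning
        # Low-rank factors: delta_W = B @ A, scaled by alpha / r
        self.A = nn.Parameter(torch.randn(r, base.in_features) * 0.01)
        self.B = nn.Parameter(torch.zeros(base.out_features, r))
        self.scale = alpha / r

    def forward(self, x):
        # Base projection plus the scaled low-rank update
        return self.base(x) + (x @ self.A.T @ self.B.T) * self.scale

    @torch.no_grad()
    def merge(self) -> nn.Linear:
        # Fold the low-rank update into the base weight: W' = W + scale * (B @ A).
        # After merging, the wrapper can be replaced by the returned nn.Linear,
        # so the tuned model has exactly the baseline architecture at inference.
        self.base.weight += self.scale * (self.B @ self.A)
        return self.base
```
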
I am planning to extend your work to much smaller models, so any insights would be greatly appreciated.

Regards,
Arnav
