Model Parallelism with SFTTrainer #1094
Perhaps this isn't the right place to ask this question, but what's the easiest way to set up model parallelism using `SFTTrainer`?

My understanding is that if we have access to a multi-GPU workstation, the default is data parallelism. However, I would be interested in comparing run times against model parallelism.

Thanks!
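For reference, a minimal sketch (not from the thread) of what naive model parallelism with `SFTTrainer` can look like: loading the model with `device_map="auto"` lets 🤗 Accelerate spread the layers across all visible GPUs before the model is handed to the trainer. The model name, dataset, and hyperparameters below are placeholders, and the keyword arguments follow the TRL API from around the time of this issue, so they may differ in newer releases.

```python
# Sketch: naive model parallelism by sharding the model across local GPUs.
# "facebook/opt-1.3b" and the "imdb" dataset are placeholders.
import torch
from datasets import load_dataset
from transformers import AutoModelForCausalLM, AutoTokenizer, TrainingArguments
from trl import SFTTrainer

model_name = "facebook/opt-1.3b"

# device_map="auto" places layers on every visible GPU, so a model that is
# too large for one 2080 Ti can still be loaded.
model = AutoModelForCausalLM.from_pretrained(
    model_name,
    torch_dtype=torch.float16,   # half precision to save memory
    device_map="auto",
)
tokenizer = AutoTokenizer.from_pretrained(model_name)

dataset = load_dataset("imdb", split="train")  # any dataset with a "text" column

trainer = SFTTrainer(
    model=model,
    tokenizer=tokenizer,
    train_dataset=dataset,
    dataset_text_field="text",
    max_seq_length=512,
    args=TrainingArguments(
        output_dir="./sft-model-parallel",
        per_device_train_batch_size=1,
        gradient_accumulation_steps=8,
        num_train_epochs=1,
        logging_steps=10,
    ),
)
trainer.train()
```

Run as a plain `python` script, this is naive pipeline parallelism: only one GPU computes at a time, so it mainly helps fit a model that is too large for a single card rather than speed things up. Launching the same script without `device_map` via `accelerate launch` gives the default data-parallel baseline to compare run times against.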
Comments

@younesbelkada: hi @pharringtonp19
@pharringtonp19: @younesbelkada Thanks! Is this the most efficient way to train across a cluster of small and old GPUs (2080 Ti)? I usually run out of memory.
@lvwerra: In addition to using PEFT (quantization + LoRA), I would consider looking into DeepSpeed. It's fully supported with the `SFTTrainer`.
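A rough sketch of the PEFT route mentioned above (4-bit quantization + LoRA, i.e. QLoRA) with `SFTTrainer`. The model, dataset, and LoRA hyperparameters are illustrative placeholders rather than recommendations:

```python
# Sketch: QLoRA fine-tuning -- the 4-bit base weights stay frozen and only
# small LoRA adapters are trained, which keeps optimizer state tiny.
import torch
from datasets import load_dataset
from peft import LoraConfig
from transformers import (
    AutoModelForCausalLM,
    AutoTokenizer,
    BitsAndBytesConfig,
    TrainingArguments,
)
from trl import SFTTrainer

model_name = "facebook/opt-1.3b"  # placeholder

# Load the base model in 4-bit so it fits in the 11 GB of a 2080 Ti.
bnb_config = BitsAndBytesConfig(
    load_in_4bit=True,
    bnb_4bit_quant_type="nf4",
    bnb_4bit_compute_dtype=torch.float16,  # the 2080 Ti has no bf16 support
)
model = AutoModelForCausalLM.from_pretrained(
    model_name,
    quantization_config=bnb_config,
    device_map="auto",
)
tokenizer = AutoTokenizer.from_pretrained(model_name)

peft_config = LoraConfig(
    r=16,
    lora_alpha=32,
    lora_dropout=0.05,
    bias="none",
    task_type="CAUSAL_LM",
)

dataset = load_dataset("imdb", split="train")  # placeholder dataset

trainer = SFTTrainer(
    model=model,
    tokenizer=tokenizer,
    train_dataset=dataset,
    dataset_text_field="text",
    max_seq_length=512,
    peft_config=peft_config,  # SFTTrainer wraps the model with the LoRA adapters
    args=TrainingArguments(
        output_dir="./sft-qlora",
        per_device_train_batch_size=2,
        gradient_accumulation_steps=4,
        fp16=True,
        num_train_epochs=1,
        logging_steps=10,
    ),
)
trainer.train()
```

Since only the adapters receive gradients, optimizer state and gradient memory stay small; the remaining cost is mostly activations, which the small per-device batch size plus gradient accumulation keeps in check.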
@pharringtonp19: @lvwerra Thanks for the suggestion!
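For completeness, a sketch of the DeepSpeed route suggested above: ZeRO stage 2 shards optimizer states and gradients across the GPUs and can offload optimizer state to CPU RAM, which helps a lot on 11 GB cards. The config values and the script name are illustrative; `TrainingArguments` accepts the DeepSpeed config either as a dict or as a path to a JSON file.

```python
# Sketch: DeepSpeed ZeRO-2 through the Trainer integration (illustrative values).
from transformers import TrainingArguments

ds_config = {
    "zero_optimization": {
        "stage": 2,                               # shard optimizer states and gradients
        "offload_optimizer": {"device": "cpu"},   # push optimizer states to CPU RAM
    },
    "fp16": {"enabled": True},                    # the 2080 Ti has no bf16 support
    "gradient_accumulation_steps": "auto",
    "train_micro_batch_size_per_gpu": "auto",
}

args = TrainingArguments(
    output_dir="./sft-deepspeed",
    per_device_train_batch_size=1,
    gradient_accumulation_steps=8,
    fp16=True,
    deepspeed=ds_config,  # dict or path to a JSON file
)
# The rest is the same SFTTrainer setup as above, started with a distributed
# launcher, e.g.:
#   accelerate launch --num_processes 4 train_sft.py
# or
#   deepspeed --num_gpus 4 train_sft.py
# (train_sft.py is a placeholder script name)
```

If the model still does not fit, ZeRO stage 3 additionally shards the parameters themselves across the GPUs.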