SFTTrainer explicitly skips `prepare_model_for_kbit_training` when using PEFT + FSDP/DeepSpeed ZeRO-3, whereas DPOTrainer calls it #2537
Reproduction
The two trainers differ in how they handle PEFT + FSDP when the base model is loaded in 8-bit/4-bit:
SFT: https://github.com/huggingface/trl/blob/v0.12.1/trl/trainer/sft_trainer.py#L242-L244
DPO: https://github.com/huggingface/trl/blob/v0.12.1/trl/trainer/dpo_trainer.py#L363
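For reference, here is a condensed paraphrase of the two code paths (not the verbatim TRL v0.12.1 source; see the permalinks above for the exact lines):

```python
# Hedged paraphrase of the divergent trainer logic, not the verbatim source.
from peft import prepare_model_for_kbit_training


def sft_trainer_path(model, args):
    # SFTTrainer first detects "sharded QLoRA" setups: under FSDP or
    # DeepSpeed ZeRO-3, 4-bit params start out on the cpu/meta device.
    is_sharded_qlora = False
    if getattr(model, "is_loaded_in_4bit", False):
        for _, param in model.named_parameters():
            if param.__class__.__name__ == "Params4bit":
                is_sharded_qlora = param.data.device.type in {"cpu", "meta"}
                break

    # ...and then SKIPS prepare_model_for_kbit_training in that case.
    if getattr(model, "is_loaded_in_8bit", False) or (
        getattr(model, "is_loaded_in_4bit", False) and not is_sharded_qlora
    ):
        model = prepare_model_for_kbit_training(
            model, use_gradient_checkpointing=args.gradient_checkpointing
        )
    return model


def dpo_trainer_path(model, args):
    # DPOTrainer has no sharded-QLoRA check: any quantized model goes
    # through prepare_model_for_kbit_training, including under FSDP/ZeRO-3.
    if getattr(model, "is_loaded_in_8bit", False) or getattr(model, "is_loaded_in_4bit", False):
        model = prepare_model_for_kbit_training(
            model, use_gradient_checkpointing=args.gradient_checkpointing
        )
    return model
```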
Expected behavior
It is currently unclear what the expected way is to create and pass a PEFT model to the trainer when also using FSDP for model-parallel training, since SFTTrainer and DPOTrainer handle this differently.
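For concreteness, a minimal sketch of the workflow in question (the model name, quantization settings, and LoRA hyperparameters below are illustrative assumptions, not taken from this report; run under `accelerate launch` with an FSDP config):

```python
# Illustrative sketch only: model name, quantization settings, and LoRA
# targets are assumptions for the example, not from this issue.
import torch
from datasets import Dataset
from peft import LoraConfig
from transformers import AutoModelForCausalLM, BitsAndBytesConfig
from trl import SFTConfig, SFTTrainer

model = AutoModelForCausalLM.from_pretrained(
    "meta-llama/Llama-3.1-8B",  # hypothetical base model
    quantization_config=BitsAndBytesConfig(
        load_in_4bit=True,
        bnb_4bit_quant_storage=torch.bfloat16,  # needed for FSDP-sharded QLoRA
    ),
    torch_dtype=torch.bfloat16,
)

peft_config = LoraConfig(
    r=16, lora_alpha=32, target_modules="all-linear", task_type="CAUSAL_LM"
)

# Option A: pass peft_config and let the trainer wrap the model. This hits
# the trainer-specific prepare_model_for_kbit_training logic shown above.
trainer = SFTTrainer(
    model=model,
    args=SFTConfig(output_dir="out", gradient_checkpointing=True),
    train_dataset=Dataset.from_dict({"text": ["placeholder example"]}),
    peft_config=peft_config,
)

# Option B (the ambiguity): wrap the model yourself with peft.get_peft_model()
# and pass the resulting PeftModel in. It is unclear which option each
# trainer expects under FSDP, since SFTTrainer and DPOTrainer diverge here.
```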