
SFTTrainer explicitly skips prepare_model_for_kbit_training if using PEFT + FSDP/Deepspeed3 whereas DPOTrainer calls this #2537

Open
alexdauenhauer opened this issue Jan 2, 2025 · 1 comment
Labels
🚀 deepspeed Related to deepspeed ⚡ PEFT Related to PEFT 🏋 SFT Related to SFT

Comments

@alexdauenhauer

System Info

  • Platform: Linux-5.15.0-1061-gke-x86_64-with-glibc2.31
  • Python version: 3.11.9
  • PyTorch version: 2.4.0
  • CUDA device(s): NVIDIA A100-SXM4-80GB
  • Transformers version: 4.46.3
  • Accelerate version: 1.0.1
  • Accelerate config: not found
  • Datasets version: 3.0.2
  • HF Hub version: 0.27.0
  • TRL version: 0.12.1
  • bitsandbytes version: 0.44.1
  • DeepSpeed version: not installed
  • Diffusers version: not installed
  • Liger-Kernel version: not installed
  • LLM-Blender version: not installed
  • OpenAI version: 1.58.1
  • PEFT version: 0.13.2

Information

  • The official example scripts
  • My own modified scripts

Tasks

  • An officially supported task in the examples folder
  • My own task or dataset (give details below)

Reproduction

There is a difference in how each trainer handles PEFT + FSDP:

sft:
https://github.com/huggingface/trl/blob/v0.12.1/trl/trainer/sft_trainer.py#L242-L244

dpo:
https://github.com/huggingface/trl/blob/v0.12.1/trl/trainer/dpo_trainer.py#L363
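
The divergence can be summarized as follows. This is a paraphrase of the gating conditions, not the actual TRL source; the function names and boolean parameters are illustrative only, based on a reading of the v0.12.1 lines linked above:

```python
# Paraphrase (illustrative, not TRL source) of how each trainer decides
# whether to call peft.prepare_model_for_kbit_training on a quantized model.

def sft_prepares_for_kbit(loaded_in_4bit: bool, is_sharded_qlora: bool) -> bool:
    """SFTTrainer explicitly skips kbit preparation when the 4-bit model's
    weights are already sharded (QLoRA + FSDP / DeepSpeed ZeRO-3)."""
    return loaded_in_4bit and not is_sharded_qlora

def dpo_prepares_for_kbit(loaded_in_4bit: bool, is_sharded_qlora: bool) -> bool:
    """DPOTrainer calls kbit preparation for any quantized model,
    regardless of whether it is sharded."""
    return loaded_in_4bit

# The same sharded-QLoRA model is treated differently by the two trainers:
print(sft_prepares_for_kbit(True, True))   # False: SFTTrainer skips it
print(dpo_prepares_for_kbit(True, True))   # True: DPOTrainer prepares it
```

Under this reading, a sharded-QLoRA model passed through SFTTrainer never gets kbit preparation, while the same model handed to DPOTrainer does.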

Expected behavior

The current workflow is:

  • create PEFT model outside of trainer
  • pass PEFT model to trainer
  • first run SFTTrainer
  • use output model from SFTTrainer as base model in DPOTrainer

It is unclear what the expected way is to create and pass a PEFT model to the trainer when also using FSDP for model-parallel training, since SFTTrainer and DPOTrainer handle this differently.
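
The workflow above can be sketched as follows. This is a minimal, hypothetical sketch assuming the TRL v0.12 / PEFT APIs (`get_peft_model`, `SFTTrainer`, `DPOTrainer`); the function name, dataset arguments, and hyperparameters are placeholders, not part of the report:

```python
def run_sft_then_dpo(base_model, tokenizer, sft_dataset, pref_dataset, output_dir="out"):
    """Hypothetical sketch of the reported pipeline: PEFT model created
    outside the trainer, trained with SFTTrainer, then reused in DPOTrainer."""
    # Imports deferred so the sketch can be read/loaded without trl installed.
    from peft import LoraConfig, get_peft_model
    from trl import SFTTrainer, SFTConfig, DPOTrainer, DPOConfig

    # Step 1: create the PEFT model outside the trainer (the pattern in question).
    lora = LoraConfig(r=16, lora_alpha=32, task_type="CAUSAL_LM")
    peft_model = get_peft_model(base_model, lora)

    # Steps 2-3: pass the PEFT model to SFTTrainer and train.
    sft = SFTTrainer(
        model=peft_model,
        args=SFTConfig(output_dir=f"{output_dir}/sft"),
        train_dataset=sft_dataset,
        processing_class=tokenizer,
    )
    sft.train()

    # Step 4: use the SFT output model as the base model for DPOTrainer.
    dpo = DPOTrainer(
        model=sft.model,
        args=DPOConfig(output_dir=f"{output_dir}/dpo"),
        train_dataset=pref_dataset,
        processing_class=tokenizer,
    )
    dpo.train()
    return dpo.model
```

With FSDP enabled, the two trainers in this pipeline would apply (or skip) `prepare_model_for_kbit_training` inconsistently to the same quantized model, which is the ambiguity being reported.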

Checklist

  • I have checked that my issue isn't already filed (see open issues)
  • I have included my system information
  • Any code provided is minimal, complete, and reproducible (more on MREs)
  • Any code provided is properly formatted in code blocks, (no screenshot, more on code blocks)
  • Any traceback provided is complete
@August-murr August-murr added 🏋 SFT Related to SFT ⚡ PEFT Related to PEFT 🚀 deepspeed Related to deepspeed labels Jan 3, 2025
@edbeeching
Collaborator

Thanks for highlighting this difference. Is your current work impacted by this? Or is it just that you are required to instantiate the peft model outside of the training?


3 participants