Different fine-tuning speed in DPO task between peft and ms-swift (600 s/iter vs 30 s/iter) #2536
Comments
It's not very clear what code you're using, because you seem to be using a command (…).
Here is the trl env info:

(output of `trl env` omitted)

The `map_instruction` function is used to map the dataset. Here is the complete code:

```python
def map_instruction(example):
    ...  # body omitted in the original comment

def main():
    ...  # body omitted in the original comment

if __name__ == "__main__":
    main()
```
I was able to reproduce the speed. I don't know how swift is different from trl (it's built upon trl as far as I understand). You should probably ask the swift community here.
Thank you for your response. I have identified the key issue: when I load the model and pass the `peft_config` directly into `DPOTrainer`, the fine-tuning speed is 600 seconds per iteration. However, when I use `model = get_peft_model(model, peft_config)` before passing it to the trainer, the speed improves significantly to 30.2 seconds per iteration. The logic of the two seems to be the same, but the speed difference is large.
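For context, here is a minimal sketch of the two setups being compared. The model name, dataset, and batch settings are placeholders, and the exact `DPOTrainer` keyword names (`processing_class` vs. the older `tokenizer`) depend on the trl version:

```python
from datasets import load_dataset
from peft import LoraConfig, get_peft_model
from transformers import AutoModelForCausalLM, AutoTokenizer
from trl import DPOConfig, DPOTrainer

model_id = "Qwen/Qwen2.5-7B-Instruct"  # placeholder model for illustration
tokenizer = AutoTokenizer.from_pretrained(model_id)
train_dataset = load_dataset("trl-lib/ultrafeedback_binarized", split="train")  # assumed dataset
peft_config = LoraConfig(r=8, lora_alpha=32, lora_dropout=0.05, task_type="CAUSAL_LM")
args = DPOConfig(output_dir="dpo-out", gradient_accumulation_steps=16, learning_rate=5e-6)

# Setup A (reported ~600 s/iter): hand the LoRA config to the trainer and
# let it wrap the base model internally.
model_a = AutoModelForCausalLM.from_pretrained(model_id)
trainer_a = DPOTrainer(model=model_a, args=args, train_dataset=train_dataset,
                       processing_class=tokenizer, peft_config=peft_config)

# Setup B (reported ~30.2 s/iter): wrap the base model with peft first and
# pass the resulting PeftModel without a peft_config.
model_b = get_peft_model(AutoModelForCausalLM.from_pretrained(model_id), peft_config)
trainer_b = DPOTrainer(model=model_b, args=args, train_dataset=train_dataset,
                       processing_class=tokenizer)
```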
It's probably because when you pass a peft model, it gets merged and unloaded (…).
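For readers unfamiliar with the term: merging and unloading is a peft operation that folds the LoRA adapter weights back into the base model and returns a plain transformers model; on a quantized (e.g. bnb 4-bit) base this involves dequantizing the affected layers, which can be costly. A rough illustration of the peft call being referred to (not the trl internals):

```python
from peft import PeftModel

def merge_lora(peft_model: PeftModel):
    # merge_and_unload() bakes the adapter weights into the base weights and
    # drops the adapter modules; with a 4-bit base the merge must dequantize first.
    return peft_model.merge_and_unload()
```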
System Info

transformers version: 4.45.0

Information

Tasks

An officially supported task in the examples folder

Reproduction
Problem description (in English):
Swift CLI Configuration:
```shell
USE_HF=1 CUDA_VISIBLE_DEVICES=0,1 swift rlhf \
    --rlhf_type dpo \
    --model_type qwen2_5 \
    --model /root/.cache/modelscope/hub/unsloth/Qwen2___5-32B-Instruct-bnb-4bit/ \
    --train_type lora \
    --tuner_backend peft \
    --dataset llamafactory/ultrafeedback_binarized#2000 \
    --num_train_epochs 2 \
    --learning_rate 5e-6 \
    --lora_rank 8 \
    --lora_alpha 32 \
    --gradient_accumulation_steps 16 \
    --gradient_checkpointing_kwargs '{"use_reentrant": false}' \
    --eval_steps 100 \
    --save_steps 100 \
    --save_total_limit 2 \
    --lora_dropout 0.05 \
    --logging_steps 100 \
    --quant_method bnb \
    --quant_bit 4 \
    --max_new_tokens 1500
```
Fine-tuning Speed:
PEFT Configuration:
Fine-tuning Speed:
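The PEFT-side script and the measured speeds are not captured in the text above. Purely as a hypothetical sketch of a peft/trl setup mirroring the CLI flags (4-bit bnb quantization, LoRA rank 8, alpha 32, dropout 0.05, lr 5e-6, gradient accumulation 16, 2 epochs) and using the faster "wrap with `get_peft_model` first" path from the comments, with model path, dataset name, and argument names assumed:

```python
import torch
from datasets import load_dataset
from peft import LoraConfig, get_peft_model, prepare_model_for_kbit_training
from transformers import AutoModelForCausalLM, AutoTokenizer, BitsAndBytesConfig
from trl import DPOConfig, DPOTrainer

model_path = "unsloth/Qwen2.5-32B-Instruct-bnb-4bit"  # assumed; mirrors the --model flag
bnb_config = BitsAndBytesConfig(load_in_4bit=True,
                                bnb_4bit_quant_type="nf4",
                                bnb_4bit_compute_dtype=torch.bfloat16)

model = AutoModelForCausalLM.from_pretrained(model_path,
                                             quantization_config=bnb_config,
                                             device_map="auto")
model = prepare_model_for_kbit_training(model)
tokenizer = AutoTokenizer.from_pretrained(model_path)

peft_config = LoraConfig(r=8, lora_alpha=32, lora_dropout=0.05, task_type="CAUSAL_LM")
model = get_peft_model(model, peft_config)  # the fast path reported above

train_dataset = load_dataset("trl-lib/ultrafeedback_binarized", split="train")  # assumed dataset

training_args = DPOConfig(output_dir="dpo-qwen2.5-32b",
                          num_train_epochs=2,
                          learning_rate=5e-6,
                          gradient_accumulation_steps=16,
                          gradient_checkpointing=True,
                          gradient_checkpointing_kwargs={"use_reentrant": False},
                          eval_steps=100, save_steps=100, save_total_limit=2,
                          logging_steps=100)

trainer = DPOTrainer(model=model, args=training_args, train_dataset=train_dataset,
                     processing_class=tokenizer)  # `tokenizer=` in older trl versions
trainer.train()
```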
Expected behavior
Is there something wrong with my peft setup?