How to run SFT and DPO on a quantized Qwen model #6573

gmm41 · 2025-01-09T02:18:46Z

Reminder

I have read the README and searched the existing issues.

System Info

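I want to run SFT and then DPO on the Qwen2.5-14B-Instruct-GPTQ-Int4 model. Because the base model is quantized, the LoRA adapter cannot be merged into the base model after SFT. Is it enough to add `adapter_name_or_path: sft_lora` to the DPO script? If so, at inference time, is it enough to set `adapter_name_or_path` to the DPO LoRA alone?
I found similar cases in earlier issues, but none of them used a quantized base model.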

Reproduction

Put your message here.

Others

No response

hiyouga · 2025-01-09T14:08:03Z

Yes.
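For illustration only, here is a minimal sketch of the two configs the question describes, assuming LLaMA-Factory-style YAML files. Only `adapter_name_or_path` comes from the question; every other key, the dataset name, and all paths are assumptions or placeholders and should be checked against the version in use.

```yaml
# DPO training config (sketch). Paths and dataset name are placeholders.
model_name_or_path: Qwen/Qwen2.5-14B-Instruct-GPTQ-Int4  # quantized base, left unmerged
adapter_name_or_path: saves/sft_lora                     # LoRA produced by the SFT run (placeholder path)
stage: dpo
do_train: true
finetuning_type: lora
dataset: my_preference_data                              # hypothetical dataset name
template: qwen
output_dir: saves/dpo_lora                               # placeholder path for the DPO LoRA
```

```yaml
# Inference config (sketch). Per the reply above, only the DPO LoRA is passed.
model_name_or_path: Qwen/Qwen2.5-14B-Instruct-GPTQ-Int4
adapter_name_or_path: saves/dpo_lora                     # placeholder path, same as output_dir above
finetuning_type: lora
template: qwen
```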

github-actions bot added the `pending` label ("This problem is yet to be addressed") on Jan 9, 2025

hiyouga closed this as completed Jan 9, 2025

hiyouga added the `solved` label ("This problem has been already solved") and removed the `pending` label ("This problem is yet to be addressed") on Jan 9, 2025