Issues: huggingface/trl
[Tracking issue] Integrate native liger-kernel losses
#2495
opened Dec 17, 2024 by
qgallouedec
Open
2
Multi-node training with deepspeed launcher
🚀 deepspeed
Related to deepspeed
🏋 SFT
Related to SFT
#2605
opened Jan 22, 2025 by
ghtaro
5 tasks done
Loss tends to increase when the gradient accumulation steps are increased while using the SFTTrainer
⚡ PEFT
Related to PEFT
🏋 SFT
Related to SFT
#2604
opened Jan 22, 2025 by
Asnegha
5 tasks done
[Question] DataCollatorForCompletionOnlyLM with dynamic padding?
❓ question
Seeking clarification or more information
#2603
opened Jan 22, 2025 by
katzurik
[Question] Log eval metrics performed during training to files
📚 documentation
Improvements or additions to documentation
❓ question
Seeking clarification or more information
#2602
opened Jan 22, 2025 by
skandermoalla
Add the training method for DeepSeek-R1
✨ enhancement
New feature or request
#2599
opened Jan 21, 2025 by
MohamedAliRashad
Potential bug in PPO Trainer
🐛 bug
Something isn't working
🏋 PPO
Related to PPO
#2596
opened Jan 21, 2025 by
kyleliang919
5 tasks done
PRM Performance on Different Data Type
🏋 PRM
Related to PRM
❓ question
Seeking clarification or more information
#2591
opened Jan 19, 2025 by
TanZhendong
wandb step slider implementation in example notebook
❓ question
Seeking clarification or more information
#2589
opened Jan 18, 2025 by
stellaludai
GKD trainer doesn't work too well with the llama series
🐛 bug
Something isn't working
🏋 GKD
Related to GKD
#2586
opened Jan 17, 2025 by
Omar-Deepshard
5 tasks done
GKDTrainer + FSDP results in RuntimeError: Expected all tensors to be on the same device, but found at least two devices
🐛 bug
Something isn't working
🏋 GKD
Related to GKD
#2580
opened Jan 17, 2025 by
sl5035
7 of 9 tasks
confused with function generation(), https://github.com/huggingface/trl/blob/main/trl/trainer/utils.py#L1323
#2579
opened Jan 17, 2025 by
luoyingyan
Make PPOTrainer compatible with PRMs
✨ enhancement
New feature or request
🏋 PPO
Related to PPO
#2577
opened Jan 16, 2025 by
kyleliang919
ValueError: Found unknown kwargs when loading DbrxForCausalLM
#2574
opened Jan 16, 2025 by
qgallouedec
7 of 9 tasks
ORPO on SFT dataset
🏋 ORPO
Related to ORPO
❓ question
Seeking clarification or more information
#2570
opened Jan 15, 2025 by
vitalyshalumov
7 of 9 tasks
RuntimeError: Function 'Log1PBackward0' returned nan values in its 0th output.
🐛 bug
Something isn't working
🏋 ORPO
Related to ORPO
#2564
opened Jan 13, 2025 by
zhaoxjmail
7 of 9 tasks
finetune a very small 0.5B qwen2.5 model with method of pissa on 2 *A800 (80G each, 120G available ) strangely met with OOM error
🐛 bug
Something isn't working
🏋 SFT
Related to SFT
#2559
opened Jan 10, 2025 by
chuangzhidan
8 of 9 tasks
Problem with accelerate>=1.0.0 when running official PPO/RLOO examples
⚡accelerate
Related to accelerate
🏋 PPO
Related to PPO
🏋 RLOO
Related to RLOO
#2555
opened Jan 10, 2025 by
dawidm
7 of 9 tasks
Finetuning on the last turn of multi-turn conversations
❓ question
Seeking clarification or more information
🏋 SFT
Related to SFT
#2545
opened Jan 6, 2025 by
okhat
SFTTrainer explicitly skips prepare_model_for_kbit_training if using PEFT + FSDP/Deepspeed3 whereas DPOTrainer calls this
🚀 deepspeed
Related to deepspeed
⚡ PEFT
Related to PEFT
🏋 SFT
Related to SFT
#2537
opened Jan 2, 2025 by
alexdauenhauer
7 of 9 tasks
Different finetune speed in DPO task of peft and ms-swift (600/S iter vs 30/s iter)
🏋 DPO
Related to DPO
🙋 help from community wanted
Open invitation for community members to contribute
⚡ PEFT
Related to PEFT
#2536
opened Jan 2, 2025 by
maoulee
7 of 9 tasks
(Willing to PR) Will it be welcomed if speeding up algorithms like PPO and code refactor/cleanup?
🏋 PPO
Related to PPO
❓ question
Seeking clarification or more information
🏋 RLOO
Related to RLOO
#2535
opened Dec 31, 2024 by
fzyzcjy
onlinedpo error when use deepspeed zero3
🐛 bug
Something isn't working
🚀 deepspeed
Related to deepspeed
⏳ needs more info
Additional information or clarification is required to proceed
🏋 Online DPO
Related to Online DPO
#2532
opened Dec 30, 2024 by
yiyepiaoling0715
5 of 9 tasks
PPOTrainer: num_mini_batches setting affects training progress bar in an unexpected way
🐛 bug
Something isn't working
🏋 PPO
Related to PPO
#2530
opened Dec 29, 2024 by
dawidm
6 of 9 tasks