Pull requests: huggingface/transformers
Fix DeepSpeed model preparation logic in Trainer class
#43780
opened Feb 5, 2026 by
qgallouedec
SwanLab: Add support for id and resume arguments in SwanLabCallback
#43779
opened Feb 5, 2026 by
surya10602
1 of 5 tasks
fix(moe): normalize auxiliary loss by top_k for correct load balancing
#43775
opened Feb 5, 2026 by
Mr-Neutr0n
fix: Add MXFP4 MoE/attention backward kernels
#43771
opened Feb 5, 2026 by
leoneperdigao
4 tasks done
Remove unconditional train_batch_size assignment
#43770
opened Feb 5, 2026 by
lordaarush
2 of 5 tasks
Fix convert_rope_params_to_dict so it uses rope_theta from the config
#43766
opened Feb 5, 2026 by
hmellor
[core] Faster and thread-safe check_model_inputs implementation
#43765
opened Feb 5, 2026 by
Cyrilvallez
Modify ModernBERT's default attention implementation to stop using FA
#43764
opened Feb 5, 2026 by
YangKai0616
Avoid hard failure for gpt-oss GGUF architecture by falling back to g…
#43757
opened Feb 5, 2026 by
TheSanjBot
2 of 5 tasks
Update KERNELS_MIN_VERSION to 0.10.2 to be the same as setup.py
#43753
opened Feb 5, 2026 by
cyyever