huggingface / transformers Public

Pull requests: huggingface/transformers

1,129 Open 23,516 Closed

Pull requests list

Fix DeepSpeed model preparation logic in Trainer class

#43780 opened Feb 5, 2026 by qgallouedec

SwanLab: Add support for id and resume arguments in SwanLabCallback

#43779 opened Feb 5, 2026 by surya10602

1 of 5 tasks

Mamba-1/-2 init weights in mixer class

#43778 opened Feb 5, 2026 by kevinli573

Bump dev version

#43777 opened Feb 5, 2026 by qgallouedec

5 tasks

Refactor trainer data_collator and callbacks tests

#43776 opened Feb 5, 2026 by SunMarc

fix(moe): normalize auxiliary loss by top_k for correct load balancing

#43775 opened Feb 5, 2026 by Mr-Neutr0n

Add activation offloading to trainer

#43774 opened Feb 5, 2026 by mbtariq82

4 of 5 tasks

Fix-release-ubild

#43773 opened Feb 5, 2026 by ArthurZucker

[Modular Dependencies] Fixup qwen rms norms

#43772 opened Feb 5, 2026 by vasqu

fix: Add MXFP4 MoE/attention backward kernels

#43771 opened Feb 5, 2026 by leoneperdigao

4 tasks done

Remove unconditional train_batch_size assignment

#43770 opened Feb 5, 2026 by lordaarush

2 of 5 tasks

Add Voxtral Realtime (labels: Audio, New model)

#43769 opened Feb 5, 2026 by eustlb

Fix init weights in remote code

#43768 opened Feb 5, 2026 by zucchini-nlp

[Model] Add PP-Chart2Table Model Support

#43767 opened Feb 5, 2026 by XingweiDeng

5 tasks

Fix convert_rope_params_to_dict so it uses rope_theta from the config

#43766 opened Feb 5, 2026 by hmellor

[core] Faster and thread-safe check_model_inputs implementation

#43765 opened Feb 5, 2026 by Cyrilvallez

Modify ModernBERT's default attention implementation to stop using FA

#43764 opened Feb 5, 2026 by YangKai0616

Improved agents

#43763 opened Feb 5, 2026 by tarekziade

Widen match condition for _can_record_outputs

#43762 opened Feb 5, 2026 by molbap

Avoid hard failure for gpt-oss GGUF architecture by falling back to g…

#43757 opened Feb 5, 2026 by TheSanjBot

2 of 5 tasks

Ernie4 5 vl moe

#43755 opened Feb 5, 2026 by kaixuanliu

Update KERNELS_MIN_VERSION to 0.10.2 to be the same as setup.py

#43753 opened Feb 5, 2026 by cyyever

Param2moe v4.52.3

#43752 opened Feb 5, 2026 by bhargav-patel-29

2 of 5 tasks

Fix ruff warnings

#43751 opened Feb 5, 2026 by cyyever

enable tp for benchmark

#43750 opened Feb 5, 2026 by sywangyi
