Change bias initialization from 'embed' to 'heads' by csgoogle · Pull Request #371 · AI-Hypercomputer/maxdiffusion

csgoogle · 2026-04-06T10:09:51Z

Fix the bias sharding axis: the bias should be sharded along the layer's output axis, not its input axis.
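To make the axis argument concrete, here is a minimal sketch in plain Python. It is not the maxdiffusion code. A dense layer's kernel has shape (in_features, out_features) and its bias has shape (out_features,), so the bias should carry the same logical axis name as the kernel's last dimension. The axis names 'embed', 'heads', and 'mlp' come from this PR's commit titles; the rule table `HYPOTHETICAL_RULES` and the helpers `resolve` and `bias_matches_kernel` are assumptions made for illustration, and the project's real logical-axis rules may differ.

```python
# Hypothetical logical-axis -> mesh-axis rules, for illustration only.
# The real rules live in the project's configuration and may differ.
HYPOTHETICAL_RULES = {"embed": None, "heads": "tensor", "mlp": "tensor"}


def resolve(logical_axes, rules=HYPOTHETICAL_RULES):
    """Map each logical axis name to the mesh axis it is sharded over (None = replicated)."""
    return tuple(rules[name] for name in logical_axes)


def bias_matches_kernel(kernel_axes, bias_axes, rules=HYPOTHETICAL_RULES):
    """The bias is added along the kernel's output (last) dimension, so both
    should resolve to the same mesh axis to avoid a reshard before the add."""
    return resolve(bias_axes, rules)[0] == resolve(kernel_axes, rules)[-1]


kernel_axes = ("embed", "heads")  # (input axis, output axis)
before = bias_matches_kernel(kernel_axes, ("embed",))  # bias named after the input axis
after = bias_matches_kernel(kernel_axes, ("heads",))   # bias named after the output axis
print(before, after)
```

Under these assumed rules the check fails for the old annotation and passes for the new one, which is the mismatch the description refers to.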

Results

| Metric | `main` | `fixbiassharding` | Δ |
| --- | --- | --- | --- |
| Compile time | 1913.9s | 1906.4s | -7.5s |
| Inference time | 1656.4s | 1642.1s | -14.3s (-0.9%) |
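The Δ column can be reproduced from the two timing columns. The helper name `delta` below is hypothetical and the sketch only restates the arithmetic; it is not taken from any benchmarking script in the PR.

```python
def delta(baseline_s, candidate_s):
    """Return (absolute change in seconds, relative change in percent), rounded to 0.1."""
    diff = candidate_s - baseline_s
    return round(diff, 1), round(100.0 * diff / baseline_s, 1)


# Timings copied from the Results table (seconds).
compile_delta = delta(1913.9, 1906.4)
inference_delta = delta(1656.4, 1642.1)
print(compile_delta, inference_delta)
```

Both relative changes are below one percent, so they are best read alongside the correctness motivation given in the Notes rather than as a large speedup.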

Notes

- No difference was observed with tp=1 configs. The improvement only surfaces when tensor parallelism is active, because the corrected axes reduce parameter all-gather overhead in the MLP layers.
- The primary motivation for this change is correctness: incorrect sharding axes can cause OOM or numerical issues at other parallelism configs.
- Larger gains are expected at tp=4 or tp=8, where parameter communication is a larger fraction of step time.
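The tp=1 observation follows from how per-device shard sizes are computed: when the tensor-parallel axis has size 1, a dimension sharded over it and a replicated dimension have the same per-device length, so the axis name cannot change the layout. The sketch below illustrates this under assumptions: `shard_len` is a hypothetical helper, `OUT_FEATURES = 4096` is an arbitrary layer width, and "replicated" stands in for whatever layout the old annotation resolved to.

```python
def shard_len(dim, tp, sharded):
    """Per-device length of one dimension under tensor parallelism of degree tp."""
    if not sharded:
        return dim  # replicated: every device holds the full dimension
    if dim % tp != 0:
        raise ValueError("dimension must divide evenly across devices")
    return dim // tp


OUT_FEATURES = 4096  # hypothetical layer width, for illustration only

for tp in (1, 4, 8):
    replicated = shard_len(OUT_FEATURES, tp, sharded=False)
    sharded = shard_len(OUT_FEATURES, tp, sharded=True)
    # At tp=1 the two layouts coincide; at tp>1 they differ, so a mismatched
    # bias has to be resharded to line up with the layer output.
    print(tp, replicated, sharded, replicated == sharded)
```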

Video Quality Comparison

| Branch | Video |
| --- | --- |
| `main` | main.mp4 |
| `fixbiassharding` | fixbiassharding.mp4 |

PSNR/SSIM (frame-by-frame, 81 frames):

| Metric | Mean | Min | Max |
| --- | --- | --- | --- |
| PSNR (dB) | 19.37 | 18.83 | 20.17 |
| SSIM | 0.7884 | 0.7654 | 0.8043 |

The low PSNR/SSIM values reflect floating-point non-determinism from the different sharding layouts, accumulated across 50 denoising steps (bfloat16 plus different collective patterns). The videos are visually identical.
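The PR does not say which tool produced these numbers. As a reference point, here is a minimal sketch of a frame-by-frame PSNR summary using the standard definition, 10·log10(MAX² / MSE). It assumes 8-bit pixel values (MAX = 255) and represents each frame as a flat list of pixels; the function names and the toy frames are illustrative, not data from the PR. SSIM is omitted because it needs windowed statistics that do not fit a short example.

```python
import math


def psnr(frame_a, frame_b, max_value=255.0):
    """PSNR in dB between two equally sized frames given as flat pixel lists."""
    if len(frame_a) != len(frame_b):
        raise ValueError("frames must have the same number of pixels")
    mse = sum((a - b) ** 2 for a, b in zip(frame_a, frame_b)) / len(frame_a)
    if mse == 0:
        return math.inf  # identical frames
    return 10.0 * math.log10(max_value ** 2 / mse)


def summarize(values):
    """Return (mean, min, max), the three columns reported in the table above."""
    return sum(values) / len(values), min(values), max(values)


# Toy two-frame "videos" (made-up pixel values, not the PR's videos).
video_a = [[0, 128, 255, 64], [10, 20, 30, 40]]
video_b = [[2, 126, 255, 64], [14, 20, 30, 40]]

per_frame = [psnr(a, b) for a, b in zip(video_a, video_b)]
mean_db, min_db, max_db = summarize(per_frame)
print(f"mean={mean_db:.2f} dB min={min_db:.2f} dB max={max_db:.2f} dB")
```

For the PR's comparison the same summary would run over all 81 frame pairs.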

github-actions · 2026-04-06T10:11:10Z

e2e testgrid: https://8bcf50593faf4ea38060e236169827e5-dot-us-central1.composer.googleusercontent.com/dags/maxdiffusion_tpu_e2e/grid

Change bias initialization from 'embed' to 'heads'

33164c6

Fix the bias sharding axis, it should be output axis instead of input one.

csgoogle requested a review from entrpn as a code owner April 6, 2026 10:09

csgoogle added 3 commits April 6, 2026 16:13

Update kernel and bias initialization axes in attention layer

bbdf2fe

Fix in FlaxAttention

Reorder kernel initialization parameters

d1d43a4

Fix for ApproximateGelu and WanFeedForward too

Change bias_init partitioning from 'embed' to 'mlp'

540375b

fix gelu bias too

csgoogle wants to merge 4 commits into main from fixbiassharding
