4 changes: 4 additions & 0 deletions .ai/AGENTS.md
@@ -35,6 +35,10 @@ Strive to write code as simple and explicit as possible.
- Use `self.progress_bar(timesteps)` for progress tracking
- Don't subclass an existing pipeline for a variant: DO NOT use an existing pipeline class (e.g., `FluxPipeline`) to override another pipeline (e.g., `FluxImg2ImgPipeline`) that will be part of the core codebase (`src`)

### Modular Pipelines

- See [modular.md](modular.md) for modular pipeline conventions, patterns, and gotchas.

## Skills

Task-specific guides live in `.ai/skills/` and are loaded on demand by AI agents. Available skills include:
47 changes: 34 additions & 13 deletions ...s/model-integration/modular-conversion.md → .ai/modular.md
@@ -1,11 +1,6 @@
# Modular Pipeline Conversion Reference
# Modular pipeline conventions and rules

## When to use

Modular pipelines break a monolithic `__call__` into composable blocks. Convert when:
- The model supports multiple workflows (T2V, I2V, V2V, etc.)
- Users need to swap guidance strategies (CFG, CFG-Zero*, PAG)
- You want to share blocks across pipeline variants
Shared reference for modular pipeline conventions, patterns, and gotchas.

## File structure

@@ -14,7 +9,7 @@ src/diffusers/modular_pipelines/<model>/
    __init__.py                 # Lazy imports
    modular_pipeline.py         # Pipeline class (tiny, mostly config)
    encoders.py                 # Text encoder + image/video VAE encoder blocks
    before_denoise.py           # Pre-denoise setup blocks
    before_denoise.py           # Pre-denoise setup blocks (timesteps, latent prep, noise)
    denoise.py                  # The denoising loop blocks
    decoders.py                 # VAE decode block
    modular_blocks_<model>.py   # Block assembly (AutoBlocks)
@@ -81,15 +76,21 @@ for i, t in enumerate(timesteps):
    latents = components.scheduler.step(noise_pred, t, latents, generator=generator)[0]
```

## Key pattern: Chunk loops for video models
## Key pattern: Denoising loop

All models use `LoopSequentialPipelineBlocks` for the denoising loop (iterating over timesteps):
```python
class MyModelDenoiseLoopWrapper(LoopSequentialPipelineBlocks):
    block_classes = [LoopBeforeDenoiser, LoopDenoiser, LoopAfterDenoiser]
```

Use `LoopSequentialPipelineBlocks` for outer loop:
Autoregressive video models (e.g. Helios) also use it for an outer chunk loop:
```python
class ChunkDenoiseStep(LoopSequentialPipelineBlocks):
    block_classes = [PrepareChunkStep, NoiseGenStep, DenoiseInnerStep, UpdateStep]
class HeliosChunkDenoiseStep(LoopSequentialPipelineBlocks):
    block_classes = [ChunkHistorySlice, ChunkNoiseGen, ChunkDenoiseInner, ChunkUpdate]
```

Note: blocks inside `LoopSequentialPipelineBlocks` receive `(components, block_state, k)` where `k` is the loop iteration index.
Note: sub-blocks inside `LoopSequentialPipelineBlocks` receive `(components, block_state, i, t)` for denoise loops or `(components, block_state, k)` for chunk loops.
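
The calling convention in the note can be illustrated with a toy stand-in. This is a minimal sketch, not the diffusers implementation: `ToyLoop`, `record_step`, and `scale_latents` are hypothetical names, plain functions stand in for real block classes, and returning `(components, block_state)` from each sub-block is an assumption made for the example.

```python
# Toy stand-in for the calling convention only; NOT the diffusers implementation.
from types import SimpleNamespace


class ToyLoop:
    """Hypothetical runner that calls every sub-block once per timestep."""

    def __init__(self, sub_blocks):
        self.sub_blocks = sub_blocks

    def __call__(self, components, block_state):
        for i, t in enumerate(block_state.timesteps):
            for block in self.sub_blocks:
                # Denoise-loop convention: (components, block_state, i, t)
                components, block_state = block(components, block_state, i, t)
        return components, block_state


def record_step(components, block_state, i, t):
    # Record which (index, timestep) pairs the sub-block was called with.
    block_state.seen.append((i, t))
    return components, block_state


def scale_latents(components, block_state, i, t):
    # Stand-in for a denoiser update: halve the latents on every step.
    block_state.latents = block_state.latents * components.scale
    return components, block_state


components = SimpleNamespace(scale=0.5)
state = SimpleNamespace(timesteps=[999, 500, 0], latents=8.0, seen=[])
components, state = ToyLoop([record_step, scale_latents])(components, state)
```

A chunk loop would follow the same shape, with the runner passing a single index `k` instead of `(i, t)`.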

## Key pattern: Workflow selection

@@ -136,6 +137,26 @@ ComponentSpec(
)
```

## Gotchas

1. **Importing from standard pipelines.** The modular and standard pipeline systems are parallel, so modular blocks must not import from `diffusers.pipelines.*`. For shared utility methods (e.g. `_pack_latents`, `retrieve_timesteps`), either redefine them as standalone functions or copy them with `# Copied from diffusers.pipelines.<model>...` headers. See `wan/before_denoise.py` and `helios/before_denoise.py` for examples.

2. **Cross-importing between modular pipelines.** Don't import utilities from another model's modular pipeline (e.g. SD3 importing from `qwenimage.inputs`). If a utility is shared, move it to `modular_pipeline_utils.py` or copy it with a `# Copied from` header.

3. **Accepting `guidance_scale` as a pipeline input.** Users configure the guider separately (see [guider docs](https://huggingface.co/docs/diffusers/main/en/api/guiders)). Different guider types take different parameters, so forwarding them through the pipeline doesn't scale. Don't manually set `components.guider.guidance_scale = ...` inside blocks. The same applies to computing `do_classifier_free_guidance`: that logic belongs in the guider.

4. **Accepting pre-computed outputs as inputs to skip encoding.** In standard pipelines we accept `prompt_embeds`, `negative_prompt_embeds`, `image_latents`, etc. so users can skip encoding steps. In modular pipelines this is unnecessary — users just pop out the encoder block and run it separately. Encoder blocks should only accept raw inputs (`prompt`, `image`, etc.).

5. **VAE encoding inside prepare-latents.** Image encoding should be its own block in `encoders.py` (e.g. `MyModelVaeEncoderStep`). The prepare-latents block should accept `image_latents`, not raw images. This lets users run encoding standalone. See `WanVaeEncoderStep` for reference.

6. **Instantiating components inline.** If a class like `VideoProcessor` is needed, register it as a `ComponentSpec` and access via `components.video_processor`. Don't create new instances inside block `__call__`.

7. **Deeply nested block structure.** Prefer flat sequences over nesting Auto blocks inside Sequential blocks inside Auto blocks. Put the `Auto` selection at the top level and make each workflow variant a flat `InsertableDict` of leaf blocks. See `flux2/modular_blocks_flux2_klein.py` for the pattern.

8. **Using `InputParam.template()` / `OutputParam.template()` when semantics don't match.** Templates carry predefined descriptions — e.g. the `"latents"` output template means "Denoised latents". Don't use it for initial noisy latents from a prepare-latents step. Use a plain `InputParam(...)` / `OutputParam(...)` with an accurate description instead.

9. **Test model paths pointing to contributor repos.** Tiny test models must live under `hf-internal-testing/`, not personal repos like `username/tiny-model`. Move the model before merge.
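
Gotchas 4 and 5 describe a separation between encoding and latent preparation. It can be sketched with toy stand-ins as follows. Everything here is hypothetical: `vae_encode_step` and `prepare_latents_step` are plain functions rather than diffusers blocks, and the arithmetic is invented purely to make the data flow visible.

```python
# Toy stand-ins only: plain functions, not diffusers blocks,
# and the arithmetic is made up for illustration.
from types import SimpleNamespace


def vae_encode_step(components, state):
    """Encoder step: takes the raw image and emits image_latents."""
    state.image_latents = [components.vae_scale * px for px in state.image]
    return components, state


def prepare_latents_step(components, state):
    """Prepare-latents step: consumes image_latents, never the raw image."""
    s = state.strength
    state.latents = [
        (1 - s) * z + s * n for z, n in zip(state.image_latents, state.noise)
    ]
    return components, state


components = SimpleNamespace(vae_scale=0.5)

# Encoding can run on its own ...
_, encoded = vae_encode_step(components, SimpleNamespace(image=[2.0, 4.0]))

# ... and its output is handed to latent preparation later.
state = SimpleNamespace(
    image_latents=encoded.image_latents, noise=[0.0, 0.0], strength=0.5
)
_, state = prepare_latents_step(components, state)
```

Because the second step never sees the raw image, the encoder can be run separately and its result cached or reused, which is the point of keeping the two blocks apart.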

## Conversion checklist

- [ ] Read original pipeline's `__call__` end-to-end, map stages
1 change: 1 addition & 0 deletions .ai/review-rules.md
@@ -5,6 +5,7 @@ Review-specific rules for Claude. Focus on correctness — style is handled by r
Before reviewing, read and apply the guidelines in:
- [AGENTS.md](AGENTS.md) — coding style, copied code
- [models.md](models.md) — model conventions, attention pattern, implementation rules, dependencies, gotchas
- [modular.md](modular.md) — modular pipeline conventions, patterns, common mistakes
- [skills/parity-testing/SKILL.md](skills/parity-testing/SKILL.md) — testing rules, comparison utilities
- [skills/parity-testing/pitfalls.md](skills/parity-testing/pitfalls.md) — known pitfalls (dtype mismatches, config assumptions, etc.)

2 changes: 1 addition & 1 deletion .ai/skills/model-integration/SKILL.md
@@ -82,7 +82,7 @@ See [../../models.md](../../models.md) for the attention pattern, implementation

## Modular Pipeline Conversion

See [modular-conversion.md](modular-conversion.md) for the full guide on converting standard pipelines to modular format, including block types, build order, guider abstraction, and conversion checklist.
See [modular.md](../../modular.md) for the full guide on modular pipeline conventions, block types, build order, guider abstraction, gotchas, and conversion checklist.

---
