Description
Feature Summary
Adaptive Spectral Feature Forecasting for Diffusion Sampling Acceleration
Detailed Description
Project page: https://hanjq17.github.io/Spectrum/
Code: https://github.com/hanjq17/Spectrum
Paper: https://arxiv.org/abs/2603.01623
Diffusion models have become the dominant tool for high-fidelity image and video generation, yet they are critically bottlenecked by inference speed because Diffusion Transformers require many iterative passes. To reduce this compute, recent works resort to feature caching and reuse, which skips network evaluations at selected diffusion steps by using features cached at previous steps. However, these designs rely solely on local approximation, so errors grow rapidly with large skips and sample quality degrades at high speedups.
In this work, we propose the spectral diffusion feature forecaster (Spectrum), a training-free approach that enables global, long-range feature reuse with tightly controlled error. We view the latent features of the denoiser as functions over time and approximate them with Chebyshev polynomials. Specifically, we fit the coefficient of each basis function via ridge regression and then use the fitted coefficients to forecast features at multiple future diffusion steps. We show theoretically that our approach admits more favorable long-horizon behavior and yields an error bound that does not compound with the step size.
Extensive experiments on various state-of-the-art image and video diffusion models consistently verify the superiority of our approach. Notably, we achieve up to 4.79x speedup on FLUX.1 and 4.67x speedup on Wan2.1-14B, while maintaining much higher sample quality.
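To make the request concrete, here is a minimal NumPy sketch of the core idea as described in the abstract: treat each cached feature as a function of the diffusion step, fit Chebyshev coefficients by ridge regression, and evaluate the fit at later steps. This is not the authors' implementation. The function names, the mapping of steps to [-1, 1], and the `degree` and `lam` values are all assumptions made for illustration; the linked repository is the reference for the actual method.

```python
import numpy as np
from numpy.polynomial import chebyshev as C


def to_unit_interval(t, t_min, t_max):
    """Map steps in [t_min, t_max] to [-1, 1], the standard Chebyshev domain."""
    return 2.0 * (np.asarray(t, dtype=float) - t_min) / (t_max - t_min) - 1.0


def fit_chebyshev_ridge(t_obs, feats, degree, lam, t_min, t_max):
    """Fit Chebyshev coefficients to cached features with ridge regression.

    t_obs: (n_obs,) steps at which the network was actually evaluated.
    feats: (n_obs, feat_dim) flattened features cached at those steps.
    Returns coefficients of shape (degree + 1, feat_dim).
    """
    x = to_unit_interval(t_obs, t_min, t_max)
    phi = C.chebvander(x, degree)                 # (n_obs, degree + 1)
    gram = phi.T @ phi + lam * np.eye(degree + 1)  # ridge-regularized normal equations
    return np.linalg.solve(gram, phi.T @ feats)


def forecast(coef, t_new, t_min, t_max):
    """Evaluate the fitted expansion at (possibly future) steps."""
    x = to_unit_interval(t_new, t_min, t_max)
    phi = C.chebvander(x, coef.shape[0] - 1)
    return phi @ coef


# Hypothetical usage: 20 sampling steps, network evaluated on the first 10,
# features for the remaining steps forecast instead of computed.
rng = np.random.default_rng(0)
t_obs = np.arange(10)
cached = rng.standard_normal((10, 8))             # stand-in for real features
coef = fit_chebyshev_ridge(t_obs, cached, degree=3, lam=1e-3, t_min=0, t_max=19)
predicted = forecast(coef, np.arange(10, 20), t_min=0, t_max=19)
```

In an integration, `cached` would hold the flattened block outputs from the steps where the model was run, and `predicted` would replace the network call at the skipped steps. How the evaluated steps are chosen and how the fit is refreshed are left to the paper and repository.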
Alternatives you considered
No response
Additional context
No response