AnyFlow lets video models generate at any step count without quality loss
NVIDIA's AnyFlow technique adapts video diffusion models to run at arbitrary sampling step counts while maintaining quality. The method has been integrated into WAN 2.1, an open-weight video model supporting text-to-video, image-to-video, and video-to-video generation.
NVIDIA released AnyFlow, a flow-map-based adaptation technique that lets video diffusion models generate at any number of sampling steps without the quality degradation typical of low-step inference. The method works with both causal and bidirectional video diffusion architectures and ships with this week's WAN 2.1 release.
AnyFlow addresses a core tradeoff in diffusion video generation: more sampling steps yield better quality but slower inference, while fewer steps speed up generation at the cost of visual fidelity. The technique uses flow maps to dynamically adjust the denoising schedule, allowing a single model to produce usable output at 10 steps or refine it further at 50 steps without retraining separate checkpoints for each step budget.
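As a concrete illustration, here is a minimal flow-map sampler sketched in PyTorch. This is not NVIDIA's AnyFlow code: the `ToyFlowMap` network, the `sample` helper, and the uniform time grid are all assumptions made for illustration. The structural point is that a single network f(x, t, s), trained to transport a latent from time t to time s, can be queried on any partition of [1, 0], which is what makes the step count a free inference-time choice.

```python
# Minimal flow-map sampling sketch (illustrative only, NOT AnyFlow's code).
# A flow-map network f(x, t, s) jumps a noisy latent from time t to time s
# directly, so one set of weights serves any number of sampling steps.
import torch
import torch.nn as nn


class ToyFlowMap(nn.Module):
    """Hypothetical stand-in for a flow-map network f(x, t, s).

    An untrained MLP here; a real video model would be a large
    (causal or bidirectional) diffusion transformer.
    """

    def __init__(self, dim: int = 64):
        super().__init__()
        self.net = nn.Sequential(
            nn.Linear(dim + 2, 256), nn.SiLU(), nn.Linear(256, dim)
        )

    def forward(self, x, t, s):
        # Condition on both the current time t and the target time s.
        tt = t.expand(x.shape[0], 1)
        ss = s.expand(x.shape[0], 1)
        return self.net(torch.cat([x, tt, ss], dim=-1))


@torch.no_grad()
def sample(model, shape, num_steps):
    """Run the flow map from t=1 (pure noise) to t=0 (data) in
    `num_steps` hops. The same weights serve any step budget."""
    x = torch.randn(shape)                          # start from noise
    ts = torch.linspace(1.0, 0.0, num_steps + 1)    # time grid 1 -> 0
    for i in range(num_steps):
        t, s = ts[i].view(1, 1), ts[i + 1].view(1, 1)
        x = model(x, t, s)                          # jump t -> s directly
    return x


model = ToyFlowMap()
draft = sample(model, (4, 64), num_steps=10)   # fast, coarse pass
final = sample(model, (4, 64), num_steps=50)   # more refinement hops
print(draft.shape, final.shape)
```

Because each call jumps directly between two times, a 10-step run and a 50-step run differ only in how finely the interval is partitioned, not in which checkpoint is loaded.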
What stands out
- Arbitrary step budgets — Quality scales smoothly as step count increases, rather than requiring fixed step targets (e.g. 20, 50, 100) baked into the model at training time.
- Three generation modes in one model — Text-to-video, image-to-video, and video-to-video all run through the same AnyFlow-adapted WAN 2.1 weights, eliminating the need for separate checkpoints per task.
- Causal and bidirectional support — The flow-map approach works with autoregressive (causal) video models that generate frame-by-frame and bidirectional models that denoise entire clips at once.
- Open weights on WAN 2.1 — The AnyFlow implementation ships with WAN 2.1, an open-weight video diffusion model. Practitioners can run it locally and pick a step count at inference time to trade speed against quality (see the usage sketch after this list).
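To make that last point concrete, here is a usage sketch assuming a diffusers-style interface. WanPipeline is the existing diffusers class for WAN 2.1, but the AnyFlow-adapted checkpoint path below is a placeholder; the actual release artifact names are not confirmed here.

```python
# Hypothetical usage sketch, assuming the diffusers WanPipeline interface.
# The checkpoint path is a placeholder, not a confirmed release name.
import torch
from diffusers import WanPipeline
from diffusers.utils import export_to_video

pipe = WanPipeline.from_pretrained(
    "path/to/anyflow-wan2.1",   # placeholder: AnyFlow-adapted WAN 2.1 weights
    torch_dtype=torch.bfloat16,
).to("cuda")

prompt = "a red fox running through fresh snow, cinematic lighting"

# Same weights, different speed/quality trade-offs chosen per call.
draft = pipe(prompt, num_inference_steps=10).frames[0]   # quick preview
final = pipe(prompt, num_inference_steps=50).frames[0]   # higher fidelity

export_to_video(draft, "draft.mp4", fps=16)
export_to_video(final, "final.mp4", fps=16)
```

The key property is that num_inference_steps is an ordinary call-time argument: there is no separate distilled checkpoint per step budget.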
