Warp-as-History unlocks zero-shot camera control in frozen video models
Researchers propose a training-free method that repurposes a video model's history pathway to follow camera trajectories, then fine-tunes on one annotated video for generalization.

Warp-as-History, a technique from Yifan Wang and Tong He, enables camera-controlled video generation without training, architectural changes, or test-time optimization. The method feeds camera-warped pseudo-history into a frozen video model's visual-history pathway, treating camera motion as a history-warping operation the model already understands. Given a target camera trajectory, the system warps earlier frames according to the prescribed viewpoint changes, aligns their positional encodings with those of the target frames being denoised, filters out tokens that lack a valid source observation, and routes the warped history through the model's existing temporal pathway as if it were genuinely observed frames.
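The core warp-and-filter step can be illustrated with a depth-based forward reprojection. The sketch below is a minimal, hypothetical rendition of the idea, not the authors' implementation: it assumes per-pixel depth is available, and the function names, nearest-neighbor splatting, and 4x4 relative-pose convention are illustrative choices.

```python
import numpy as np

def warp_frame(frame, depth, K, rel_pose):
    """Forward-warp `frame` into a new camera view given per-pixel depth,
    intrinsics K (3x3), and the 4x4 relative pose (source -> target).
    Returns the warped frame plus a validity mask marking pixels that
    received at least one source observation; unmasked tokens would be
    filtered out before being fed as pseudo-history."""
    H, W = depth.shape
    ys, xs = np.meshgrid(np.arange(H), np.arange(W), indexing="ij")
    pix = np.stack([xs, ys, np.ones_like(xs)], -1).reshape(-1, 3).astype(float)
    # Back-project pixels to 3-D points in the source camera frame.
    cam = (np.linalg.inv(K) @ pix.T) * depth.reshape(1, -1)
    cam_h = np.vstack([cam, np.ones((1, cam.shape[1]))])
    # Move the points into the target camera frame.
    tgt = (rel_pose @ cam_h)[:3]
    # Project back to target-view pixel coordinates.
    proj = K @ tgt
    uv = (proj[:2] / np.clip(proj[2:], 1e-6, None)).T
    warped = np.zeros_like(frame)
    mask = np.zeros((H, W), dtype=bool)
    u = np.round(uv[:, 0]).astype(int)
    v = np.round(uv[:, 1]).astype(int)
    # Keep only points that land inside the frame with positive depth.
    ok = (u >= 0) & (u < W) & (v >= 0) & (v < H) & (tgt[2] > 0)
    warped[v[ok], u[ok]] = frame.reshape(-1, 3)[ok]
    mask[v[ok], u[ok]] = True
    return warped, mask
```

Under an identity relative pose the warp is a no-op and every pixel stays valid; under a real trajectory, disoccluded regions get `mask == False` and are dropped from the pseudo-history rather than hallucinated.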
Existing camera-control methods typically require post-training on large-scale camera-annotated datasets or shift the computational cost to test-time guidance and per-video optimization. Warp-as-History sidesteps both bottlenecks by revealing what the authors describe as a non-trivial zero-shot capability in frozen video models. A lightweight LoRA fine-tune on a single camera-annotated video further improves camera adherence, visual quality, and motion dynamics, and generalizes to unseen videos without per-target adaptation or test-time optimization. The preprint, posted to HuggingFace Papers in May 2026, reports experiments across diverse datasets supporting the approach.
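The LoRA fine-tune keeps the base model frozen and trains only a low-rank residual on selected weight matrices. The class below is a generic, hypothetical sketch of that mechanism (standard LoRA with zero-initialized `B`, so training starts exactly at the frozen model's behavior); the rank, scaling, and layer choice are assumptions, not details from the paper.

```python
import numpy as np

class LoRALinear:
    """A frozen weight W plus a trainable low-rank update scale * (B @ A).
    Only A and B would be optimized during fine-tuning; W never changes."""

    def __init__(self, W, r=4, alpha=8, seed=0):
        rng = np.random.default_rng(seed)
        self.W = W                                  # frozen base weight (d_out, d_in)
        d_out, d_in = W.shape
        self.A = rng.normal(0.0, 0.01, (r, d_in))   # trainable down-projection
        self.B = np.zeros((d_out, r))               # trainable up-projection, zero-init
        self.scale = alpha / r

    def __call__(self, x):
        # Base path plus low-rank correction; with B = 0 this equals x @ W.T.
        return x @ self.W.T + self.scale * (x @ self.A.T) @ self.B.T
```

Because `B` starts at zero, the adapted layer initially reproduces the frozen model exactly, and the fine-tune only learns a small correction, which is why a single annotated video can suffice without degrading the base model's priors.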