World model hallucination traced to data gaps, fixed with 50 trajectories
UC San Diego researchers identify three hallucination modes in generative world models and develop predictive signals that enable fine-tuning to new environments with minimal real-world data.

Generative world models produce visually plausible video rollouts, but they frequently hallucinate: the rollout drifts from ground-truth physics while remaining visually fluent. A new preprint from Nicklas Hansen and Xiaolong Wang at UC San Diego traces this failure to data-coverage gaps and shows it is both predictable and correctable. The team trained a 350M-parameter world model on MMBench2, a 427-hour dataset of 210 visual tasks with ground-truth actions, rewards, and live simulators. They identified three distinct hallucination modes (perceptual, action-marginalized, and scene-diverging), each rooted in a different pipeline stage, and developed lightweight signals that accurately forecast failure before a rollout completes.
The key finding: hallucination concentrates in low-coverage regions of the state-action space. At training time, a coverage-aware sampling technique closes these gaps. Online, the same predictive signals serve as curiosity rewards, steering data collection toward high-risk regions. The result is a data-efficient fine-tuning recipe: a pretrained world model adapts to entirely new environments with as few as 50 real trajectories. The paper was published on June 26, 2026, with an interactive version at nicklashansen.com/mmbench2.
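
To make the coverage idea concrete, here is a minimal sketch, not the authors' implementation. It assumes each training sample has already been assigned a discrete "cell" id in state-action space (how that assignment is made is left open), and it uses two stand-in techniques: inverse-frequency weighting for coverage-aware sampling, and a count-based novelty bonus in place of the paper's predictive signals. The names coverage_weights, sample_batch, and novelty_bonus are hypothetical.

```python
import math
import random
from collections import Counter


def coverage_weights(cells, alpha=1.0):
    """Per-sample sampling probabilities, inversely related to cell frequency.

    cells: one hashable cell id per training sample (assumed precomputed).
    alpha: 0 gives uniform sampling; 1 gives every cell equal total mass.
    """
    counts = Counter(cells)
    raw = [counts[c] ** -alpha for c in cells]
    total = sum(raw)
    return [w / total for w in raw]


def sample_batch(items, cells, batch_size, alpha=1.0, rng=None):
    """Draw a batch (with replacement) that oversamples rarely seen cells."""
    rng = rng or random.Random(0)
    weights = coverage_weights(cells, alpha)
    return rng.choices(items, weights=weights, k=batch_size)


def novelty_bonus(cell, counts):
    """Count-based curiosity bonus: larger for cells visited less often.

    A stand-in for a learned hallucination-risk signal.
    """
    return 1.0 / math.sqrt(1 + counts.get(cell, 0))


if __name__ == "__main__":
    # Nine samples fall in a well-covered cell, one in a rare cell.
    cells = ["common"] * 9 + ["rare"]
    items = list(range(len(cells)))
    weights = coverage_weights(cells)
    print("probability of drawing the rare sample:", weights[-1])
    print("example batch:", sample_batch(items, cells, batch_size=8))
    print("bonus for an unseen cell:", novelty_bonus("new", Counter(cells)))
```

With alpha set to 1, the single rare sample carries as much total probability as the nine common ones combined, which is the sense in which such a sampler evens out coverage. Lower values of alpha interpolate back toward uniform sampling.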



