Grok Imagine Video 1.5 exits preview with 720p generation and native audio
xAI's Grok Imagine Video 1.5 exits preview with 720p video generation, built-in audio synthesis, and a Fast variant that renders in 25 seconds.

xAI's Grok Imagine Video 1.5 left preview this week. The model generates 480p or 720p clips up to 15 seconds long from a static image and a text prompt. It renders at 24fps with native audio synthesis: sound effects, background ambience, and lip-synced dialogue are baked directly into the output. A Fast variant completes 720p generation in 25 seconds, trading a small amount of quality for speed.
The model supports 16:9, 9:16, 1:1, and other aspect ratios. An Extend from Frame feature lets users pick any frame of a finished clip and continue the sequence forward from it. xAI's marketing advertises "30-second video", a figure reached by chaining these passes; the API documentation confirms that the base limit is 15 seconds per generation.
Spatial audio shifts as objects move through the frame, and microexpression control renders subtle facial movements to match emotional tone. Both the standard and Fast versions are available through the xAI API and the web interface. Unlike comparable models from Runway, Pika, and Stability AI, Grok Imagine Video 1.5 synthesizes audio natively instead of requiring separate post-production mixing.
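The arithmetic behind the 15-second limit and the "30-second" claim can be sketched in a few lines of Python. This is an illustration only: it does not call the xAI API, the function names are hypothetical, and it assumes each Extend from Frame pass continues from the final frame of the previous clip and adds up to 15 seconds with no overlap. The 24fps and 15-second figures come from the article.

```python
import math

FPS = 24               # frame rate reported for the model
MAX_CLIP_SECONDS = 15  # per-generation limit reported for the model


def frames_in_clip(seconds, fps=FPS):
    """Number of frames in a clip of the given length."""
    return round(seconds * fps)


def passes_needed(target_seconds, max_clip=MAX_CLIP_SECONDS):
    """Generation passes required to reach a target length.

    Assumes every pass after the first extends from the last frame
    of the previous clip (an assumption, not documented behavior).
    """
    if target_seconds <= 0:
        raise ValueError("target_seconds must be positive")
    return math.ceil(target_seconds / max_clip)


def plan_segments(target_seconds, max_clip=MAX_CLIP_SECONDS):
    """Split a target length into per-pass segment lengths."""
    if target_seconds <= 0:
        raise ValueError("target_seconds must be positive")
    segments = []
    remaining = target_seconds
    while remaining > 0:
        length = min(remaining, max_clip)
        segments.append(length)
        remaining -= length
    return segments


if __name__ == "__main__":
    print(frames_in_clip(15))   # frames in one full-length clip
    print(passes_needed(30))    # passes for the advertised 30 seconds
    print(plan_segments(30))    # segment lengths per pass
```

Under these assumptions, a 30-second video takes two passes, each a full 15-second generation of 360 frames. Extending from an earlier frame instead of the last one would shorten the total accordingly.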



