HiDream-O1-Image 8B generates pixels directly, skips VAE entirely
Vivago.ai released HiDream-O1-Image, an 8B pixel-space model with no VAE or latent compression, shipping dev and standard checkpoints plus a reasoning-driven prompt agent.
HiDream-O1-Image is an 8B pixel-space image generator from Vivago.ai that skips the VAE entirely, working directly in pixel space without latent compression or decoder artifacts. The model ships in two versions: a dev checkpoint that runs 28 inference steps and a standard build at 50 steps, both supporting resolutions up to 2048px and an inpainting mode. Vivago also released a distilled variant and a reasoning-driven prompt agent that can wire into any OpenAI-compatible endpoint or run locally on Gemma-4-31B-it.
The model circulated anonymously as "Peanut" on the Z-Image arena before Vivago confirmed the identity this week, where it had outscored Qwen and other open generators. HuggingFace cards for both the dev build and the standard checkpoint went live alongside a GitHub repo that includes a WebUI. A community-built FP8 quantization for 16GB VRAM and third-party ComfyUI nodes appeared within hours; Fal.ai already hosts inference endpoints for both versions and the edit mode.
Early testers report fast LoRA training and efficient pixel-space usage, though some note soft texture rendering and skin-tone issues in first outputs. The official HuggingFace Space demo was overloaded at publication. Native ComfyUI support is expected in the coming days, and a 10GB Pinokio launcher is already available. Whether the no-VAE architecture holds up under fine-tuning at scale and whether the reasoning agent delivers meaningful prompt gains remain open questions as the community digs into the weights.
