Code-as-Room converts floor plans to 3D Blender scenes via agentic synthesis
Researchers introduce an MLLM-based framework that converts top-down room images into executable Blender code, addressing instability in existing image-to-3D room generation methods.

Code-as-Room, a multi-stage multimodal large language model (MLLM) framework by Yang, Luo, Gan, Hao, Lu, and Yan, generates 3D indoor scenes by translating floor-plan images into executable Blender Python code. The system parses a top-down room view to extract furniture, walls, and spatial relationships, then synthesizes geometry, materials, and lighting code in a structured pipeline. A cross-stage memory module carries context across generation steps, targeting the infinite-loop and context-forgetting failures that affect existing MLLM agents tasked with holistic room synthesis. The approach is aimed at interior design, VR, gaming, and embodied AI, where applications need more precise spatial control than text prompts can deliver.
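To make the staged design concrete, here is a minimal Python sketch of a pipeline whose stages share one memory object. It is an illustration under stated assumptions, not the authors' implementation: `StageMemory`, the stage functions, `run_pipeline`, and the `max_steps` budget are all hypothetical names, and the parsing stage returns hard-coded data where a real system would query an MLLM. The stages only build Blender code as text, so the sketch runs without Blender installed.

```python
from dataclasses import dataclass, field


@dataclass
class StageMemory:
    """Hypothetical cross-stage memory: facts and code kept between stages."""
    facts: dict = field(default_factory=dict)
    code_parts: list = field(default_factory=list)
    steps_taken: int = 0


def parse_layout(memory):
    # Stand-in for the image-parsing stage; the values are made up.
    memory.facts["objects"] = [
        {"name": "bed", "location": (1.0, 2.0, 0.25)},
        {"name": "desk", "location": (3.0, 0.5, 0.4)},
    ]


def write_geometry(memory):
    # Reuses what the parsing stage stored instead of re-deriving it.
    for obj in memory.facts["objects"]:
        memory.code_parts.append(
            f"bpy.ops.mesh.primitive_cube_add(location={obj['location']})"
        )


def write_materials(memory):
    memory.code_parts.append("wood = bpy.data.materials.new(name='wood')")


def write_lighting(memory):
    memory.code_parts.append(
        "bpy.ops.object.light_add(type='AREA', location=(2.0, 2.0, 2.5))"
    )


def run_pipeline(stages, max_steps=8):
    """Run each stage once, in order, under a hard step budget."""
    memory = StageMemory()
    for stage in stages:
        if memory.steps_taken >= max_steps:
            raise RuntimeError("step budget exhausted")
        stage(memory)
        memory.steps_taken += 1
    return memory


STAGES = [parse_layout, write_geometry, write_materials, write_lighting]
memory = run_pipeline(STAGES)
script = "import bpy\n" + "\n".join(memory.code_parts)
print(script)
```

The fixed stage order and the step budget are one simple way to rule out unbounded loops, and the shared memory object is one simple way to keep later stages from losing earlier results. The preprint's actual mechanisms may differ.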
Rather than outputting mesh files or scene graphs, Code-as-Room represents rooms as Blender code, so edits and iterations can be made programmatically. The preprint also introduces a new benchmark for code-based 3D room synthesis with multiple evaluation protocols, and it reports comparisons against prior agent-based methods that the authors present as validating the execution harness design. The work was posted May 19, 2026.
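A small sketch shows why a code representation makes edits programmatic. It assumes a simplified room made only of boxes. The `Box` class and `emit_blender_script` helper are hypothetical and not taken from the preprint; the emitted lines use `bpy.ops.mesh.primitive_cube_add` and `bpy.context.active_object` from Blender's Python API. Moving a piece of furniture becomes a one-field data change followed by regenerating the script.

```python
from dataclasses import dataclass, replace


@dataclass(frozen=True)
class Box:
    """Hypothetical furniture placeholder: a named, scaled cube."""
    name: str
    location: tuple
    scale: tuple


def emit_blender_script(boxes):
    """Return Blender Python source that recreates the given boxes."""
    lines = ["import bpy", ""]
    for box in boxes:
        lines.append(
            f"bpy.ops.mesh.primitive_cube_add(location={box.location!r})"
        )
        lines.append("obj = bpy.context.active_object")
        lines.append(f"obj.name = {box.name!r}")
        lines.append(f"obj.scale = {box.scale!r}")
    return "\n".join(lines) + "\n"


room = [
    Box("bed", (1.0, 2.0, 0.25), (1.0, 2.0, 0.5)),
    Box("desk", (3.0, 0.5, 0.4), (1.2, 0.6, 0.8)),
]
original = emit_blender_script(room)

# Edit: move the bed, leave everything else untouched, then regenerate.
moved = [
    replace(b, location=(2.5, 2.0, 0.25)) if b.name == "bed" else b
    for b in room
]
edited = emit_blender_script(moved)
print(edited)
```

The generated text is meant to be run inside Blender, for example from its scripting editor. Because both versions are plain source, they can also be diffed or version-controlled like any other code.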