Spherical geometry fixes latent diffusion path inefficiency
Researchers replace linear interpolation with geodesic paths in latent diffusion, improving ImageNet-256 FID without architectural changes.

A new preprint from Boğaziçi University and Koç University identifies and fixes a geometric flaw in latent diffusion models. Standard flow matching transports Gaussian noise to VAE latents along straight Euclidean paths, but both endpoints concentrate in thin spherical shells, so a linear chord between them cuts through the empty space in between. By decomposing each latent token into radial and angular components, the authors find that decoded perceptual and semantic content lives almost entirely in direction, with radius contributing little.
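The geometric argument can be checked numerically. The sketch below is not the authors' code: the helper name radial_angular, the token dimension d = 256, and the sample count are all illustrative assumptions. It splits each token into a radius and a unit direction, measures how tightly Gaussian samples cluster around a single radius, and then measures how far the midpoint of a straight chord between two points on the same sphere falls below that radius.

```python
import numpy as np

def radial_angular(z, eps=1e-8):
    """Split each token (last axis) into a radius and a unit direction."""
    radius = np.linalg.norm(z, axis=-1, keepdims=True)
    direction = z / np.maximum(radius, eps)
    return radius, direction

rng = np.random.default_rng(0)
d = 256  # hypothetical token dimension, chosen only to make the effect visible
noise = rng.standard_normal((2000, d))

radius, direction = radial_angular(noise)

# Relative spread of the radius: small when d is large, i.e. a thin shell.
rel_spread = radius.std() / radius.mean()

# Place two independent sets of points on one sphere of radius sqrt(d)
# and take the midpoint of the straight chord between each pair.
r = np.sqrt(d)
a = r * direction[:1000]
b = r * direction[1000:]
midpoint = 0.5 * (a + b)

# Independent high-dimensional directions are nearly orthogonal, so the
# midpoint norm is about r / sqrt(2): well inside the shell.
mid_ratio = np.linalg.norm(midpoint, axis=-1).mean() / r

print(f"relative radius spread: {rel_spread:.3f}")
print(f"chord midpoint norm / r: {mid_ratio:.3f}")
```

The shell gets thinner as the dimension grows, so how pronounced the effect is for a real tokenizer depends on its token dimension, which this sketch does not model.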
They project data latents onto a fixed token radius, use the radial projection of Gaussian noise as the spherical prior, and replace linear interpolation with spherical linear interpolation (slerp). The resulting geodesic paths stay on the sphere at every timestep, with velocity targets purely angular by construction. Under matched training, the method consistently improves class-conditional ImageNet-256 FID across different image tokenizers. The approach leaves the diffusion architecture unchanged, requires no auxiliary encoder or representation-alignment objective, and introduces only a geometric correction to the interpolation scheme. The preprint was posted May 15, 2026.
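A minimal sketch of the interpolation described above, using the standard slerp formula. It is an illustration under stated assumptions rather than the preprint's implementation: the function names project_to_sphere and slerp_path, the radius value, the token shape, and the convention that t = 0 is noise and t = 1 is data are all hypothetical choices made here.

```python
import numpy as np

def project_to_sphere(x, r, eps=1e-8):
    """Rescale each token (last axis) to have norm r."""
    norm = np.linalg.norm(x, axis=-1, keepdims=True)
    return r * x / np.maximum(norm, eps)

def slerp_path(x0, x1, t, eps=1e-6):
    """Slerp between x0 and x1, which must lie on the same sphere.

    Returns the point x_t on the geodesic and its time derivative v_t.
    t is a scalar or an array broadcastable against shape (..., 1).
    """
    r0 = np.linalg.norm(x0, axis=-1, keepdims=True)
    r1 = np.linalg.norm(x1, axis=-1, keepdims=True)
    cos_theta = np.sum(x0 * x1, axis=-1, keepdims=True) / (r0 * r1)
    # Clip away from +/-1 so that sin(theta) never reaches zero.
    theta = np.arccos(np.clip(cos_theta, -1.0 + eps, 1.0 - eps))
    s = np.sin(theta)
    x_t = (np.sin((1.0 - t) * theta) * x0 + np.sin(t * theta) * x1) / s
    v_t = theta * (-np.cos((1.0 - t) * theta) * x0 + np.cos(t * theta) * x1) / s
    return x_t, v_t

rng = np.random.default_rng(0)
r = 8.0                                   # hypothetical fixed token radius
latents = rng.standard_normal((4, 16, 64))  # stand-in for VAE latent tokens
x1 = project_to_sphere(latents, r)        # data projected onto the sphere
x0 = project_to_sphere(rng.standard_normal(x1.shape), r)  # spherical prior

x_t, v_t = slerp_path(x0, x1, t=0.3)

# The path stays on the sphere, and the velocity has no radial component.
print(np.abs(np.linalg.norm(x_t, axis=-1) - r).max())
print(np.abs(np.sum(x_t * v_t, axis=-1)).max())
```

Both printed values should be at floating-point noise level: the interpolated point keeps norm r at every t, and the velocity is orthogonal to it, which is what "purely angular by construction" means here. In a flow-matching setup, v_t would serve as the regression target for the network evaluated at x_t.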