Lite3R cuts 3D reconstruction latency by up to 2× with sparse attention and FP8 training
Lite3R is a model-agnostic framework that replaces dense multi-view attention with sparse linear attention and FP8-aware quantization-aware training, delivering 1.7-2.0× latency reduction and 1.9-2.4× memory savings on BlendedMVS and DTU64 benchmarks while preserving reconstruction quality.
Lite3R is a model-agnostic teacher-student framework that addresses efficiency bottlenecks in transformer-based 3D reconstruction. The framework replaces dense multi-view attention with Sparse Linear Attention to preserve geometric interactions while cutting token-mixing overhead, and introduces a parameter-efficient FP8-aware quantization-aware training (FP8-aware QAT) strategy that freezes pretrained backbone parameters and trains only lightweight linear-branch projection layers. The result is stable low-precision deployment that retains pretrained geometric priors without destabilizing geometry-sensitive representations.
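The article does not reproduce implementation details, but the core idea of swapping dense multi-view attention for a linear-attention token mixer can be illustrated with a minimal sketch. The block below uses a standard kernelized linear-attention formulation (a positive feature map applied to queries and keys so the key-value product can be aggregated once), which reduces token mixing from quadratic to linear in the number of multi-view tokens. The module name, feature map, and tensor layout are illustrative assumptions, not Lite3R's actual code.

```python
import torch
import torch.nn as nn
import torch.nn.functional as F


class LinearAttention(nn.Module):
    """Kernelized linear attention as a stand-in for dense multi-view attention.
    Hypothetical sketch; the exact sparse linear attention used by Lite3R may differ."""

    def __init__(self, dim: int, num_heads: int = 8):
        super().__init__()
        self.num_heads = num_heads
        self.head_dim = dim // num_heads
        self.qkv = nn.Linear(dim, dim * 3, bias=False)
        self.proj = nn.Linear(dim, dim)

    @staticmethod
    def feature_map(x: torch.Tensor) -> torch.Tensor:
        # Positive feature map (elu + 1) keeps the kernelized attention weights non-negative.
        return F.elu(x) + 1.0

    def forward(self, x: torch.Tensor) -> torch.Tensor:
        # x: (batch, tokens, dim), where tokens are pooled across all views.
        b, n, d = x.shape
        qkv = self.qkv(x).reshape(b, n, 3, self.num_heads, self.head_dim)
        q, k, v = qkv.unbind(dim=2)                      # each (b, n, heads, head_dim)
        q, k = self.feature_map(q), self.feature_map(k)

        # Aggregate K^T V once, then apply it to every query:
        # O(n * d^2) cost instead of the O(n^2 * d) of dense attention.
        kv = torch.einsum("bnhd,bnhe->bhde", k, v)
        z = 1.0 / (torch.einsum("bnhd,bhd->bnh", q, k.sum(dim=1)) + 1e-6)
        out = torch.einsum("bnhd,bhde,bnh->bnhe", q, kv, z)
        return self.proj(out.reshape(b, n, d))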
Transformer-based 3D reconstruction models have proven effective at recovering geometry and appearance from multi-view observations, but scaling to larger backbones and higher-resolution inputs creates two coupled challenges. Dense multi-view attention generates substantial computational overhead, and low-precision execution can destabilize depth, pose, and 3D consistency. Lite3R's sparse attention mechanism and partial attention distillation strategy tackle both problems simultaneously, enabling practical deployment of large-scale 3D reconstruction pipelines.
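As a rough picture of how the teacher-student setup could be wired, the sketch below matches the sparse-attention student's block outputs against the frozen dense-attention teacher's outputs on a chosen subset of layers. The function name, the choice of distilling layer outputs rather than attention maps, and the per-layer weighting are all assumptions made for illustration; the paper's partial attention distillation objective may be defined differently.

```python
import torch
import torch.nn.functional as F


def partial_attention_distillation_loss(student_feats, teacher_feats, layer_weights=None):
    """Hypothetical partial distillation objective: align student and teacher
    features only on the distilled subset of attention blocks.

    student_feats / teacher_feats: lists of (batch, tokens, dim) tensors taken
    from the selected layers, in matching order.
    """
    if layer_weights is None:
        layer_weights = [1.0] * len(student_feats)
    loss = torch.zeros((), device=student_feats[0].device)
    for w, s, t in zip(layer_weights, student_feats, teacher_feats):
        # Teacher runs frozen in full precision; detach to keep gradients student-only.
        loss = loss + w * F.mse_loss(s, t.detach())
    return loss / sum(layer_weights)
```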
On benchmarks
The authors evaluated Lite3R on two representative backbones—VGGT and DA3-Large—across BlendedMVS and DTU64 datasets. The framework delivered 1.7-2.0× latency reduction and 1.9-2.4× memory savings compared to dense attention baselines while preserving competitive reconstruction quality overall. The FP8-aware QAT approach trains only a small fraction of model parameters, making it parameter-efficient and suitable for fine-tuning pretrained models without full retraining.
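To make the parameter-efficiency claim concrete, here is a minimal sketch of what such an FP8-aware QAT recipe could look like in PyTorch, assuming a build with native float8 dtypes: activations or weights are fake-quantized to FP8 E4M3 in the forward pass with a straight-through estimator, the pretrained backbone is frozen, and only parameters whose names match a projection-layer keyword stay trainable. The keyword, helper names, and dtype choice are placeholders, not details confirmed by the paper.

```python
import torch
import torch.nn as nn


def fake_quant_fp8_e4m3(x: torch.Tensor) -> torch.Tensor:
    """Simulate an FP8 (E4M3) round-trip in the forward pass while keeping
    full-precision gradients via a straight-through estimator. Assumes
    torch.float8_e4m3fn is available (PyTorch 2.1+)."""
    x_q = x.to(torch.float8_e4m3fn).to(x.dtype)
    return x + (x_q - x).detach()


def freeze_backbone_keep_projections(model: nn.Module, trainable_keyword: str = "proj"):
    """Freeze every pretrained parameter except lightweight projection layers.
    `trainable_keyword` is a placeholder for however the linear-branch
    projections are named in a given backbone."""
    for name, p in model.named_parameters():
        p.requires_grad = trainable_keyword in name
    trainable = sum(p.numel() for p in model.parameters() if p.requires_grad)
    total = sum(p.numel() for p in model.parameters())
    print(f"trainable parameters: {trainable}/{total} ({100 * trainable / total:.2f}%)")
```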
Code and model weights are available on GitHub, with additional details on the project website.
