10Eros fine-tune takes 10 minutes per 19-second clip on RTX 5070 Ti; quantized version still missing
A ComfyUI user reports the 10Eros fine-tune of LTX-2.3 requires full CPU-to-GPU weight streaming on 16GB cards, with no quantized version available yet.
The 10Eros fine-tune of Lightricks' LTX-2.3 video model is hitting a hard wall on 16GB cards. A ComfyUI user running the LikenessGuideHelper I2V v3.2 workflow on an RTX 5070 Ti reports 10-minute generation times for 19-second clips at 1000×1744 resolution, with the entire 29GB fp8 checkpoint offloaded to system RAM. ComfyUI logs show zero megabytes actually loaded on the GPU and 1,660 lowvram patches applied per step, forcing every weight access through async CPU-to-GPU streaming. The first 13-step pass alone takes 4 minutes 15 seconds, with tiled upscale adding another 2 minutes.
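For scale, the reported timings can be broken down with simple arithmetic. The sketch below uses only the figures quoted above; the variable names are illustrative, and the 10-minute total is treated as exact even though the report gives it as a round number.

```python
# Figures quoted in the report; variable names are illustrative.
first_pass_s = 4 * 60 + 15   # first pass: 4 min 15 s
first_pass_steps = 13        # sampling steps in the first pass
upscale_s = 2 * 60           # tiled upscale: about 2 min
total_s = 10 * 60            # reported end-to-end time (assumed exact here)
clip_s = 19                  # length of the generated clip in seconds

sec_per_step = first_pass_s / first_pass_steps
unaccounted_s = total_s - first_pass_s - upscale_s
slowdown = total_s / clip_s  # seconds of compute per second of video

print(f"{sec_per_step:.1f} s per sampling step")
print(f"{unaccounted_s} s of the total not itemized in the report")
print(f"{slowdown:.1f}x slower than real time")
```

That works out to roughly 19.6 seconds per step in the first pass, with about 225 seconds of the total not itemized in the report.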
The base LTX-2.3 NVFP4 checkpoint from Lightricks would fit in 16GB of VRAM and likely halve generation time, but users choose 10Eros specifically for its fine-tune quality over the official release. No quantized NVFP4 or NF4 version of the 10Eros weights has surfaced yet. The user has already enabled sage attention, fp8 matrix multiplication, three async offload streams, pinned memory on 55GB of system RAM, mmap loading, and channels-last layout in ComfyUI 0.21.1 with PyTorch 2.11+cu130. The 10Eros checkpoint ships in fp8 mixed precision and, at 29GB, is too large to load fully into VRAM on 16GB cards; it also exceeds the 24GB found on higher-end consumer cards.
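A quick size check shows why the checkpoint cannot sit in VRAM on this class of card. This is a minimal sketch: the `fits_in_vram` helper and the list of VRAM tiers are assumptions made for illustration, and the check counts weights only, ignoring activations, the VAE, and text encoders.

```python
CHECKPOINT_GB = 29  # fp8 checkpoint size quoted in the report


def fits_in_vram(checkpoint_gb: float, vram_gb: float, headroom_gb: float = 0.0) -> bool:
    """Return True if the weights fit while leaving headroom_gb free for everything else."""
    return checkpoint_gb + headroom_gb <= vram_gb


# VRAM tiers chosen for illustration; they are not taken from the report.
for vram_gb in (16, 24, 32):
    verdict = "fits" if fits_in_vram(CHECKPOINT_GB, vram_gb) else "does not fit"
    print(f"{vram_gb}GB card: {verdict}")
```

Passing a nonzero `headroom_gb` gives a stricter estimate, since real workloads need working memory beyond the weights themselves.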
