ComfyUI job times climb nearly 3× over successive runs; restart restores speed
A ComfyUI user running Wan2.2 workflows reports job completion times climbing from 900 seconds to 2500 seconds across successive runs, with speed restored only after restarting the application.
ComfyUI users running memory-intensive workflows are reporting progressive slowdowns that disappear only when the application is restarted. One practitioner working with Wan2.2 workflows documented job times climbing from 900 seconds on first boot to 2500 seconds after multiple runs, with the pattern repeating across sessions.
The behavior suggests VRAM exhaustion rather than a classic memory leak. The user's system shows Python holding 11GB of dedicated VRAM and 5–20GB of shared GPU memory (backed by system RAM), with the ComfyUI process itself using 0.5–1.2GB of VRAM—a configuration that maxes out available dedicated memory. When dedicated VRAM fills, the driver falls back to shared system memory, which is substantially slower for GPU-accelerated inference. As successive jobs push more data into shared memory, job times stretch from the initial 900-second baseline toward 2500 seconds.
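If the failure mode really is dedicated VRAM filling up and spilling into shared memory, logging dedicated usage after each job should show it climbing run over run. A minimal Python sketch, assuming an NVIDIA GPU with `nvidia-smi` on the PATH (the helper names here are hypothetical, not part of ComfyUI):

```python
import subprocess

def parse_vram_csv(line: str) -> tuple[int, int]:
    """Parse one 'used, total' CSV line from nvidia-smi, e.g. '11264, 12288'."""
    first = line.strip().splitlines()[0]
    used, total = (int(field.strip()) for field in first.split(","))
    return used, total

def query_vram_mib() -> tuple[int, int]:
    """Return (used, total) dedicated VRAM in MiB for GPU 0.

    Assumes nvidia-smi is installed; call this between queued jobs
    to watch for dedicated memory creeping toward the total.
    """
    out = subprocess.check_output(
        ["nvidia-smi", "--query-gpu=memory.used,memory.total",
         "--format=csv,noheader,nounits"],
        text=True,
    )
    return parse_vram_csv(out)
```

Logging `query_vram_mib()` before and after each job would distinguish steady spillover growth from a one-off allocation spike.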
ComfyUI's built-in "cleanup VRAM usage" right-click option has not reliably restored speed in this case. Restarting the application clears the slowdown immediately, returning job times to the 900-second baseline for the next 10+ runs. The pattern is consistent with fragmented VRAM allocation that ComfyUI's cleanup routine does not fully reclaim—a known challenge in long-running inference sessions with large multimodal models. Wan2.2 itself is an open-weight video diffusion model from Alibaba that supports high-resolution synthesis and runs locally without API restrictions.
Wan2.2's VRAM footprint makes it a stress test for workflow managers like ComfyUI, which must juggle multiple model components—text encoders, VAEs, diffusion backbones—across limited GPU memory. Users running similar workflows may see the same slowdown pattern and should monitor VRAM allocation across queued jobs.
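Until the cleanup routine reclaims memory reliably, the practical mitigation reported here is restarting once jobs slow down. That decision can be automated by tracking job durations against the first-run baseline. A rough sketch (the class name and the 1.5× threshold are assumptions, not ComfyUI features):

```python
class SlowdownWatchdog:
    """Flag a restart when job times drift past a multiple of the baseline.

    Hypothetical helper: the first recorded job time becomes the baseline,
    and any later job exceeding baseline * threshold signals that the
    process should be restarted to reclaim VRAM.
    """

    def __init__(self, threshold: float = 1.5):
        self.baseline: float | None = None
        self.threshold = threshold

    def record(self, seconds: float) -> bool:
        """Record one job's duration; return True when a restart looks warranted."""
        if self.baseline is None:
            self.baseline = seconds
            return False
        return seconds > self.baseline * self.threshold
```

With the numbers reported here, a 900-second first run sets the baseline, and a 2500-second run trips the watchdog well past the 1350-second threshold.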
