Russian animator generates 2D music video entirely in ComfyUI Cloud using Flux, Nano Banana, and Seedance
Savva Zhuravlev created a full music video for Venya D'rkin's "O Dushe" using ComfyUI Cloud's 96GB GPU tier, layering Flux 1 dev backgrounds, Nano Banana character frames, and Seedance motion interpolation to work around local hardware limits.
Savva Zhuravlev released a music video this week for Russian singer Venya D'rkin's track "O Dushe," with every frame generated in ComfyUI using a mix of open-weight and proprietary image and video models. Zhuravlev, who studied traditional frame-by-frame animation at art school, turned to AI generation after his local 8GB VRAM setup proved insufficient. He rented ComfyUI Cloud's 96GB GPU tier to run the workflow, saying the cost matched Runway's API pricing while the setup gave him significantly more control over the pipeline.
The video combines Flux 1 dev for background illustration, Nano Banana for character frames, and Seedance for motion interpolation. Zhuravlev said Flux 1 dev produced "incredibly illustrative" backgrounds when it worked but struggled with style transfer via IP-Adapter and reference images, forcing him to route frames through other models. Nano Banana handled character consistency better than Flux for his 2D animation aesthetic, though both models introduced artifacts. Gemini 3 Pro preserved style most reliably across keyframes, he said, with fewer spontaneous changes than the other models, although it still added volume and painterly texture when given complex poses.
Pipeline and model trade-offs
Zhuravlev generated keyframes in Flux 1 dev and Nano Banana, then used Qwen-Image, and occasionally Nano Banana again, for multi-angle shots of the same location. Seedance animated the motion between keyframes, which simplified the process despite occasional "plastic" deformation. He noted that ComfyUI Cloud strips out many custom nodes available in local installs but said it still beat Higgsfield for flexibility at a similar cost. The final edit was assembled in traditional video software; all frames came from AI generation.
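As a rough illustration, the stages described above can be written down as a small data structure. This is a minimal sketch, not Zhuravlev's actual workflow: the `Stage` class, the stage names, and the helper functions are hypothetical, and the code calls no ComfyUI, Flux, or Seedance API. It only records which model the article assigns to which step.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Stage:
    """One step of the pipeline as described in the article (hypothetical type)."""
    name: str
    models: tuple[str, ...]  # models named for this step; empty if none
    produces: str


# Stage order and model assignments follow the article's description only.
PIPELINE = (
    Stage("keyframes", ("Flux 1 dev", "Nano Banana"), "still keyframes"),
    Stage("multi-angle shots", ("Qwen-Image", "Nano Banana"),
          "alternate views of the same location"),
    Stage("animation", ("Seedance",), "motion between keyframes"),
    Stage("final edit", (), "assembled video"),
)


def models_used(pipeline):
    """Return each model once, in order of first appearance."""
    seen = []
    for stage in pipeline:
        for model in stage.models:
            if model not in seen:
                seen.append(model)
    return seen


def describe(pipeline):
    """Render the pipeline as a numbered, human-readable outline."""
    lines = []
    for index, stage in enumerate(pipeline, start=1):
        tools = ", ".join(stage.models) if stage.models else "traditional video software"
        lines.append(f"{index}. {stage.name}: {tools} -> {stage.produces}")
    return "\n".join(lines)


if __name__ == "__main__":
    print(describe(PIPELINE))
```

Listing the stages this way makes the hand-offs explicit: two image models feed keyframes, a third fills in extra angles, and only the last step happens outside AI generation.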
For his next project, Zhuravlev said he would train a LoRA on Flux or a flexible SDXL checkpoint for tighter character control and test Wan for animation instead of Seedance. He didn't disclose total render time but said much of the work went into learning the pipeline rather than into optimized production.







