DiffusionGemma-26B matches autoregressive Gemma on medical VQA, decodes 3.5× faster

A diffusion language model fine-tuned for radiology reports performs on par with its autoregressive sibling while offering bidirectional infill and faster inference.

By Alex Sokoloff · July 3, 2026

A preprint published this week demonstrates that diffusion language models can match autoregressive performance on medical visual question answering while unlocking editing workflows that left-to-right generation cannot support. Researchers fine-tuned DiffusionGemma-26B, a mixture-of-experts diffusion model, and benchmarked it against Gemma-4-26B using an identical LoRA recipe on medical VQA datasets. Both models were scored by an LLM judge designed to handle verbosity differences.

Diffusion matched or exceeded the autoregressive baseline across all datasets. The fine-tuned model activates 3.8 billion parameters and decodes 3.5 to 4.4 times faster than its autoregressive counterpart. Despite its smaller active parameter count, the diffusion model is competitive with frontier vision-language models.
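To put the reported figures in proportion, the short calculation below works out what share of the model is active per forward pass and what the quoted speedup range would mean for a single decode. It is a back-of-the-envelope sketch, not a benchmark: the 26 billion total is taken from the model name, and the 10-second autoregressive decode time is a hypothetical value chosen only for illustration.

```python
# Figures reported in the article.
TOTAL_PARAMS_B = 26.0        # total parameters in billions (from the model name)
ACTIVE_PARAMS_B = 3.8        # parameters active per forward pass, in billions
SPEEDUP_RANGE = (3.5, 4.4)   # decode speedup over the autoregressive model

# Share of the model that participates in each forward pass.
active_fraction = ACTIVE_PARAMS_B / TOTAL_PARAMS_B


def diffusion_latency(ar_seconds: float, speedup: float) -> float:
    """Decode time implied by a given speedup over an autoregressive baseline."""
    return ar_seconds / speedup


# ASSUMPTION: a hypothetical 10-second autoregressive decode, for illustration only.
AR_SECONDS = 10.0
slowest = diffusion_latency(AR_SECONDS, SPEEDUP_RANGE[0])
fastest = diffusion_latency(AR_SECONDS, SPEEDUP_RANGE[1])

print(f"active fraction: {active_fraction:.1%}")
print(f"implied decode time: {fastest:.2f}s to {slowest:.2f}s")
```

Roughly one seventh of the parameters are active at a time. The article reports the speedup and the sparse activation as separate facts and does not say how much of one is explained by the other.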

What stands out

01 Bidirectional infill. Because diffusion denoises a token canvas from both directions, a radiologist can anchor fragments of a report and have the model fill the gaps between them. Autoregressive models generate left to right and handle infill poorly.
02 Speed advantage. The diffusion model decodes 3.5 to 4.4 times faster than the autoregressive Gemma-4-26B, a meaningful gain for interactive drafting workflows.
03 Medical VQA parity. Diffusion matched or beat autoregressive performance on every medical visual question answering dataset tested, using the same LoRA fine-tuning recipe and model size.
04 Sparse activation. The 26-billion-parameter mixture-of-experts architecture activates only 3.8 billion parameters per forward pass, keeping inference costs low while maintaining competitive accuracy.
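The anchored-infill workflow described in the list can be illustrated with a toy sketch. It is not the preprint's decoder: `infill`, `toy_propose`, and `REFERENCE` are hypothetical names, and the "model" here simply looks tokens up in a fixed sentence and reports higher confidence when more neighbours are already known. What it does show is the control flow that left-to-right generation lacks: fixed anchors stay in place, every masked slot is scored against context on both sides, and the most confident slots are committed first.

```python
from typing import Callable, List, Optional, Tuple

MASK = None  # marks a position that has not been decoded yet
Canvas = List[Optional[str]]
# A proposer sees the whole canvas (left and right context) and returns
# (token, confidence) for one masked position. Hypothetical stand-in for a model.
Proposer = Callable[[Canvas, int], Tuple[str, float]]


def infill(canvas: Canvas, propose: Proposer, per_step: int = 2) -> Canvas:
    """Fill masked slots over several steps, most confident slots first."""
    canvas = list(canvas)  # work on a copy; anchors are never overwritten
    while any(tok is MASK for tok in canvas):
        masked = [i for i, tok in enumerate(canvas) if tok is MASK]
        # Score every masked slot against the current state of the canvas.
        proposals = [(i, *propose(canvas, i)) for i in masked]
        proposals.sort(key=lambda p: p[2], reverse=True)
        # Commit only the top slots; the rest are re-scored next step.
        for i, tok, _ in proposals[:per_step]:
            canvas[i] = tok
    return canvas


# ASSUMPTION: a fixed sentence standing in for what a trained model would predict.
REFERENCE = "no focal consolidation . heart size is normal .".split()


def toy_propose(canvas: Canvas, i: int) -> Tuple[str, float]:
    """Toy proposer: confidence grows with the number of known neighbours."""
    neighbours = [j for j in (i - 1, i + 1) if 0 <= j < len(canvas)]
    known = sum(canvas[j] is not MASK for j in neighbours)
    return REFERENCE[i], known / 2


# The radiologist anchors a few fragments; the gaps between them are masked.
canvas = ["no", MASK, MASK, ".", MASK, MASK, MASK, "normal", "."]
filled = infill(canvas, toy_propose)
print(" ".join(filled))
```

Slots adjacent to an anchor are filled first and the fill then spreads inward from both ends of each gap, which is the behaviour the first list item attributes to denoising a canvas from both directions.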
