Gemma 3 12B abliterated multimodal merge debuts on HuggingFace
A new uncensored vision-language model combining Gemma 3 12B Instruct and GLM-4.7 Flash reasoning was published on HuggingFace on May 19, 2026.
HuggingFace user yzkbss released an abliterated multimodal merge combining Google's Gemma 3 12B Instruct with Zhipu AI's GLM-4.7 Flash reasoning architecture. The model, tagged as uncensored and abliterated, ships in safetensors format for local inference without server-side safety enforcement. It handles image-text-to-text pipelines and carries an Unsloth tag, which suggests compatibility with memory-efficient fine-tuning on consumer hardware.
Abliteration strips refusal behavior from instruction-tuned models, typically by identifying the activation direction associated with refusals and removing it from the weights. The practice has accelerated as open weights from major labs become more widely available. This merge pairs Gemma 3's instruction-following at 12 billion parameters with GLM-4.7 Flash's lightweight reasoning layers designed for chain-of-thought tasks. The model card does not describe the merge method, and it lists no benchmark scores, context-window length, or quantization details. At upload, the checkpoint had zero downloads and zero likes.
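Because the model card leaves out quantization details, a reader who downloads the checkpoint can at least check what the weight files themselves declare. A safetensors file begins with an 8-byte little-endian length followed by a JSON header listing each tensor's dtype and shape. The sketch below reads only that header using the Python standard library. The function names `read_safetensors_header` and `summarize` are illustrative, not part of any library, and the sketch has not been run against this particular checkpoint.

```python
import json
import struct
from collections import Counter
from math import prod


def read_safetensors_header(path):
    """Return the JSON header of a .safetensors file without loading tensor data."""
    with open(path, "rb") as f:
        # First 8 bytes: little-endian unsigned 64-bit length of the JSON header.
        (header_len,) = struct.unpack("<Q", f.read(8))
        return json.loads(f.read(header_len).decode("utf-8"))


def summarize(header):
    """Count parameters per dtype, skipping the optional metadata entry."""
    params_by_dtype = Counter()
    for name, info in header.items():
        if name == "__metadata__":
            continue  # free-form string metadata, not a tensor
        # An empty shape denotes a scalar; prod([]) is 1.
        params_by_dtype[info["dtype"]] += prod(info["shape"])
    return dict(params_by_dtype)
```

Calling `summarize(read_safetensors_header(path))` on each downloaded shard and adding up the counts gives a rough parameter total per dtype. That is enough to tell, for example, whether the weights are stored in 16-bit floating point or a lower-precision format.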
