Gemma 4 12B abliterated checkpoints surface on HuggingFace within 48 hours
Three uncensored variants of Google's Gemma 4 12B multimodal model appeared on HuggingFace this week, tagged abliterated and decensored, with GGUF quantizations for local inference.
Uncensored versions of Google's Gemma 4 12B multimodal model began appearing on HuggingFace within days of the official release. Three abliterated checkpoints are now live: zaakirio/gemma-4-12b-it-uncensored, which has a companion GGUF repository at zaakirio/gemma-4-12b-it-uncensored-GGUF, and two variants from RumiaChannel (ara-Refusals2 and ara-Refusals4). All are tagged "abliterated," "uncensored," or "decensored."
The base Gemma 4 12B is a 12-billion-parameter multimodal model that handles text, images, and audio in a unified architecture. Google released it under Apache 2.0 and designed it to run on consumer hardware with 16 GB of VRAM or unified memory. The model works with Ollama, LM Studio, llama.cpp, MLX, vLLM, SGLang, and Unsloth.
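To put the 16 GB figure in context, here is a back-of-the-envelope estimate of weight storage for 12 billion parameters at a few nominal precisions. It is a rough sketch, not a measurement: the bits-per-weight values are assumptions, real quantization formats add per-block overhead, and the KV cache and activations are ignored entirely.

```python
# Rough weight-only memory estimate for a 12B-parameter model.
# Assumptions: nominal bits per weight; no quantization metadata,
# no KV cache, no activation memory.

PARAMS = 12e9  # 12 billion parameters, as stated in the article


def weight_gb(params: float, bits_per_weight: float) -> float:
    """Approximate weight storage in gigabytes (10^9 bytes)."""
    return params * bits_per_weight / 8 / 1e9


for label, bits in [("16-bit", 16), ("8-bit", 8), ("4-bit", 4)]:
    print(f"{label}: ~{weight_gb(PARAMS, bits):.0f} GB")
```

By this arithmetic, 16-bit weights alone come to roughly 24 GB, while 8-bit and 4-bit weights come to roughly 12 GB and 6 GB. That is one reason quantized builds matter for anyone targeting a 16 GB machine.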
What stands out
- Multiple abliteration attempts within 48 hours. zaakirio's safetensors and GGUF versions appeared June 4, followed by RumiaChannel's Refusals2 and Refusals4 variants on June 5. The rapid turnaround suggests that existing abliteration methods applied to the official weights without much extra work.
- GGUF quantizations ship alongside full-precision checkpoints. zaakirio's GGUF repo has four likes despite zero reported downloads, a small sign of early interest in quantized inference. GGUF support means llama.cpp compatibility out of the gate.
- Multimodal abliteration is still rare. Most uncensored releases target text-only LLMs. Gemma 4's unified image-text-audio architecture makes these among the first widely available abliterated multimodal weights at this parameter scale.
- Apache 2.0 license carries through. The abliterated variants inherit the permissive Apache 2.0 terms from the base model, meaning no additional restrictions on commercial use or redistribution.
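For readers unfamiliar with the term, abliteration is commonly described as directional ablation: estimate a "refusal direction" from the difference in mean activations between prompts the model refuses and prompts it answers, then project that direction out of weight matrices that write into the residual stream. The sketch below illustrates that idea on toy numbers in pure Python. The repositories above do not document their exact procedure here, so this is an assumption about the general technique, not a description of what zaakirio or RumiaChannel did. All names and values (`harmful_acts`, `harmless_acts`, `W`) are hypothetical.

```python
import math


def mean_vector(rows):
    """Column-wise mean of a list of equal-length vectors."""
    n = len(rows)
    return [sum(col) / n for col in zip(*rows)]


def refusal_direction(harmful_acts, harmless_acts):
    """Unit vector along the difference of the two mean activations."""
    diff = [h - b for h, b in zip(mean_vector(harmful_acts),
                                  mean_vector(harmless_acts))]
    norm = math.sqrt(sum(x * x for x in diff))
    return [x / norm for x in diff]


def ablate_direction(weight, direction):
    """Return W - r (r^T W), removing the output component along r.

    weight: list of rows, shape (d_out, d_in); direction: length d_out.
    """
    d_out, d_in = len(weight), len(weight[0])
    # proj[j] = sum_i r[i] * W[i][j], i.e. the row vector r^T W
    proj = [sum(direction[i] * weight[i][j] for i in range(d_out))
            for j in range(d_in)]
    return [[weight[i][j] - direction[i] * proj[j] for j in range(d_in)]
            for i in range(d_out)]


# Toy, made-up activations (3-dimensional "residual stream").
harmful_acts = [[2.0, 0.0, 1.0], [4.0, 0.0, 1.0]]
harmless_acts = [[0.0, 0.0, 1.0], [0.0, 0.0, 1.0]]
r = refusal_direction(harmful_acts, harmless_acts)

# Toy weight matrix writing into that 3-dimensional space.
W = [[1.0, 2.0], [3.0, 4.0], [5.0, 6.0]]
W_ablated = ablate_direction(W, r)

# After ablation, the matrix can no longer write along r.
residual = [sum(r[i] * W_ablated[i][j] for i in range(3)) for j in range(2)]
print(r, W_ablated, residual)
```

In a real model the activations would come from forward passes over two prompt sets, and the projection would be applied to many matrices across layers. The toy version only shows that the edited matrix has no remaining component along the estimated direction.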




