Gemma 4 12B uncensored merge optimized for AMD ROCm drops on Hugging Face
A quantized 12B-parameter Gemma 4 merge combining agentic, creative-writing, and uncensored fine-tunes has been released in GGUF format, optimized for AMD ROCm hardware.
rcmorano has released llmfan46-gemma-4-12B-agentic-fable5-composer2.5-v2-3.5x-tau2-uncensored-heretic-ROCMFPX, a 12-billion-parameter merge based on Google's Gemma 4 architecture, quantized to GGUF format and optimized for AMD ROCm-compatible GPUs. The model combines task-oriented and creative-writing fine-tunes with abliteration (removal of refusal behavior) in a single checkpoint that targets Radeon hardware rather than NVIDIA CUDA stacks.
The merge combines three fine-tune dimensions: agentic task execution and multi-step reasoning, creative writing and story generation via the fable5 and composer2.5 layers, and unrestricted instruction following. The "heretic" tag signals removal of refusal behavior, a common pattern in the open-weight community where practitioners strip safety guardrails to enable unrestricted prompting. The "3.5x-tau2" notation likely refers to a merge ratio or temperature parameter, though the model card does not detail the combination recipe.
GGUF quantization makes the 12B checkpoint runnable on consumer GPUs with 16–24GB VRAM, a practical threshold for hobbyists and small studios. The ROCm build is notable in a landscape dominated by CUDA tooling: AMD's open-source compute stack has gained traction among open-weight practitioners seeking alternatives to NVIDIA's closed ecosystem, particularly as Radeon VII, RX 6000, and RX 7000 series cards offer competitive VRAM-per-dollar ratios for inference workloads. The model card lists conversational use as the primary application, suggesting chat and assistant scenarios rather than pure code or reasoning tasks. At release, the model had logged zero downloads and one like, which is typical for fresh uploads in a crowded registry where discovery depends on word-of-mouth and benchmark leaderboards.
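To see why quantization matters for the 16–24GB VRAM range mentioned above, here is a rough back-of-the-envelope sketch of the weight footprint of a 12-billion-parameter model at a few bit widths. The release does not state which quantization types it ships, so the three formats below are illustrative assumptions; the figures cover weights only and ignore the KV cache, activations, and runtime overhead, all of which add to real memory use.

```python
# Rough weight-memory estimate for a 12B-parameter model.
# Assumption: the quant types listed here are illustrative; the source
# does not say which ones this particular release provides.

PARAMS = 12e9  # parameter count given in the article

# Bits per weight. Q8_0 and Q4_0 store blocks of 32 weights plus one
# 16-bit scale, which adds 16/32 = 0.5 bits per weight.
BITS_PER_WEIGHT = {
    "F16": 16.0,
    "Q8_0": 8.5,
    "Q4_0": 4.5,
}


def weight_gib(params: float, bits_per_weight: float) -> float:
    """Return the approximate size of the weights in GiB."""
    return params * bits_per_weight / 8 / 2**30


if __name__ == "__main__":
    for name, bpw in BITS_PER_WEIGHT.items():
        print(f"{name}: ~{weight_gib(PARAMS, bpw):.1f} GiB of weights")
```

Under these assumptions, the unquantized 16-bit weights alone would come close to filling a 24GB card, while the 8-bit and 4-bit variants leave room for context on 16GB hardware.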





