Arsenic-Shahrazad 12B v4.1 GGUF quantizations land on HuggingFace
mradermacher published GGUF quantizations of lambent/arsenic-shahrazad-12b-v4.1 under CC-BY-NC-4.0, making the uncensored 12-billion-parameter model runnable on consumer hardware.
Arsenic-Shahrazad 12B v4.1, an uncensored language model from lambent, is now available in GGUF format on HuggingFace. The quantized weights carry a CC-BY-NC-4.0 license, permitting non-commercial use and modification. At 12 billion parameters, the model fits comfortably on mid-range consumer GPUs: a 4-bit quant of a 12B model weighs roughly 7 GB on disk, so a 16GB VRAM setup can hold the weights plus a generous context cache without strain.
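The sizing intuition is simple arithmetic: on-disk size is roughly parameter count times effective bits per weight, plus a small overhead for metadata. A minimal sketch of that estimate follows; the bits-per-weight figures are approximate averages for common llama.cpp quant tiers, not values published in this repo.

```python
# Back-of-the-envelope GGUF size estimate: params * bits-per-weight / 8.
# The effective bits-per-weight values below are approximate averages
# for common llama.cpp quant tiers, not figures from this specific repo.
QUANT_BPW = {
    "Q8_0": 8.5,
    "Q6_K": 6.6,
    "Q5_K_M": 5.7,
    "Q4_K_M": 4.8,
    "Q3_K_M": 3.9,
    "Q2_K": 3.3,
}

def estimate_gguf_gb(n_params: float, quant: str) -> float:
    """Approximate on-disk size in GB for a given quant tier."""
    return n_params * QUANT_BPW[quant] / 8 / 1e9

for tier in QUANT_BPW:
    print(f"{tier:>7}: ~{estimate_gguf_gb(12e9, tier):.1f} GB")
```

By this estimate, a Q4_K_M of a 12B model lands around 7 GB, leaving headroom on a 16GB card for the KV cache and activations.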
The base model is tagged not-for-all-audiences, signaling no built-in content filtering. Practitioners running llama.cpp, Ollama, or LM Studio can drop the GGUF files straight into their workflows without additional conversion. The quantizer's repo lists multiple precision tiers, so users can trade output quality against file size and inference speed depending on their hardware. English is the primary language, and the quants can be served behind an OpenAI-compatible endpoint via llama.cpp's built-in server.
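As an illustration of the download-and-run workflow, here is a minimal sketch using huggingface_hub and llama-cpp-python. The repo id and quant filename follow mradermacher's usual naming convention and are assumptions; verify them against the repo's actual file listing before running.

```python
# Minimal local-inference sketch using llama-cpp-python.
# The repo id and filename below are assumptions based on typical
# naming conventions; check the repo's file listing before running.
from huggingface_hub import hf_hub_download
from llama_cpp import Llama

model_path = hf_hub_download(
    repo_id="mradermacher/arsenic-shahrazad-12b-v4.1-GGUF",  # assumed repo id
    filename="arsenic-shahrazad-12b-v4.1.Q4_K_M.gguf",       # assumed quant file
)

llm = Llama(
    model_path=model_path,
    n_ctx=4096,       # context window; raise it if your VRAM allows
    n_gpu_layers=-1,  # offload all layers to the GPU
)

out = llm.create_chat_completion(
    messages=[{"role": "user", "content": "Tell me a story."}],
    max_tokens=256,
)
print(out["choices"][0]["message"]["content"])
```

Swapping the filename for a lower tier like Q2_K shrinks the download at the cost of output quality; recent llama-cpp-python versions can also pick up the chat template from the GGUF metadata, so no manual prompt formatting should be needed.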
GGUF has become the de facto distribution format for local LLM inference, powering everything from terminal chat clients to graphical front-ends like Oobabooga and KoboldCpp. For practitioners who prefer not to wrestle with safetensors-to-GGUF conversion scripts, mradermacher's quantization work removes a friction point: download, point your inference engine at the file, and run. The Shahrazad lineage has circulated in uncensored-AI circles for several months, though public documentation on the v4.1 changes remains sparse. The model card does not include a changelog, benchmark comparisons, or training details beyond the base model reference. Community feedback on whether the v4.1 iteration addresses quirks from earlier Shahrazad releases will likely surface over the coming weeks; a formal changelog from lambent would clarify whether this is a minor patch or a more substantial retrain.
