Arsenic-Shahrazad-12B v4.3 GGUF quantization now available for local inference
mradermacher published GGUF-quantized weights for Arsenic-Shahrazad-12B v4.3, an uncensored 12-billion-parameter model under CC-BY-NC-4.0.
Arsenic-Shahrazad-12B v4.3 is now available in GGUF format on HuggingFace, quantized by mradermacher from lambent's base weights. The 12-billion-parameter model carries a CC-BY-NC-4.0 license and is tagged not-for-all-audiences, signaling unrestricted output capability. GGUF quantization makes the weights compatible with llama.cpp, Ollama, and other CPU/GPU inference engines that run locally without API safety layers.
At 12 billion parameters, Arsenic-Shahrazad v4.3 sits in the range practitioners typically run on consumer GPUs with 16–24 GB VRAM or on CPU with quantized precision. GGUF's quantization schemes—ranging from Q2_K to Q8_0—let users trade accuracy for memory footprint depending on hardware constraints. CC-BY-NC-4.0 permits non-commercial use, modification, and redistribution with attribution. The model supports English and is endpoints-compatible, meaning it can be served via standard inference APIs once loaded.
