WizardLM-13B-Uncensored GGUF quantization enables local 13B inference on consumer GPUs
Iambackup published GGUF-quantized weights for WizardLM-13B-Uncensored, a 13-billion-parameter model fine-tuned on unfiltered instruction data, enabling local deployment without safety filters.
WizardLM-13B-Uncensored-i1-GGUF is a quantized checkpoint that packages the WizardLM-13B-Uncensored model in GGUF format for local inference. The base model, from quixiai, was fine-tuned on the 70,000-sample WizardLM Alpaca Evol Instruct dataset without content filtering. GGUF quantization compresses the 13-billion-parameter weights into smaller files that run on consumer hardware with llama.cpp, Ollama, and similar runtimes. At 4-bit precision, the weights of a 13B model occupy roughly 6.5 to 8 GB depending on the quantization variant, so the model can fit on an 8 GB gaming GPU or high-end laptop at short context lengths or with a few layers offloaded to the CPU; longer contexts need additional memory for the KV cache.
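The memory figure above can be sanity-checked with back-of-envelope arithmetic. The sketch below is an estimate, not a measurement of this checkpoint. It assumes a nominal 13e9 parameters, the standard LLaMA-13B shape (40 layers, hidden size 5120), a 2,048-token context, an fp16 KV cache, and about 4.5 bits per weight for a basic 4-bit GGUF type (4-bit values plus a per-block scale). The function names are illustrative, and real runtimes add compute buffers on top of these numbers.

```python
GIB = 2**30  # bytes per GiB


def estimate_weight_gib(n_params: float, bits_per_weight: float) -> float:
    """Approximate size of the quantized weights in GiB."""
    return n_params * bits_per_weight / 8 / GIB


def estimate_kv_cache_gib(n_layers: int, n_ctx: int, hidden_size: int,
                          bytes_per_value: int = 2) -> float:
    """Approximate KV cache size in GiB.

    Each layer stores one key and one value tensor of shape
    (n_ctx, hidden_size); bytes_per_value=2 corresponds to fp16.
    """
    return 2 * n_layers * n_ctx * hidden_size * bytes_per_value / GIB


if __name__ == "__main__":
    # Assumed values (not taken from the model card): nominal parameter
    # count, LLaMA-13B layer count and hidden size, 2048-token context.
    weights = estimate_weight_gib(13e9, 4.5)
    kv_cache = estimate_kv_cache_gib(n_layers=40, n_ctx=2048, hidden_size=5120)
    print(f"weights:  {weights:.2f} GiB")
    print(f"KV cache: {kv_cache:.2f} GiB")
    print(f"total:    {weights + kv_cache:.2f} GiB (before runtime buffers)")
```

Under these assumptions the weights alone stay below 8 GiB, but a full-length KV cache pushes the total past it, which is why a shorter context or partial CPU offload is often needed on 8 GB cards.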
The model card lists an "other" license, so users should verify redistribution and commercial-use terms before deployment. The uncensored tag means the model responds to prompts without built-in safety guardrails, a common configuration for practitioners running local LLMs in unrestricted environments. The WizardLM series originally gained attention for instruction-following capability trained through evolutionary prompting; uncensored variants omit the alignment-oriented training examples, such as refusals and moralizing responses, that teach a model to decline certain requests. Users running this checkpoint should expect responses that reflect that training data, and because inference runs locally, no hosted content moderation sits in front of the model.







