LeeChan-Uncensored GGUF quantized weights debut on HuggingFace

mradermacher published GGUF-quantized weights for LeeChan-Uncensored, an English conversational model with no safety filters, enabling local CPU and low-VRAM inference.

May 19, 2026

LeeChan-Uncensored GGUF quantized weights debut on HuggingFace

LeeChan-Uncensored, an English conversational model from leechanrx, now has GGUF-quantized weights available on HuggingFace courtesy of mradermacher. GGUF quantization compresses the original weights into smaller files optimized for llama.cpp, Ollama, and other local inference engines, making the model practical for consumer hardware—CPU-only setups included.

The model runs without safety restrictions. The "uncensored" label typically means the model was trained or fine-tuned without refusal phrases, content filters, or alignment guardrails—users can prompt it on any topic without automatic rejection. That makes it a fit for creative writing, role-play, research into model behavior, or any workflow where safety-tuned responses get in the way. The HuggingFace repository tags it as transformers-compatible and conversational, suggesting it's built for chat-style prompts rather than completion or instruction-following tasks.

At publication, the repository showed zero downloads and zero likes, so community feedback on quality, coherence, and actual refusal behavior isn't yet available. The model card doesn't list parameter count, context length, or benchmark numbers—details that would help practitioners choose the right quantization tier. The next step is to pull a mid-range quant (Q5_K_M or Q6_K), run a few test prompts, and report back on coherence and refusal rate. If the base model is small enough for 16 GB VRAM or 32 GB system RAM, it could fill a niche for users who want uncensored local inference without the compute overhead of 70B-class models.

More in Releases