FLUX.2 Klein 9B GGUF quantizations drop on HuggingFace in unrestricted variant

Quantized GGUF weights for Black Forest Labs' FLUX.2 Klein 9B model appear on HuggingFace in an unrestricted NSFW variant, enabling local inference on consumer hardware.

By Alex Sokoloff · June 20, 2026

xPhoenix777 uploaded GGUF-quantized weights for FLUX.2 Klein 9B to HuggingFace on June 19, tagged as unrestricted and marked not-for-all-audiences. The conversions derive from Black Forest Labs' open-weight FLUX.2 Klein 9B base model, which has 9 billion parameters and runs on mid-tier GPUs. GGUF quantization stores the weights at lower precision, typically 8-bit, 6-bit, or 4-bit, so they fit tighter VRAM budgets than the full 16-bit checkpoint requires. The model card lists multiple quantization levels bundled in the same upload.
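To put rough numbers on those VRAM budgets, the arithmetic can be sketched in a few lines of Python. The bits-per-weight figures come from the block layouts of common GGUF tensor types; the list of types is an assumption, since the article does not say which levels the upload contains. The sketch also assumes all 9 billion parameters are quantized, which real conversions rarely do, and it ignores the text encoder, VAE, and activation memory, so treat the results as ballpark estimates.

```python
# Approximate bits per weight for a few common GGUF tensor types.
# Q8_0: 32 int8 values plus one fp16 scale per block (34 bytes / 32 weights).
# Q4_0: 32 4-bit values plus one fp16 scale per block (18 bytes / 32 weights).
# Q6_K: 210-byte super-blocks covering 256 weights.
# Assumption: these types are illustrative, not the upload's confirmed contents.
BITS_PER_WEIGHT = {
    "F16": 16.0,
    "Q8_0": 34 * 8 / 32,    # 8.5
    "Q6_K": 210 * 8 / 256,  # 6.5625
    "Q4_0": 18 * 8 / 32,    # 4.5
}


def weight_footprint_gb(n_params: float, quant: str) -> float:
    """Rough size of the weights in gigabytes (10**9 bytes)."""
    return n_params * BITS_PER_WEIGHT[quant] / 8 / 1e9


if __name__ == "__main__":
    n_params = 9e9  # assumption: every one of the 9B parameters is quantized
    for quant in BITS_PER_WEIGHT:
        print(f"{quant:>5}: {weight_footprint_gb(n_params, quant):5.2f} GB")
```

Under these assumptions the 16-bit weights alone come to about 18 GB, while a 4-bit variant lands near 5 GB, which is the gap that makes consumer cards viable.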

FLUX.2 Klein is Black Forest Labs' smaller sibling to the flagship FLUX.2 models, designed for faster iteration and lower hardware requirements while preserving much of the larger models' prompt adherence and detail. Open-weight releases like Klein let practitioners fine-tune or prompt around safety filters, and GGUF conversions extend that reach to users running GGUF-aware inference stacks such as stable-diffusion.cpp or ComfyUI nodes that accept quantized checkpoints. The NSFW tag suggests that the uploader removed or bypassed any content moderation layers present in the base release, a common move in the uncensored-model community.
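Before pointing any loader at a downloaded file, it can help to confirm that it really is a GGUF container. The sketch below parses only the fixed-size header that GGUF versions 2 and 3 place at the start of the file: a 4-byte magic, a 32-bit version, then 64-bit tensor and metadata counts, all little-endian. The function name and the returned dictionary keys are illustrative choices, not part of any library.

```python
import struct

GGUF_MAGIC = b"GGUF"
HEADER_SIZE = 24  # 4-byte magic + uint32 version + two uint64 counts


def read_gguf_header(data: bytes) -> dict:
    """Parse the fixed GGUF header (layout of versions 2 and 3).

    Version 1 files used 32-bit counts and are not handled here.
    """
    if len(data) < HEADER_SIZE:
        raise ValueError("need at least 24 bytes of header")
    if data[:4] != GGUF_MAGIC:
        raise ValueError("not a GGUF file: bad magic")
    version, tensor_count, kv_count = struct.unpack_from("<IQQ", data, 4)
    return {
        "version": version,
        "tensor_count": tensor_count,
        "metadata_kv_count": kv_count,
    }


def read_gguf_header_from_file(path: str) -> dict:
    """Read just the first 24 bytes of a file and parse them."""
    with open(path, "rb") as f:
        return read_gguf_header(f.read(HEADER_SIZE))
```

Calling read_gguf_header_from_file on a local path (any filename here would be hypothetical) returns the version and counts without loading the multi-gigabyte tensor data.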

Black Forest Labs' license for FLUX.2 Klein 9B permits derivative works, subject to its usage terms. The repo showed zero downloads at publication, suggesting a fresh upload. Community benchmarks comparing quant quality across bit depths should follow, and ComfyUI workflow authors are likely to wire these weights into existing FLUX pipelines: with a GGUF-aware loader node, the quantized files can stand in wherever the base 16-bit Klein checkpoint already runs.
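For a feel of what a bit-depth comparison measures, here is a deliberately simplified sketch: symmetric per-block quantization with one scale per 32 values, followed by dequantization and an RMS error check. It is not the exact GGUF storage scheme (real K-quants use nested scales and other refinements), and the random Gaussian values are a stand-in for real weights, so only the trend is meaningful.

```python
import random


def quantize_roundtrip(values, bits, block_size=32):
    """Quantize each block to signed `bits`-bit integers, then dequantize."""
    qmax = 2 ** (bits - 1) - 1  # 127 for 8-bit, 31 for 6-bit, 7 for 4-bit
    restored = []
    for start in range(0, len(values), block_size):
        block = values[start:start + block_size]
        amax = max(abs(v) for v in block)
        scale = amax / qmax if amax > 0 else 1.0  # one scale per block
        restored.extend(round(v / scale) * scale for v in block)
    return restored


def rms_error(original, restored):
    """Root-mean-square difference between two equal-length sequences."""
    total = sum((a - b) ** 2 for a, b in zip(original, restored))
    return (total / len(original)) ** 0.5


if __name__ == "__main__":
    rng = random.Random(0)
    # Assumption: synthetic Gaussian values standing in for model weights.
    weights = [rng.gauss(0.0, 0.02) for _ in range(4096)]
    for bits in (8, 6, 4):
        err = rms_error(weights, quantize_roundtrip(weights, bits))
        print(f"{bits}-bit RMS error: {err:.6f}")
```

The error grows as the bit depth shrinks, which is why image-quality comparisons across quant levels matter more than file size alone.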
