Drael v1.1 32B Qwen3VL uncensored GGUF quantized for local inference
A quantized GGUF version of the Drael v1.1 32B multimodal model, built on the Qwen3VL architecture, is now available on HuggingFace, enabling uncensored local deployment on consumer hardware.
Drael v1.1 32B Qwen3VL uncensored is a multimodal model from kyozen-sys; the GGUF quantized files on HuggingFace were packaged by mradermacher. The 32-billion-parameter model ships without safety restrictions, and the quantized files are aimed at practitioners who need vision-language capabilities on consumer GPUs.
Quantizing the weights into GGUF shrinks the memory the model needs, which is what lets a 32B model run on machines with limited VRAM. The base model is tagged for conversational use and for English, though multimodal models in this family typically handle both text and image inputs.
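As a rough illustration of why quantization matters at this size, the sketch below estimates the memory taken by the weights alone at a few bit widths. The bits-per-weight figures are the nominal block sizes of llama.cpp's Q4_0, Q6_K and Q8_0 types plus unquantized F16; which levels the repository actually offers is not stated here, so the list is an assumption, and real files differ somewhat because individual tensors may use different types. The estimate also ignores the KV cache, the vision components, and runtime overhead.

```python
# Back-of-the-envelope estimate of weight memory for a quantized model.
# Assumption: every weight is stored at the nominal bits-per-weight of one type.

NOMINAL_BITS_PER_WEIGHT = {
    "Q4_0": 4.5,     # 32 weights in 18 bytes
    "Q6_K": 6.5625,  # 256 weights in 210 bytes
    "Q8_0": 8.5,     # 32 weights in 34 bytes
    "F16": 16.0,     # unquantized half precision
}


def estimate_weight_gib(n_params: float, bits_per_weight: float) -> float:
    """Return the approximate size of the weights in GiB."""
    total_bytes = n_params * bits_per_weight / 8
    return total_bytes / 2**30


if __name__ == "__main__":
    n_params = 32e9  # 32 billion parameters, as stated for this model
    for name, bpw in NOMINAL_BITS_PER_WEIGHT.items():
        print(f"{name}: ~{estimate_weight_gib(n_params, bpw):.1f} GiB")
```

Comparing the result against available VRAM gives a first guess at which quantization level is worth downloading; anything that does not fit has to be partly offloaded to system RAM, at a cost in speed.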
Quantization and deployment
The HuggingFace repository hosts multiple quantization levels, letting users trade model quality against memory footprint. GGUF files are compatible with llama.cpp and other local inference engines that support the format. The repository is tagged as endpoints-compatible and is hosted in the US region.
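Before pointing an inference engine at a multi-gigabyte download, it can be worth checking that the file really is a GGUF container. The sketch below reads the fixed-size header: the 4-byte magic `GGUF`, a little-endian 32-bit version, then 64-bit tensor and metadata counts. The function name `read_gguf_header` is hypothetical, and the layout assumed is GGUF version 2 or later (version 1 used 32-bit counts).

```python
import struct


def read_gguf_header(path: str) -> dict:
    """Read the fixed GGUF header fields from a file (assumes GGUF v2+)."""
    with open(path, "rb") as f:
        head = f.read(24)  # 4-byte magic + uint32 + two uint64 values

    if len(head) < 24 or head[:4] != b"GGUF":
        raise ValueError(f"{path} does not look like a GGUF file")

    # "<" selects little-endian with no padding: uint32, uint64, uint64.
    version, tensor_count, metadata_kv_count = struct.unpack("<IQQ", head[4:24])
    return {
        "version": version,
        "tensor_count": tensor_count,
        "metadata_kv_count": metadata_kv_count,
    }
```

A truncated or mislabeled download fails here immediately with a `ValueError`, instead of failing later inside the inference engine.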
The uncensored designation means the model has no built-in content filtering, a common configuration for open-weight releases intended for research and local deployment. Released on May 19, 2026, the model is available for immediate download.
