Qwopus3.5-122B abliterated weights land in three formats on HuggingFace
OpenYourMind released uncensored weights for a 122B Qwen3.5 mixture-of-experts model this week, shipping safetensors, GGUF, and MLX-4bit variants within hours.
A 122-billion-parameter abliterated Qwen3.5 mixture-of-experts model landed on HuggingFace on May 13, packaged for three different inference stacks. OpenYourMind's Qwopus3.5-122B-A10B-abliterated-uncensored is a multimodal model—it accepts image-text-to-text inputs—and the creator shipped safetensors, GGUF, and MLX-4bit variants within a four-hour window.
The base safetensors checkpoint supports transformers pipelines and carries the full 122B parameter count across the mixture-of-experts architecture. The GGUF variant strips the vision tower and runs text-only, targeting llama.cpp and koboldcpp users who need CPU or mixed offload. The MLX-4bit build keeps multimodal capability and quantizes to 4-bit for Apple Silicon inference.
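The hardware-to-format mapping described above can be sketched as a small selection helper. The capability flags and the `pick_variant` function are illustrative assumptions for this sketch, not published repo metadata:

```python
# Hypothetical sketch of the release's three-format split.
# Capability flags mirror the article's description; nothing here is
# read from the actual repos.
VARIANTS = {
    "safetensors": {"stack": "transformers", "multimodal": True,  "quant": "full"},
    "gguf":        {"stack": "llama.cpp",    "multimodal": False, "quant": "mixed"},
    "mlx-4bit":    {"stack": "mlx",          "multimodal": True,  "quant": "4-bit"},
}

def pick_variant(has_gpu_cluster: bool, is_apple_silicon: bool, needs_vision: bool) -> str:
    """Choose a variant following the article's hardware mapping."""
    if is_apple_silicon:
        return "mlx-4bit"        # Apple Silicon: 4-bit MLX, keeps the vision tower
    if has_gpu_cluster and needs_vision:
        return "safetensors"     # full checkpoint for transformers pipelines
    return "gguf"                # CPU or mixed offload; text-only

print(pick_variant(False, True, True))   # mlx-4bit
```

The notable constraint is that only the GGUF path gives up multimodality; anyone needing vision on commodity CPUs has no matching variant in this release.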
Abliteration is a weight-editing technique that removes refusal behavior without full retraining, typically by identifying a "refusal direction" in the model's activations and orthogonalizing the weights against it. The model cards tag the release as "abliterated" and "uncensored," signaling that refusals have been edited out of the weights themselves rather than suppressed by a serving-time safety layer. After 24 hours, the GGUF variant had 83 downloads, the base safetensors 23, and the MLX quant 13. All three repos carry the same Qwen3.5 MoE backbone and the same abliterated tuning; the format split lets practitioners pick the inference stack that fits their hardware.
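A minimal sketch of the directional-ablation idea behind abliteration, at toy scale and with an invented random "refusal direction" standing in for one estimated from harmful-vs-harmless prompt activations:

```python
# Toy sketch of directional ablation ("abliteration"), assuming the
# common recipe: find a refusal direction r, then orthogonalize weight
# matrices that write to the residual stream against it.
import numpy as np

rng = np.random.default_rng(0)
d = 8                                  # toy hidden size

# In practice r is the mean activation difference between prompt sets;
# here it is just a random unit vector for illustration.
r = rng.normal(size=d)
r /= np.linalg.norm(r)

W = rng.normal(size=(d, d))            # a matrix writing to the residual stream

# W' = (I - r r^T) W: for any input x, W' x has zero component along r,
# so the edited model can no longer move activations in that direction.
W_abl = W - np.outer(r, r) @ W

x = rng.normal(size=d)
print(float(np.dot(r, W_abl @ x)))     # ~0 up to float error
```

Because the edit is a rank-one projection applied to existing matrices, no gradient updates or retraining passes are needed, which is why abliterated variants of large models appear so quickly after a base release.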
Qwen3.5 from Alibaba's Qwen team has become a popular base for community fine-tunes since the open-weight release earlier this year. Mixture-of-experts architectures route each token through only a subset of parameters, reducing inference cost compared to dense models of similar capability; the A10B suffix follows the convention of denoting active parameters, so roughly 10 of the 122 billion parameters participate in each forward pass, which is what makes the model runnable on consumer hardware when quantized. The three-format strategy (full-precision safetensors for GPU clusters, GGUF for CPU and mixed setups, MLX for Mac users) mirrors the distribution pattern seen with other large open-weight releases, and the GGUF repo's higher download count suggests llama.cpp remains the most common inference backend for models in this weight class.
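The "runnable when quantized" claim comes down to simple arithmetic. A back-of-envelope sketch, using rough figures rather than measurements of this release:

```python
# Back-of-envelope memory estimate for a 122B MoE with ~10B active
# parameters. Figures are rough assumptions (weights only, no KV cache
# or runtime overhead), not measurements of this release.
def weight_gib(params_b: float, bits_per_param: float) -> float:
    """Approximate weight storage in GiB at a given bit width."""
    return params_b * 1e9 * bits_per_param / 8 / 2**30

total_b, active_b = 122, 10            # billions of parameters

print(f"fp16 checkpoint      : {weight_gib(total_b, 16):6.1f} GiB")
print(f"4-bit quantized      : {weight_gib(total_b, 4):6.1f} GiB")
# All weights must be resident (or offloaded), but per-token compute
# only touches the ~10B active parameters, resembling a 10B dense model.
print(f"active set at 4-bit  : {weight_gib(active_b, 4):6.1f} GiB")
```

At 4 bits the full weight set lands near 57 GiB, which is why mixed CPU/GPU offload via GGUF and unified memory on high-end Apple Silicon are the two practical consumer paths, while the fp16 checkpoint at roughly 227 GiB stays in GPU-cluster territory.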
