MiniMax M2.7 abliterated 4-bit MLX weights arrive on HuggingFace
A 4-bit quantized MLX port of MiniMax M2.7 with its safety filters removed appeared on HuggingFace this week, aimed at Apple Silicon users running uncensored inference locally.
MiniMax-M2.7-BF16-ultra-uncensored-heretic-mlx-4Bit, posted to HuggingFace on May 15 by cookietimeh, is a 4-bit quantized MLX conversion of the MiniMax M2.7 language model with abliterated safety guardrails. MLX is Apple's machine learning framework optimized for M-series chips, making this a Mac-native option for practitioners running unrestricted text generation locally.
The release joins a growing catalog of abliterated open-weight models on HuggingFace, where the refusal behavior instilled during safety tuning is surgically removed while instruction-following remains intact. Abliteration typically identifies a "refusal direction" in the residual-stream activations and projects it out of the model's weights, leaving the underlying language capabilities untouched. The "heretic" tag likely nods to Heretic, an open-source tool that automates abliteration, though the model card provides no methodology, training details, or benchmark comparisons.
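For intuition, here is a minimal NumPy sketch of the general directional-ablation recipe. It is illustrative only, not this release's (undocumented) pipeline, and the function names are hypothetical:

```python
import numpy as np

# Sketch of directional ablation ("abliteration"). The refusal direction
# is commonly estimated as the difference between mean residual-stream
# activations on refused vs. complied-with prompts.
def refusal_direction(harmful_acts, harmless_acts):
    # harmful_acts, harmless_acts: (n_prompts, d_model) activation arrays
    d = harmful_acts.mean(axis=0) - harmless_acts.mean(axis=0)
    return d / np.linalg.norm(d)

def orthogonalize(W, v):
    # Remove the component of W's output that writes along the refusal
    # direction v: W' = W - v (v^T W). W has shape (d_model, d_in).
    v = v.reshape(-1, 1)            # (d_model, 1)
    return W - v @ (v.T @ W)        # rank-1 update, all else untouched

# Applied to every matrix that writes into the residual stream (attention
# output and MLP down-projections), the model can no longer express the
# refusal feature, while other directions -- and hence general
# capability -- are left intact.
```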
MiniMax M2.7 is a multilingual model with strong Chinese-language support, developed by the Shanghai-based MiniMax startup. The BF16 designation in the repo name indicates the original weights were in bfloat16 precision before the 4-bit MLX quantization. At 4 bits per parameter, a model in the 7B–20B range needs roughly 4–11 GB for weights alone, which fits comfortably in the unified memory of an M1 Max or higher and makes this port accessible to users with 32GB+ machines.
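The back-of-envelope math, assuming a parameter count the model card does not actually state:

```python
# Weights-only footprint for a 4-bit quantized model, with ~10% headroom
# for the scales and zero-points that group quantization schemes store.
def weight_footprint_gb(n_params_billion, bits_per_param=4, overhead=1.1):
    bytes_total = n_params_billion * 1e9 * bits_per_param / 8
    return bytes_total * overhead / 1e9

for n in (7, 13, 20):
    print(f"{n}B params @ 4-bit ≈ {weight_footprint_gb(n):.1f} GB")
# 7B ≈ 3.9 GB, 13B ≈ 7.2 GB, 20B ≈ 11.0 GB -- well under 32 GB of
# unified memory, leaving room for the KV cache and the OS.
```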
The model card lists the pipeline as "text-generation" and tags the Transformers library, with weights in safetensors format. No usage examples, prompt templates, or sample outputs are included. Practitioners can pull the weights directly via the HuggingFace CLI or MLX community tooling, as sketched below.
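Assuming the repo follows the standard mlx-lm conventions (and that mlx-lm supports the MiniMax architecture, which the card does not confirm), fetching and sampling would look roughly like this; the repo id is assembled from the author and model name given above:

```python
# Untested sketch using mlx-lm's load/generate API (pip install mlx-lm).
# Whether this particular conversion loads cleanly is an assumption.
from mlx_lm import load, generate

model, tokenizer = load(
    "cookietimeh/MiniMax-M2.7-BF16-ultra-uncensored-heretic-mlx-4Bit"
)
text = generate(
    model,
    tokenizer,
    prompt="Explain MLX quantization in one paragraph.",
    max_tokens=200,
)
print(text)
```

The same repo can be fetched ahead of time with `huggingface-cli download`, which caches the safetensors shards locally so the first `load()` call doesn't block on the network.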
