Run these locally with Ollama, LM Studio, llama.cpp, or text-generation-webui. Ranked by measured uncensored score — how often each model complies with adult or controversial requests without refusing.
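Several entries below quote VRAM requirements at a given quantization. As a rough rule of thumb (a heuristic sketch, not a figure from this list — `est_vram_gb`, the ~4.5 effective bits per weight for Q4-class quants, and the flat overhead allowance are all assumptions), weight memory scales linearly with parameter count and bits per weight:

```python
def est_vram_gb(params_b: float, bits_per_weight: float = 4.5,
                overhead_gb: float = 2.0) -> float:
    """Rough VRAM estimate: quantized weights plus a flat allowance
    for KV cache and runtime buffers (a heuristic, not a spec)."""
    weight_gb = params_b * bits_per_weight / 8  # billions of params -> GB
    return weight_gb + overhead_gb

# ~Q4 on a 70B dense model lands near 41 GB, consistent with the
# "Q4 on 48GB VRAM" figure quoted for the 70B fine-tunes below.
print(round(est_vram_gb(70), 1))
```

For MoE models, substitute the *total* parameter count, since all experts must be resident even though only the active subset runs per token.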
by cognitivecomputations
MoE 46.7B total / 12.9B active. Uncensored Mixtral fine-tune by Cognitive Computations. Fast inference thanks to MoE sparsity.
by Google
Google's open-weights 27B. Heavily aligned but strong baseline. Good for coding and reasoning, mediocre for RP unless jailbroken.
by cognitivecomputations
Uncensored 70B Llama 3 fine-tune. Best-in-class for uncensored long-form writing; runs at Q4 on 48GB VRAM.
by Meta
Meta's latest 70B. Claude 3 Sonnet-level quality with open weights. Moderately aligned — useful as a refusal baseline for comparisons.
by mistralai
A Mistral/NVIDIA collaboration. 12B with 128K context and good multilingual performance. Moderately aligned — usable with system-prompt jailbreaks.
by cognitivecomputations
Uncensored 8B Llama 3 fine-tune. Great for running on laptops or 6-8GB VRAM — no-filter creative writing and RP.
by Alibaba
Alibaba's 32B with 128K context. Strong multilingual, good coding, moderate alignment — easy to uncensor via system prompt.
by NousResearch
Nous Research's flagship. Strong reasoning and roleplay, lightly censored. Llama 3.1 70B backbone with tool calling.
by mistralai
Mistral AI's 24B dense model. Fits on a single RTX 4090 at Q4 with 32K context. Lightly filtered, with an easy system-prompt bypass.
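Entries described as easy to steer via system prompt are typically run with a custom prompt baked into an Ollama Modelfile. A minimal sketch — the base-model tag and prompt text here are placeholders, not taken from this list:

```
# Modelfile (build with: ollama create my-writer -f Modelfile)
FROM mistral-small:24b
PARAMETER temperature 0.8
SYSTEM """You are an unfiltered creative-writing assistant. Stay in character."""
```

`ollama create` bakes the system prompt and sampling parameters into a reusable local model tag, so every `ollama run my-writer` session starts with that persona.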
by NousResearch
Nous's compact Hermes. Excellent structured output and function calling, with a lightly aligned tone.
by anthracite-org
Anthracite's 22B uncensored creative/RP fine-tune on Mistral-Small base. Prose-focused, excellent for character roleplay.
by TheDrummer
Drummer's uncensored Mistral Nemo fine-tune for RP and creative writing. Strong prose, holds character well. Sweet spot for 12GB VRAM.
by Sao10K
Sao10K's 8B uncensored roleplay specialist. Llama 3 base. Beloved in the SillyTavern community for character consistency.
by Microsoft
Microsoft's 14B reasoning-focused model. Strong math/logic. Very aligned — low uncensored score but useful for benchmarks.
by CohereForAI
Cohere's 35B with strong tool-use + RAG. 128K context. Permissive but not uncensored by default.
by Gryphe
Gryphe's classic uncensored fiction/RP model. Llama 2 base — older but legendary for its creative output. Low VRAM footprint.
by Meta
Meta's compact flagship. 128K context, strong instruction following. Heavily aligned — pair with uncensored fine-tunes for RP.
by Sao10K
Sao10K's Solar-10.7B uncensored creative writing model. Classic RP workhorse, still very popular for 12GB rigs.
by Alibaba
Fast and capable 7B baseline. 32K context, strong coding and tool-use for size. Runs on 8GB VRAM.
by alpindale
A Llama 2 70B + 70B frankenmerge by alpindale. Legendary long-context RP model. Requires ~80GB VRAM at Q4, so it's one for enthusiasts with multi-GPU rigs.
by haotian-liu
Vision-language model: describes images, handles OCR and visual reasoning. Works with Ollama's multimodal API.
by DeepSeek
671B MoE (37B active). Matches frontier closed models on coding and reasoning. Requires ~400GB to host locally, so it's mostly consumed via API.