JetBrains open-sources Mellum 12B with 2.5B active parameters for coding | UncensoredHub


JetBrains released Mellum 12B A2.5B, a 12-billion-parameter open-weight coding model that uses sparse activation to run only 2.5 billion parameters per token at inference time, cutting compute costs while retaining capacity.

By Alex Sokoloff · June 4, 2026


JetBrains released Mellum 12B A2.5B, an open-weight coding model with 12 billion total parameters, of which only 2.5 billion are active during inference. The sparse-activation design cuts inference compute to roughly that of a 3-billion-parameter dense model while retaining the capacity of a 12B architecture. Weights are available now; a technical report on arXiv details the training approach and evaluation methodology.

The model is JetBrains' first major open-weight language model release. The sparse architecture (likely a mixture-of-experts variant) routes each token through a subset of the full parameter space, reducing per-token compute and latency for local development environments and CI/CD pipelines without sacrificing model depth.
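To make the routing idea concrete, here is a minimal sketch of top-k expert routing, the mechanism commonly used in mixture-of-experts layers. It is an illustration only: the article says the design is "likely" a mixture-of-experts variant, and the expert count, the `top_k` value, and the helper names `route_token` and `moe_layer` are all hypothetical, not details of Mellum.

```python
import math
import random


def softmax(xs):
    """Numerically stable softmax over a list of floats."""
    m = max(xs)
    exps = [math.exp(x - m) for x in xs]
    total = sum(exps)
    return [e / total for e in exps]


def route_token(router_logits, top_k):
    """Return (expert_index, weight) pairs for the top_k highest-scoring experts.

    Weights are renormalized so the selected experts' weights sum to 1.
    """
    probs = softmax(router_logits)
    chosen = sorted(range(len(probs)), key=lambda i: probs[i], reverse=True)[:top_k]
    norm = sum(probs[i] for i in chosen)
    return [(i, probs[i] / norm) for i in chosen]


def moe_layer(x, experts, router_logits, top_k):
    """Mix the outputs of the selected experts only; the others are never called."""
    out = 0.0
    for idx, weight in route_token(router_logits, top_k):
        out += weight * experts[idx](x)
    return out


# Hypothetical toy setup (not Mellum's configuration):
# 8 scalar "experts", 2 of them active per token.
NUM_EXPERTS = 8
TOP_K = 2
experts = [lambda x, s=s: s * x for s in range(1, NUM_EXPERTS + 1)]

random.seed(0)
logits = [random.gauss(0.0, 1.0) for _ in range(NUM_EXPERTS)]
selected = route_token(logits, TOP_K)
y = moe_layer(3.0, experts, logits, TOP_K)
print("selected experts:", [i for i, _ in selected])
print("layer output:", y)
```

In a real model each expert would be a feed-forward block and the router logits would come from a learned projection of the token's hidden state; the point of the sketch is only that the cost per token scales with `top_k`, not with the total number of experts.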

What stands out

01 Sparse activation design: Only 2.5B of 12B parameters activate per token, roughly matching the inference cost of a 3B dense model while preserving 12B-scale capacity for complex coding tasks.
02 Open weights: Full model weights released under open-source terms, enabling local deployment, fine-tuning, and integration into developer tools without API dependency.
03 Coding focus: Positioned as a specialist for code generation and understanding, aligned with JetBrains' IDE ecosystem (IntelliJ, PyCharm, WebStorm) and practical CI/CD workflows.
04 Transparent methodology: The arXiv preprint covers training data composition, tokenization, and evaluation, which is standard disclosure for practitioner-focused releases.
05 Inference efficiency: Viable for resource-constrained environments where latency and memory are bottlenecks, lowering the bar for on-device code assistance.
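The headline numbers can be checked with some back-of-the-envelope arithmetic. The sketch below takes only the 12B total and 2.5B active parameter counts from the article; everything else is an assumption: the rule of thumb of about 2 FLOPs per active parameter per token for a forward pass (which ignores attention cost), and the bytes-per-parameter values used for the storage estimate.

```python
TOTAL_PARAMS = 12e9    # from the article: 12 billion total parameters
ACTIVE_PARAMS = 2.5e9  # from the article: 2.5 billion active per token


def active_fraction(active, total):
    """Share of the parameters that participate in processing one token."""
    return active / total


def forward_flops_per_token(params):
    """Assumed rule of thumb: ~2 FLOPs per parameter per token (attention ignored)."""
    return 2 * params


def weight_memory_gb(params, bytes_per_param):
    """Storage needed for the weights alone, in decimal gigabytes."""
    return params * bytes_per_param / 1e9


frac = active_fraction(ACTIVE_PARAMS, TOTAL_PARAMS)
sparse_flops = forward_flops_per_token(ACTIVE_PARAMS)
dense_flops = forward_flops_per_token(TOTAL_PARAMS)

print(f"active fraction: {frac:.1%}")
print(f"compute vs. a dense 12B model: {dense_flops / sparse_flops:.1f}x lower")
# Assumed precisions: 2 bytes/param (16-bit) and 0.5 bytes/param (4-bit quantized).
print(f"weights at 16-bit: {weight_memory_gb(TOTAL_PARAMS, 2):.0f} GB")
print(f"weights at 4-bit:  {weight_memory_gb(TOTAL_PARAMS, 0.5):.0f} GB")
```

Under these assumptions the compute saving applies per token, while weight storage is still sized by the total parameter count unless experts are offloaded or quantized.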
