JetBrains open-sources Mellum 12B with 2.5B active parameters for coding
JetBrains released Mellum 12B A2.5B, a 12-billion-parameter open-weight coding model that activates only 2.5 billion parameters per token at inference time, cutting compute costs while retaining the capacity of the full model.
JetBrains released Mellum 12B A2.5B, an open-weight coding model with 12 billion total parameters, of which only 2.5 billion are active during inference. The sparse-activation design cuts inference compute to roughly that of a 3-billion-parameter dense model while retaining the capacity of a 12B architecture. Weights are available now; a technical report on arXiv details the training approach and evaluation methodology.
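To make the compute claim concrete, here is a back-of-the-envelope sketch in Python. It assumes the common rule of thumb of about 2 forward-pass FLOPs per active parameter per token, which ignores attention and routing overhead; only the two parameter counts come from the release, and the function name is illustrative.

```python
# Rough per-token compute comparison. The 2-FLOPs-per-parameter rule is an
# approximation (one multiply and one add per weight), not a measured figure.
TOTAL_PARAMS = 12e9    # total parameters, from the release
ACTIVE_PARAMS = 2.5e9  # parameters active per token, from the release

def forward_flops_per_token(active_params: float) -> float:
    """Approximate forward-pass FLOPs needed to process one token."""
    return 2.0 * active_params

active_fraction = ACTIVE_PARAMS / TOTAL_PARAMS
sparse_flops = forward_flops_per_token(ACTIVE_PARAMS)
dense_flops = forward_flops_per_token(TOTAL_PARAMS)  # hypothetical dense 12B model

print(f"active fraction: {active_fraction:.1%}")
print(f"sparse model:    {sparse_flops:.1e} FLOPs/token")
print(f"dense 12B model: {dense_flops:.1e} FLOPs/token")
print(f"compute ratio:   {dense_flops / sparse_flops:.1f}x")
```

Under these assumptions about a fifth of the weights do work on any given token, which is where the comparison to a small dense model comes from.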
The model represents JetBrains' first major open-weight language model release. The sparse architecture, likely a mixture-of-experts variant, routes each token through a subset of the full parameter set, reducing per-token compute and latency for local development environments and CI/CD pipelines without sacrificing model depth. If it is a mixture-of-experts design, the full set of weights still has to be loaded, so the savings are in compute rather than in weight storage.
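The routing idea can be sketched in a few lines of Python. This is a generic top-k mixture-of-experts illustration, not Mellum's actual router: the number of experts, the value of k, the router scores, and the toy "experts" are all hypothetical.

```python
import math

def softmax(xs):
    """Numerically stable softmax over a list of floats."""
    m = max(xs)
    exps = [math.exp(x - m) for x in xs]
    total = sum(exps)
    return [e / total for e in exps]

def route_top_k(router_logits, k):
    """Return (expert_index, weight) pairs for the k highest-scoring experts."""
    ranked = sorted(range(len(router_logits)),
                    key=lambda i: router_logits[i], reverse=True)
    top = ranked[:k]
    weights = softmax([router_logits[i] for i in top])
    return list(zip(top, weights))

def moe_forward(x, experts, router_logits, k=2):
    """Run only the selected experts and combine their outputs by weight."""
    out = 0.0
    for idx, weight in route_top_k(router_logits, k):
        out += weight * experts[idx](x)
    return out

# Hypothetical toy experts: scalar functions standing in for feed-forward blocks.
experts = [lambda x, s=s: s * x for s in (1.0, 2.0, 3.0, 4.0)]
router_logits = [0.1, 2.0, -1.0, 2.0]  # made-up router scores for one token

selected = route_top_k(router_logits, k=2)
y = moe_forward(1.0, experts, router_logits, k=2)
print(selected, y)
```

Only two of the four toy experts run for this token; the other two contribute no compute, which is the property the sparse design relies on.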
What stands out
- Sparse activation design: Only 2.5B of the 12B parameters activate per token, bringing inference cost close to that of a 3B dense model while preserving 12B-scale capacity for complex coding tasks.
- Open weights: Full model weights released under open-source terms, enabling local deployment, fine-tuning, and integration into developer tools without API dependency.
- Coding focus: Positioned as a specialist for code generation and understanding, aligned with JetBrains' IDE ecosystem (IntelliJ, PyCharm, WebStorm) and practical CI/CD workflows.
- Transparent methodology: The arXiv preprint covers training data composition, tokenization, and evaluation, the standard disclosure for practitioner-focused releases.
- Inference efficiency: Viable for resource-constrained environments where latency and compute budgets are bottlenecks, lowering the bar for on-device code assistance.
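On-device viability also depends on how much memory the weights occupy. The Python sketch below estimates weight storage for a 12B-parameter model at several precisions, assuming all weights stay resident, as is typical for mixture-of-experts models. The precision list is an assumption (the release does not specify quantized formats), and the figures exclude KV cache and activations.

```python
TOTAL_PARAMS = 12e9  # total parameter count, from the release

# Bytes per weight for common storage formats (assumed, not from the release).
BYTES_PER_PARAM = {"fp32": 4.0, "bf16": 2.0, "int8": 1.0, "int4": 0.5}

def weight_memory_gb(params: float, bytes_per_param: float) -> float:
    """Weight storage in decimal gigabytes, excluding KV cache and activations."""
    return params * bytes_per_param / 1e9

estimates = {name: weight_memory_gb(TOTAL_PARAMS, size)
             for name, size in BYTES_PER_PARAM.items()}

for name, gb in estimates.items():
    print(f"{name}: {gb:.0f} GB")
```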


