Meituan's LongCat 2.0: 1.6T parameters trained on 50,000 Chinese chips
Meituan released LongCat 2.0, a 1.6-trillion-parameter model trained on 50,000 domestic Chinese chips, marking the first LLM pretraining run at this scale demonstrated on hardware other than Nvidia GPUs and Google TPUs.

Meituan released LongCat 2.0, a 1.6-trillion-parameter mixture-of-experts model trained entirely on 50,000 unnamed Chinese chips believed to be Huawei Ascend 910C accelerators. The company trained the model on 35 trillion tokens, including several hundred billion tokens at context lengths near one million tokens. Until now, pretraining at this scale had only been demonstrated on Nvidia GPUs and Google TPUs.
LongCat 2.0 activates 48 billion parameters per forward pass. Meituan priced API access at $0.75 per million input tokens and $3 per million output tokens. The model ran under the codename "Owl Alpha" on OpenRouter for the past two months, where its performance was middling. Weights are expected to be released soon under an Apache 2.0 or MIT license, in line with the company's typical practice.
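The figures above support some back-of-envelope arithmetic. The sketch below computes the share of parameters active per forward pass, a rough pretraining compute estimate, and the API cost of one request. The compute estimate uses the common rule of thumb of 6 × active parameters × tokens, which is an assumption here and not a number Meituan published; the function name and the example token counts are likewise illustrative.

```python
# Back-of-envelope figures from the announcement. The 6*N*D compute rule
# is a common approximation, not a number published by Meituan.

TOTAL_PARAMS = 1.6e12        # 1.6 trillion total parameters
ACTIVE_PARAMS = 48e9         # 48 billion active per forward pass
PRETRAIN_TOKENS = 35e12      # 35 trillion pretraining tokens
INPUT_PRICE_PER_M = 0.75     # USD per million input tokens
OUTPUT_PRICE_PER_M = 3.00    # USD per million output tokens


def api_cost_usd(input_tokens: int, output_tokens: int) -> float:
    """Cost of one request at the listed per-million-token prices."""
    return (input_tokens * INPUT_PRICE_PER_M
            + output_tokens * OUTPUT_PRICE_PER_M) / 1_000_000


# Share of the model's parameters used for any single token.
active_fraction = ACTIVE_PARAMS / TOTAL_PARAMS

# Rough training compute: 6 FLOPs per active parameter per token (assumption).
approx_train_flops = 6 * ACTIVE_PARAMS * PRETRAIN_TOKENS

print(f"active fraction: {active_fraction:.1%}")
print(f"approx. training compute: {approx_train_flops:.2e} FLOPs")
# Hypothetical request: 200k input tokens, 4k output tokens.
print(f"example request cost: ${api_cost_usd(200_000, 4_000):.3f}")
```

Under these assumptions, roughly 3 percent of the parameters are active for any given token, and the training run lands on the order of 10^25 FLOPs. The estimate ignores attention cost at long context lengths, so it is best read as a lower bound.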
Architecture highlights
- N-gram embeddings consume 10 percent of total parameters. LongCat places its sparsely activated parameters not only in MoE layers but also in massive n-gram embedding tables. In the smaller LongCat Flash-Lite variant, n-gram embeddings account for nearly half of all parameters.
- Six-dimensional parallelism across embeddings. Meituan parallelizes the n-gram embedding layer itself, adding a sixth dimension to the training parallelism strategy on top of the standard data, pipeline, tensor, expert, and sequence splits.
- Custom sparse attention derived from DSA. The team built a proprietary sparse attention mechanism by heavily modifying Dynamic Sparse Attention, though details of the changes remain unpublished.
- Million-token context in pretraining data. Several hundred billion of the 35 trillion pretraining tokens came from documents with context lengths around one million tokens, making LongCat one of the few models pretrained on ultra-long sequences rather than fine-tuned for them afterward.
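
To make the n-gram embedding idea concrete, here is a minimal sketch of a hashed n-gram embedding lookup in NumPy. It is a generic illustration and not Meituan's implementation: the table sizes, the embedding width, the hash function, and the choice to add a bigram vector to the unigram vector are all assumptions. It shows why such tables can hold a large share of a model's parameters while adding little compute, since each token reads only a couple of rows.

```python
import numpy as np

# All sizes and the hashing scheme are illustrative assumptions.
VOCAB_SIZE = 1_000       # hypothetical token vocabulary
NGRAM_BUCKETS = 50_000   # hypothetical rows in the hashed bigram table
DIM = 16                 # hypothetical embedding width

rng = np.random.default_rng(0)
token_table = rng.standard_normal((VOCAB_SIZE, DIM)).astype(np.float32)
bigram_table = rng.standard_normal((NGRAM_BUCKETS, DIM)).astype(np.float32)


def bigram_bucket(prev_id: int, cur_id: int) -> int:
    """Map a (previous, current) token pair to a row of the bigram table."""
    # Simple multiplicative hash; a real system would choose this with care.
    return (prev_id * 1_000_003 + cur_id) % NGRAM_BUCKETS


def embed(token_ids: list[int]) -> np.ndarray:
    """Return one vector per token: unigram row plus hashed bigram row."""
    out = token_table[token_ids].copy()          # shape (len, DIM)
    for pos in range(1, len(token_ids)):
        row = bigram_bucket(token_ids[pos - 1], token_ids[pos])
        out[pos] += bigram_table[row]            # one extra row read per token
    return out


vectors = embed([5, 42, 42, 7])
print(vectors.shape)  # (4, 16)

# Parameter count grows with the number of table rows,
# while the values read per token stay fixed at two rows.
table_params = bigram_table.size
values_read_per_token = 2 * DIM
print(table_params, values_read_per_token)
```

Because the lookup touches a fixed number of rows per token, the table can be made much larger without increasing per-token compute; the cost shifts to memory and to how the table is sharded across devices, which is presumably why the embedding layer gets its own parallelism dimension.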



