Pion optimizer preserves singular values during LLM training via orthogonal transformations
Researchers introduce Pion, an optimizer that updates weight matrices through orthogonal transformations instead of additive steps, keeping singular values fixed while reorienting the matrices' singular vectors.

Pion is a spectrum-preserving optimizer for large language model training, introduced in a preprint released May 13. Unlike Adam, AdamW, or the recently proposed Muon optimizer — all of which update weights by adding scaled gradients — Pion applies left and right orthogonal transformations to each weight matrix, leaving its singular values unchanged throughout training.
The paper argues that this approach offers a fundamentally different optimization mechanism: instead of shifting weights in parameter space, Pion rotates and reflects them while holding their singular-value spectrum constant. The authors derive the update rule from orthogonal equivalence transformations, a concept borrowed from numerical linear algebra, and show that the method converges under standard assumptions.
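To make the distinction concrete, the minimal NumPy sketch below (illustrative only; the preprint's code has not been released, and the matrices and step size here are arbitrary) shows that multiplying a weight matrix on the left and right by orthogonal matrices leaves its singular values unchanged, while an ordinary additive step shifts them.

    import numpy as np

    rng = np.random.default_rng(0)

    def random_orthogonal(n, rng):
        # QR decomposition of a random Gaussian matrix gives an orthogonal factor
        q, _ = np.linalg.qr(rng.standard_normal((n, n)))
        return q

    m, n = 6, 4
    W = rng.standard_normal((m, n))        # a weight matrix
    R = random_orthogonal(m, rng)          # left orthogonal transform
    Q = random_orthogonal(n, rng)          # right orthogonal transform

    W_orth = R @ W @ Q.T                             # orthogonal-equivalence update
    W_add = W - 0.1 * rng.standard_normal((m, n))    # additive (Adam-style) step

    sv = lambda A: np.linalg.svd(A, compute_uv=False)
    print(np.allclose(sv(W), sv(W_orth)))  # True: singular values preserved
    print(np.allclose(sv(W), sv(W_add)))   # False: additive step changes the spectrum

The sketch only illustrates the spectrum-preservation property; how Pion chooses the orthogonal factors at each step is specified by the update rule derived in the preprint.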
Benchmarks and open questions
Experiments on both LLM pretraining and finetuning tasks show Pion matching or exceeding the performance of Adam and Muon across several benchmarks. The authors report stable training dynamics and competitive final loss. However, the preprint does not yet include hyperparameter-sensitivity analyses or wall-clock speed comparisons, leaving open questions about practical deployment at scale.
Authored by Kexuan Shi, Hanxuan Li, Zeju Qiu, Yandong Wen, Simon Buchholz, and Weiyang Liu, the full preprint is available on arXiv. No code or reference implementation has been released yet.