Probabilistic TRM solves Sudoku 10,000× cheaper than LLMs
A 5–7 million parameter recursive model outperforms frontier LLMs on constraint tasks by injecting Gaussian noise at inference time, without retraining.

Probabilistic Tiny Recursive Model (PTRM), introduced by Amin Sghaier, Ali Parviz, and Alexia Jolicoeur-Martineau, adds stochastic search to pretrained recursive networks at test time. The method injects Gaussian noise into the hidden state at each recursion step, spawning parallel trajectories that help the model escape local minima. A pretrained Q-head classifier then selects the best trajectory. The result: non-autoregressive models with 5–7 million parameters outperform frontier LLMs on structured constraint-satisfaction tasks (Sudoku, graph coloring, path planning) at inference costs more than 10,000× lower.
The authors applied the stochastic wrapper post hoc, without retraining the underlying model. For practitioners running constraint solvers or planning engines, this suggests an alternative to heavyweight LLM deployments and chain-of-thought prompting: a compact recurrent architecture with controlled noise injection and a simple verifier head. The approach joins recent latent-reasoning methods such as GRAM and FRM, all converging on the idea that reasoning need not happen in token space. The paper, posted to arXiv on June 5, 2026, argues that scaling compute in continuous latent space during inference is a viable alternative to scaling discrete-token autoregressive chains. It adds that inference overhead remains negligible on a single CPU core.
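
To make the mechanism concrete, here is a minimal sketch of noisy parallel search followed by verifier-based selection. It is not the authors' implementation: the tanh update, the linear "Q-head", the random weights, and every name and default (`recursive_step`, `q_head`, `noisy_search`, `n_traj`, `n_steps`, `sigma`) are assumptions chosen for illustration, and adding the noise just before each update is one plausible reading of "at each recursion step".

```python
import numpy as np


def recursive_step(h, W, b):
    """One recursion step of a toy latent update (stand-in for a pretrained network)."""
    return np.tanh(h @ W + b)


def q_head(h, w_q):
    """Toy verifier: map each final hidden state to a score via a sigmoid."""
    return 1.0 / (1.0 + np.exp(-(h @ w_q)))


def noisy_search(h0, W, b, w_q, n_traj=8, n_steps=6, sigma=0.1, seed=0):
    """Run n_traj noisy trajectories from h0 and return the best-scored one."""
    rng = np.random.default_rng(seed)
    # One row per parallel trajectory, all starting from the same state.
    h = np.tile(h0, (n_traj, 1))
    for _ in range(n_steps):
        # Assumption: Gaussian noise is added to the hidden state before each update.
        noise = sigma * rng.standard_normal(h.shape)
        h = recursive_step(h + noise, W, b)
    scores = q_head(h, w_q)          # one score per trajectory
    best = int(np.argmax(scores))    # index of the highest-scoring trajectory
    return h[best], scores, best


# Hypothetical random weights; a real setup would load pretrained parameters.
rng = np.random.default_rng(42)
d = 16
W = rng.standard_normal((d, d)) / np.sqrt(d)
b = np.zeros(d)
w_q = rng.standard_normal(d)
h0 = rng.standard_normal(d)

best_h, scores, best = noisy_search(h0, W, b, w_q)
print("selected trajectory:", best)
print("scores:", scores.round(3))
```

With `sigma=0.0` every trajectory is identical and the search collapses to the deterministic model. In this sketch, the added test-time compute grows with `n_traj` and `sigma` controls how far the trajectories spread apart.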



