GPT-5.6 Sol to run at 750 tokens per second on Cerebras
OpenAI announced GPT-5.6 in three tiers (Sol, Terra, and Luna), with Sol set to deploy on Cerebras at 750 tokens per second next month. Pricing ranges from $1 input / $6 output per million tokens for Luna to $5 / $30 for Sol; availability is currently limited to US partners pending government agreements.

OpenAI announced GPT-5.6 in three configurations: Sol, Terra, and Luna. Sol is the new flagship, Terra sits roughly at GPT-5.5 performance, and Luna lands slightly below GPT-5.4. Pricing runs $5 input / $30 output per million tokens for Sol, $2.50 / $15 for Terra, and $1 / $6 for Luna. Token caching now carries an upfront cost: prompts written to the cache cost 25 percent more to send, though reading cached tokens back still carries a 90 percent discount.
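
To make the caching trade-off concrete, here is a rough cost sketch in Python. The per-million-token prices are the ones quoted above; everything else is an assumption for illustration, not OpenAI's published billing logic. In particular, the sketch assumes the 25 percent surcharge and the 90 percent discount both apply to the tier's input rate, and the function names `request_cost` and `prefix_cost` are hypothetical.

```python
# Prices in USD per million tokens, as quoted in the article: (input, output).
PRICES = {
    "sol": (5.00, 30.00),
    "terra": (2.50, 15.00),
    "luna": (1.00, 6.00),
}

# Assumption: both multipliers are applied to the tier's input rate.
CACHE_WRITE_MULTIPLIER = 1.25  # 25 percent surcharge when a prompt is cached
CACHE_READ_MULTIPLIER = 0.10   # 90 percent discount when cached tokens are reused


def request_cost(tier, fresh_in=0, cache_write=0, cache_read=0, out=0):
    """Estimated USD cost of one request; all arguments are token counts."""
    in_rate, out_rate = PRICES[tier]
    billable_input = (
        fresh_in
        + cache_write * CACHE_WRITE_MULTIPLIER
        + cache_read * CACHE_READ_MULTIPLIER
    )
    return (in_rate * billable_input + out_rate * out) / 1_000_000


def prefix_cost(tier, prefix_tokens, calls, cached):
    """Input-side cost of sending the same prefix `calls` times (calls >= 1)."""
    if not cached:
        return calls * request_cost(tier, fresh_in=prefix_tokens)
    # First call pays the cache-write surcharge; later calls pay the read rate.
    first = request_cost(tier, cache_write=prefix_tokens)
    rest = (calls - 1) * request_cost(tier, cache_read=prefix_tokens)
    return first + rest


if __name__ == "__main__":
    for cached in (False, True):
        cost = prefix_cost("sol", 100_000, calls=2, cached=cached)
        print(f"cached={cached}: ${cost:.3f}")
```

Under these assumptions, a 100,000-token prefix sent twice on Sol costs about $1.00 uncached and about $0.675 cached, so the surcharge pays for itself on the first reuse. A prompt that is cached but never reused simply costs 25 percent more.
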
The standout detail is speed: GPT-5.6 Sol will deploy on Cerebras infrastructure next month at 750 tokens per second. OpenAI has not yet disclosed Cerebras pricing or confirmed whether Terra and Luna will also run on the same hardware. Benchmark scores for Sol trail the recently restricted Fable model, though OpenAI has not published full eval tables.
For now, the model is available only to a limited set of US partners while OpenAI works through government agreements. General availability is promised within a few weeks. Watch for Terra and Luna Cerebras deployment details, full benchmark breakdowns, and clarity on what "government agreements" actually cover: export controls, compute caps, or something else entirely.



