LLM trading agents show embedding drift before portfolio collapse
Researchers mapped embedding trajectories in LLM trading agents and found measurable pre-failure signatures (planning embeddings drift and effective rank contracts) across 80 rolling failure anchors and eight LLM trajectories.
A new preprint reports that large language model (LLM) trading agents exhibit measurable embedding drift before portfolio drawdowns. Using TradeArena, an auditable trading testbed with risk reports and execution simulation, the researchers analyzed 80 rolling failure anchors across eight LLM trajectories. Before failures, planning embeddings drift away from normal-state centroids and their effective rank contracts, a pattern that persists across hash, LSA, Transformer, and white-box hidden-state probes.
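As a rough illustration of the two signatures described above, the following Python sketch computes a centroid drift and an effective rank for a window of embeddings. The preprint's exact definitions are not given in this summary, so both are assumptions: drift is taken as the cosine distance between the window mean and the normal-state centroid, and effective rank as the exponential of the Shannon entropy of the normalized singular values. The function names and the synthetic data are hypothetical.

```python
import numpy as np


def centroid_drift(baseline: np.ndarray, window: np.ndarray) -> float:
    """Cosine distance between the window mean and the baseline centroid.

    Assumption: rows are embeddings, columns are embedding dimensions.
    """
    centroid = baseline.mean(axis=0)
    current = window.mean(axis=0)
    cos = np.dot(centroid, current) / (
        np.linalg.norm(centroid) * np.linalg.norm(current)
    )
    return float(1.0 - cos)


def effective_rank(embeddings: np.ndarray, eps: float = 1e-12) -> float:
    """exp(Shannon entropy) of the normalized singular values.

    Assumption: this entropy-based definition stands in for whatever
    effective-rank measure the preprint actually uses.
    """
    centered = embeddings - embeddings.mean(axis=0, keepdims=True)
    s = np.linalg.svd(centered, compute_uv=False)
    p = s / (s.sum() + eps)
    p = p[p > eps]  # drop numerically zero directions
    return float(np.exp(-(p * np.log(p)).sum()))


# Synthetic demo: a "normal" regime versus a shifted, low-rank regime.
rng = np.random.default_rng(0)
normal_state = rng.normal(size=(200, 16)) + 1.0
shift = np.zeros(16)
shift[0] = 5.0
pre_failure = rng.normal(size=(200, 2)) @ rng.normal(size=(2, 16)) + shift

print("drift:", centroid_drift(normal_state, pre_failure))
print("effective rank, normal:", effective_rank(normal_state))
print("effective rank, pre-failure:", effective_rank(pre_failure))
```

In this toy setup the pre-failure window spans only two directions, so its effective rank cannot exceed 2, while the normal window uses most of its 16 dimensions. Real agent embeddings would be noisier, and any alert threshold would have to be calibrated per model.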
Stress tests with chain-of-thought-free weights, lexical controls, OHLCV noise, and false audit reports showed that rationale-level contraction can vanish without explicit reasoning, while intent-space contraction may remain. Structured risk feedback acted as an external alignment signal without fine-tuning, but its effect was inconsistent: true audit feedback improved calibration for some models and return-drawdown metrics for others, while hidden or placebo feedback sometimes produced higher short-horizon returns with weaker alignment diagnostics. A 51-stock intraday experiment exposed a blind spot: LLM rationales often justify concentrated exposure to coupled assets, which the risk layer then repeatedly clips. A rolling Markowitz baseline served as the covariance reference.
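To make the idea of a rolling Markowitz covariance reference concrete, here is a minimal Python sketch that recomputes minimum-variance weights over a sliding window of returns. It is not the authors' baseline: the minimum-variance special case, the window length, the relative ridge term, and the function names are all assumptions chosen for illustration, and the weights are unconstrained (shorting is allowed).

```python
import numpy as np


def min_variance_weights(returns: np.ndarray, ridge: float = 1e-3) -> np.ndarray:
    """Minimum-variance weights w = S^-1 1 / (1' S^-1 1) from a returns window.

    Assumptions: rows are time steps, columns are assets; a small ridge,
    scaled by the average variance, keeps the covariance well conditioned.
    """
    cov = np.cov(returns, rowvar=False)
    n = cov.shape[0]
    cov = cov + ridge * np.trace(cov) / n * np.eye(n)
    raw = np.linalg.solve(cov, np.ones(n))
    return raw / raw.sum()


def rolling_min_variance_weights(returns: np.ndarray, window: int) -> np.ndarray:
    """One weight vector per window end, using only the trailing window."""
    steps = range(window, returns.shape[0] + 1)
    return np.array([min_variance_weights(returns[t - window:t]) for t in steps])


# Synthetic demo: three independent assets with different volatilities.
rng = np.random.default_rng(1)
demo_returns = rng.normal(size=(300, 3)) * np.array([0.01, 0.02, 0.05])
weights = rolling_min_variance_weights(demo_returns, window=60)
print(weights[-1])
```

Because the covariance matrix penalizes holding strongly correlated assets together, a reference of this kind gives a simple yardstick for how concentrated an agent's requested exposure is relative to a diversified allocation.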
The authors frame their work as a research claim rather than a profitability claim, arguing that auditable risk feedback and representation trajectories reveal when LLM financial reasoning is aligning, drifting, or failing.



