TopoPrimer framework cuts time-series forecasting cold-start error by 27%
A new framework precomputes topological structure from persistent homology and sheaf coordinates, then feeds it token-by-token into any forecasting model—closing the cold-start gap and holding accuracy under seasonal spikes.
Researchers have introduced TopoPrimer, a framework that encodes the global topological structure of a time-series population as an explicit input to forecasting models. They report that it cuts cold-start error by 27 percent and holds accuracy within 10 percent of baseline during seasonal demand spikes that roughly halve the accuracy of classical approaches.
Described in an arXiv preprint posted May 15, TopoPrimer runs persistent homology and spectral sheaf coordinate extraction once per domain, then deploys the resulting topology vectors per token at inference. The authors tested it on Chronos and TimesFM backbones across four public benchmarks—Electricity (ECL), Traffic, Weather, and ETTh1—and report consistent gains: up to 7.3 percent MSE improvement on ECL, with the topology advantage holding at near-identical magnitude whether the backbone is zero-shot or fine-tuned. That suggests topology and per-series training capture complementary signals rather than overlapping ones.
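The two-stage pattern described above can be sketched in a few lines. The extractor below is a deliberate placeholder (summary statistics), not the paper's persistent-homology or sheaf-coordinate pipeline, and every function name and shape is an assumption for illustration only; the point is the shape of the workflow: one expensive domain-level precomputation, then a cheap fixed-size vector reused at every token.

```python
import numpy as np

def precompute_topology_vector(population: np.ndarray, dim: int = 8) -> np.ndarray:
    """One-time, domain-level step. `population` has shape (n_series, seq_len).
    Placeholder features standing in for persistent homology + spectral
    sheaf coordinate extraction (assumed names, not the authors' code)."""
    feats = np.array([
        population.mean(),
        population.std(),
        np.abs(np.fft.rfft(population, axis=1)).mean(),  # crude spectral summary
        np.corrcoef(population).mean(),                  # crude cross-series structure
    ])
    out = np.zeros(dim)
    out[: len(feats)] = feats
    return out  # small fixed-size vector, shared across the whole population

def inject_per_token(tokens: np.ndarray, topo: np.ndarray) -> np.ndarray:
    """Per-token injection at inference: append the same topology vector to
    every token embedding, (seq_len, d_model) -> (seq_len, d_model + dim)."""
    tiled = np.tile(topo, (tokens.shape[0], 1))
    return np.concatenate([tokens, tiled], axis=-1)

rng = np.random.default_rng(0)
population = rng.normal(size=(32, 64))         # 32 series, 64 time steps
topo = precompute_topology_vector(population)  # paid once per domain
tokens = rng.normal(size=(64, 16))             # one embedded series, d_model=16
augmented = inject_per_token(tokens, topo)     # reused at every inference call
print(augmented.shape)  # (64, 24)
```

Because the topology vector is computed once and merely broadcast per token, the per-inference overhead is a tile-and-concatenate, which matches the paper's claim that deployment cost stays small.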
Of the two components—persistent homology features and sheaf coordinates—the sheaf coordinates drive most of the accuracy lift. The framework can attach to fully trained models or drop in as a lightweight adapter on pre-trained backbones. The gains are sharpest in difficult regimes: under peak seasonal demand, classical and zero-shot models degrade by up to 50 percent, while TopoPrimer stays within 10 percent of baseline accuracy. At cold start, with no item history, TopoPrimer reduces mean absolute error by 27 percent over a topology-free baseline.
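One plausible reading of the "lightweight adapter" attachment is a small trained projection of the topology vector added to a frozen backbone's hidden states. The sketch below is a minimal illustration under that assumption; the class name, shapes, and mechanism are hypothetical, not taken from the paper.

```python
import numpy as np

class TopologyAdapter:
    """Hypothetical adapter: the pre-trained backbone stays frozen, and the
    only trainable parameters are a tiny projection from the precomputed
    topology vector into the backbone's hidden dimension."""

    def __init__(self, topo_dim: int, d_model: int, seed: int = 0):
        rng = np.random.default_rng(seed)
        self.W = rng.normal(scale=0.02, size=(topo_dim, d_model))  # topo_dim * d_model weights
        self.b = np.zeros(d_model)

    def __call__(self, hidden: np.ndarray, topo: np.ndarray) -> np.ndarray:
        """Add the projected topology signal to every token's hidden state;
        the (seq_len, d_model) shape is preserved, so the frozen backbone's
        downstream layers need no modification."""
        return hidden + (topo @ self.W + self.b)

adapter = TopologyAdapter(topo_dim=8, d_model=16)
hidden = np.zeros((64, 16))               # stand-in for frozen backbone states
topo = np.arange(8, dtype=float) / 8      # precomputed domain topology vector
out = adapter(hidden, topo)
print(out.shape)  # (64, 16)
```

An additive adapter like this keeps the trainable parameter count at `topo_dim * d_model + d_model`, which is one way the reported gains could be delivered without fine-tuning the backbone itself.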
The paper frames topology as a missing context layer—precomputed once, reused across the series population, and orthogonal to the per-series patterns that training already learns. The authors note that the topology vectors are small enough to deploy per token without materially increasing inference cost, and that the precomputation step is a one-time domain-level expense.
