Query-level LLM profiles outperform flat summaries in routing, RouteProfile finds
A new preprint from Stanford and collaborators shows that structured, query-level LLM profiles outperform flat domain summaries across three router architectures, with the biggest gains when generalizing to newly added models.

Structured profiles that capture model behavior at the query level beat coarse domain summaries when routing requests across multiple large language models, according to a preprint released May 15, 2026.
RouteProfile, authored by Jingjun Xu, Hongji Pu, Tao Feng, Haozhen Zhang, Jiaxuan You, and Ge Liu, treats LLM profiling as a structured information-integration problem over heterogeneous interaction histories. The framework maps four key design dimensions: organizational form (flat vs. hierarchical), representation type (symbolic vs. embedding-based), aggregation depth (domain-level vs. query-level), and learning configuration (fixed vs. trainable). The team evaluated these choices across three representative routers under both standard and new-LLM generalization settings.
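The four dimensions define a combinatorial design space for profiles. As a rough illustration (not the paper's actual API; all names here are hypothetical), the space can be sketched by enumerating each axis:

```python
# Illustrative sketch of RouteProfile's four-axis profile design space.
# Class and field names are invented for illustration; the paper does not
# specify an implementation.
from dataclasses import dataclass
from enum import Enum
from itertools import product


class OrganizationalForm(Enum):
    FLAT = "flat"
    HIERARCHICAL = "hierarchical"


class RepresentationType(Enum):
    SYMBOLIC = "symbolic"
    EMBEDDING = "embedding-based"


class AggregationDepth(Enum):
    DOMAIN_LEVEL = "domain-level"
    QUERY_LEVEL = "query-level"


class LearningConfiguration(Enum):
    FIXED = "fixed"
    TRAINABLE = "trainable"


@dataclass(frozen=True)
class ProfileDesign:
    """One point in the profile design space."""
    organization: OrganizationalForm
    representation: RepresentationType
    aggregation: AggregationDepth
    learning: LearningConfiguration


# Enumerate every combination of the four binary axes: 2^4 = 16 designs.
design_space = [
    ProfileDesign(o, r, a, l)
    for o, r, a, l in product(
        OrganizationalForm, RepresentationType,
        AggregationDepth, LearningConfiguration,
    )
]
print(len(design_space))  # 16
```

Each router is then paired with points from this space; the paper's headline finding corresponds to designs combining hierarchical organization, query-level aggregation, and trainable configurations.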
Structured profiles consistently outperformed flat ones across all tested routers. Query-level signals proved more reliable than coarse domain-level signals for routing decisions. When generalizing to newly introduced models—a critical real-world scenario—structured profiles under trainable configurations showed the largest performance gains. The authors argue that profile design has been underexplored relative to router mechanism design, and that clarifying the role of profiles enables fairer comparison and more principled development of routing systems.
The preprint is available on Hugging Face Papers.