PaSaMaster outperforms GPT-5.2 on academic search at 1% compute cost
A self-evolving agentic retrieval system from SJTU researchers outperforms frontier LLMs on academic search with zero hallucination at 1% of the compute cost.

PaSaMaster, a self-evolving agentic literature retrieval system from Shanghai Jiao Tong University researchers, produces relevance-scored paper rankings with evidence-grounded recommendations. The system addresses a core tension in AI-powered academic search: frontier LLMs can parse complex research intents but hallucinate citations and burn compute, while traditional keyword search stays grounded but misses nuance. PaSaMaster splits the difference by treating retrieval as an iterative process rather than one-shot query matching.
The architecture separates planning from retrieval. A frontier LLM handles intent understanding—parsing what a researcher actually wants from a vague or multi-faceted query—while large-scale retrieval and relevance scoring run on lightweight models against customized corpora. Ranked evidence from each search cycle reveals gaps in coverage, refines the intent, and guides follow-up searches. Because sources come from ranked evidence rather than generation, PaSaMaster maintains zero source hallucination across all test cases.
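The loop described above can be sketched in a few lines. Everything here is an illustrative stand-in under stated assumptions, not the project's actual implementation: the toy corpus, the `plan_intent` and `retrieve_and_score` names, and the keyword-overlap scorer all substitute for the real frontier-LLM planner, lightweight retriever, and relevance model.

```python
# Minimal sketch of PaSaMaster-style iterative agentic retrieval.
# All names and the keyword-overlap scorer are hypothetical stand-ins
# for the system's planner LLM, retriever, and relevance model.
from dataclasses import dataclass


@dataclass(frozen=True)
class Paper:
    title: str
    keywords: frozenset


# Toy stand-in for the customized retrieval corpora.
CORPUS = [
    Paper("Dense retrieval for scholarly search",
          frozenset({"retrieval", "dense", "scholarly"})),
    Paper("Agentic pipelines for literature review",
          frozenset({"agent", "literature", "pipeline", "retrieval"})),
    Paper("Hallucination in citation generation",
          frozenset({"hallucination", "citation", "llm"})),
]


def plan_intent(query: str) -> set:
    """Stand-in for the frontier-LLM planner: split the query into intent facets."""
    return set(query.lower().split())


def retrieve_and_score(facets: set) -> list:
    """Stand-in for lightweight retrieval + scoring: rank by keyword overlap."""
    scored = [(len(facets & p.keywords), p) for p in CORPUS]
    return sorted((s for s in scored if s[0] > 0), key=lambda s: -s[0])


def search(query: str, max_rounds: int = 3) -> list:
    """Iterate: retrieve, rank, let the evidence refine the intent, repeat."""
    facets = plan_intent(query)
    results: dict = {}
    for _ in range(max_rounds):
        ranked = retrieve_and_score(facets)
        for score, paper in ranked:
            results[paper.title] = max(results.get(paper.title, 0), score)
        # Evidence-guided refinement: terms from ranked papers expand the
        # intent (the real system would have the LLM rewrite the query here).
        evidence = set().union(*(p.keywords for _, p in ranked)) if ranked else set()
        if evidence <= facets:
            break  # no new facets surfaced; coverage is stable
        facets |= evidence
    # Every recommendation is a ranked corpus entry, never generated text,
    # which is why source hallucination is structurally impossible.
    return sorted(results.items(), key=lambda kv: -kv[1])


print(search("agent retrieval hallucination"))
```

The zero-hallucination guarantee falls out of the structure: the final ranking can only contain corpus entries, so there is no generation step in which a citation could be fabricated.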
Evaluated on the PaSaMaster Benchmark, which spans 38 scientific disciplines, the system achieved a 15.6× F1-score improvement over traditional keyword retrieval. Generative LLMs asked to recommend papers directly hallucinated sources at rates as high as 37.79%. PaSaMaster outperformed GPT-5.2 by 30.0% on retrieval quality while using only 1% of the computational cost. Code and the preprint are available at github.com/sjtu-sai-agents/PaSaMaster.