Autoresearch agent discovers fairness mechanisms emerge only under maximin objectives
A new preprint shows how an autonomous researcher agent can redesign LLM policy-synthesis systems to solve multi-agent social dilemmas, discovering that fairness mechanisms emerge only when optimizing for equity rather than efficiency.
A preprint by Víctor Gallego introduces a two-level autoresearch system in which an outer-loop AI agent autonomously redesigns the inner-loop pipeline of an LLM policy-synthesis system for multi-agent Sequential Social Dilemmas (SSDs). The researcher agent, implemented as a coding agent, reads source code, edits system prompts, feedback functions, helper libraries, and iteration logic, then runs evaluations and decides what to keep. Across two games (Cleanup and Gathering), two policy-synthesizer LLMs, and two welfare objectives (utilitarian efficiency and Rawlsian maximin), the researcher reliably exceeded hand-designed baselines, reduced run-to-run variance, and outperformed prompt-only optimization.
The system's most striking finding is objective-dependent pipeline discovery. Only when optimizing for maximin, a fairness-focused welfare objective that scores an outcome by its worst-off agent, did the researcher inject an explicit fairness mechanism into synthesizer pipelines. That class of mechanism is absent from the researcher's own objective-agnostic system prompt and from every efficiency-optimized pipeline. The paper frames this as an information-design problem: the researcher chooses what to reveal to the boundedly rational synthesizer as a function of the welfare objective.
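To make the two welfare objectives concrete, here is a minimal sketch using their textbook definitions: utilitarian welfare as the sum of per-agent returns, and Rawlsian maximin as the return of the worst-off agent. The function names and the example returns are hypothetical, and the preprint's exact formulation (for instance, any normalization or averaging) may differ.

```python
from typing import Sequence


def utilitarian_welfare(returns: Sequence[float]) -> float:
    """Efficiency: total return summed over all agents."""
    return float(sum(returns))


def maximin_welfare(returns: Sequence[float]) -> float:
    """Rawlsian maximin: return of the worst-off agent."""
    return float(min(returns))


# Hypothetical per-agent returns under two candidate joint policies.
unequal = [10.0, 10.0, 0.0]  # higher total, one agent gets nothing
equal = [6.0, 6.0, 6.0]      # lower total, evenly shared

# The two objectives rank the same pair of outcomes in opposite order.
prefers_unequal = utilitarian_welfare(unequal) > utilitarian_welfare(equal)
prefers_equal = maximin_welfare(equal) > maximin_welfare(unequal)
```

Because the objectives can disagree about which outcome is better, a pipeline tuned for one has no built-in reason to include mechanisms that serve the other.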
Pipeline discovery and variance reduction
The outer-loop agent iteratively modifies everything that shapes how the policy synthesizer generates cooperative behavior: system prompts, feedback functions, helper libraries, and iteration logic. After each modification, it runs evaluations and decides whether to keep the change. According to the preprint, this two-level approach reduces run-to-run variance relative to both hand-designed baselines and prompt-only optimization. The discovered pipelines are not general-purpose: each is tailored to the welfare objective it was optimized for, which suggests that the researcher agent encodes different design principles depending on whether the goal is efficiency or fairness.
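The modify, evaluate, keep-or-discard cycle described above can be sketched as a greedy search. This is an illustrative toy, not the preprint's implementation: the `Pipeline` type, `outer_loop`, `toy_propose_edit`, and `toy_evaluate` are all hypothetical names, the pipeline is reduced to a single numeric knob, and averaging scores over several seeds is an assumption about how run-to-run noise might be handled.

```python
import random
from typing import Callable, Dict, Sequence, Tuple

# Hypothetical stand-in for a pipeline; the real one would hold prompts,
# feedback functions, helper libraries, and iteration logic.
Pipeline = Dict[str, float]


def outer_loop(
    pipeline: Pipeline,
    propose_edit: Callable[[Pipeline], Pipeline],
    evaluate: Callable[[Pipeline, int], float],
    seeds: Sequence[int],
    steps: int,
) -> Tuple[Pipeline, float]:
    """Greedy keep-or-discard search over pipeline edits."""

    def mean_score(p: Pipeline) -> float:
        # Average over seeds so one lucky run does not decide the outcome.
        return sum(evaluate(p, s) for s in seeds) / len(seeds)

    best, best_score = dict(pipeline), mean_score(pipeline)
    for _ in range(steps):
        candidate = propose_edit(best)
        candidate_score = mean_score(candidate)
        if candidate_score > best_score:  # keep only strict improvements
            best, best_score = candidate, candidate_score
    return best, best_score


rng = random.Random(0)


def toy_propose_edit(p: Pipeline) -> Pipeline:
    # Toy edit: nudge the single knob; a coding agent would edit source files.
    return {"knob": p["knob"] + rng.uniform(-1.0, 1.0)}


def toy_evaluate(p: Pipeline, seed: int) -> float:
    # Toy welfare: peaks at knob == 3.0, plus small seed-dependent noise.
    noise = random.Random(seed).uniform(-0.1, 0.1)
    return -(p["knob"] - 3.0) ** 2 + noise


seeds = [0, 1, 2]
start = {"knob": 0.0}
initial_score = sum(toy_evaluate(start, s) for s in seeds) / len(seeds)
final_pipeline, final_score = outer_loop(
    start, toy_propose_edit, toy_evaluate, seeds, steps=50
)
```

Swapping the welfare function inside `evaluate` is the only change needed to retarget the search, which is one way to picture how different objectives could lead the same outer loop to different pipelines.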
Code and full methodology are available on GitHub. The preprint (arXiv:2605.30003) was posted in May 2026.


