Neuron Auctions: researchers propose bidding on LLM internal activations for ads
A new arXiv preprint introduces a mechanism that lets advertisers bid on specific neurons inside language models, so that brand mentions are embedded without disrupting conversation flow. The approach is aimed at the ad-monetization problem in chatbots.
Researchers have proposed Neuron Auctions, a mechanism that lets advertisers bid on internal model activations rather than surface-level text slots when placing ads in LLM-generated conversations. The preprint, posted May 12 on arXiv, argues that existing approaches—prompt injection or fixed position slots—break semantic coherence and lack the fine-grained control needed for rigorous auction design.
The authors identify brand-specific neurons in feed-forward network layers and show that competing brands activate in approximately orthogonal subspaces. That near-independence lets them define continuous intervention budgets—neuron counts and amplification factors—as auctionable commodities. The resulting menu-based auction is strategy-proof by construction and optimizes platform revenue while penalizing interventions that degrade user experience.
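
To make the idea of an intervention budget concrete, here is a minimal sketch of what amplifying brand neurons could look like on a single activation vector. It is an illustration under stated assumptions, not the preprint's code: the function names, the toy 8-neuron layer, and the hand-picked neuron indices are all hypothetical, and the brand neurons are assumed to have been identified and ranked beforehand.

```python
import numpy as np

def amplify_brand_neurons(activations, brand_neurons, k, alpha):
    """Scale the top-k brand neurons of one FFN activation vector by alpha.

    All names are hypothetical: `brand_neurons` is assumed to be a list of
    neuron indices already ranked by brand relevance, `k` is the neuron-count
    budget and `alpha` the amplification factor.
    """
    out = np.array(activations, dtype=float)   # copy; the input is left untouched
    chosen = np.asarray(brand_neurons[:k], dtype=int)
    out[chosen] *= alpha
    return out

def neuron_mask(indices, width):
    """Indicator vector over a layer of `width` neurons."""
    mask = np.zeros(width)
    mask[np.asarray(indices, dtype=int)] = 1.0
    return mask

def cosine(u, v):
    """Cosine similarity between two non-zero vectors."""
    return float(np.dot(u, v) / (np.linalg.norm(u) * np.linalg.norm(v)))

# Toy layer with 8 neurons and two brands whose neuron sets do not overlap.
width = 8
brand_a, brand_b = [0, 1, 2], [5, 6, 7]
acts = np.ones(width)

# Disjoint sets give exactly orthogonal masks; the preprint reports only
# approximate orthogonality for real models.
overlap = cosine(neuron_mask(brand_a, width), neuron_mask(brand_b, width))

# With no shared neurons, the two interventions commute.
a_then_b = amplify_brand_neurons(
    amplify_brand_neurons(acts, brand_a, 2, 3.0), brand_b, 3, 1.5)
b_then_a = amplify_brand_neurons(
    amplify_brand_neurons(acts, brand_b, 3, 1.5), brand_a, 2, 3.0)
print(overlap, np.allclose(a_then_b, b_then_a))
```

In this toy setup each advertiser's budget is just the pair (k, alpha), and because the neuron sets are disjoint, one brand's intervention leaves the other's neurons unchanged. In a real model the separation would only be approximate, so some cross-brand leakage should be expected.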
What stands out
- 01 Targets internal representations, not text slots. Instead of bidding on "position 3 in the output," advertisers bid on how many brand-specific neurons to amplify and by how much. The model's natural generation process weaves the brand mention into the response.
- 02 Exploits orthogonal brand subspaces. Mechanistic interpretability reveals that different brands activate distinct, nearly orthogonal neuron clusters in FFN layers. This independence allows simultaneous, disentangled interventions without cross-brand interference.
- 03 Guarantees strategy-proofness. The continuous menu auction ensures truthful bidding: advertisers have no incentive to misreport their valuations, a property that rigid slot auctions struggle to achieve.
- 04 Prices in user experience. The platform's objective function includes an explicit user-utility penalty, dynamically pricing out overly aggressive neuron amplifications that would harm conversation quality.
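
The last two points can be illustrated with a toy posted-price menu. This is a sketch under assumptions, not the preprint's mechanism: the discrete menu, the exposure proxy `k * (alpha - 1)`, the linear-plus-quadratic price, and the names `base_rate` and `ux_penalty` are all invented for illustration. What it does show is the standard reason a menu can be truthful: prices are fixed before any bid is seen, and the platform simply hands each advertiser the option that is best for its reported value, so misreporting can only yield a worse option.

```python
from dataclasses import dataclass

@dataclass(frozen=True)
class MenuOption:
    neurons: int    # neuron-count budget k
    alpha: float    # amplification factor
    price: float    # posted price, independent of any advertiser's report

def exposure(opt):
    # Hypothetical exposure proxy: extra activation mass injected.
    return opt.neurons * (opt.alpha - 1.0)

def build_menu(base_rate, ux_penalty):
    # Assumed price form: linear charge plus a quadratic user-experience penalty,
    # so aggressive interventions get expensive quickly.
    menu = [MenuOption(0, 1.0, 0.0)]  # opting out is always available at no cost
    for k in (10, 50, 100):
        for alpha in (1.5, 2.0, 3.0):
            e = k * (alpha - 1.0)
            menu.append(MenuOption(k, alpha, base_rate * e + ux_penalty * e ** 2))
    return menu

def allocate(menu, reported_value):
    # The platform picks the option that maximizes the advertiser's utility
    # as implied by its reported value per unit of exposure.
    return max(menu, key=lambda o: reported_value * exposure(o) - o.price)

def utility(opt, true_value):
    return true_value * exposure(opt) - opt.price

menu = build_menu(base_rate=1.0, ux_penalty=0.01)
true_value = 2.0
truthful = allocate(menu, true_value)

# No misreport does better than the truth under this menu.
misreports = [0.0, 0.5, 1.0, 1.5, 2.5, 3.0, 5.0, 10.0]
best_lie = max(utility(allocate(menu, r), true_value) for r in misreports)
print(truthful, utility(truthful, true_value) >= best_lie)

# A steep user-experience penalty prices every intervention out.
harsh = allocate(build_menu(base_rate=1.0, ux_penalty=1.0), true_value)
print(harsh.neurons)
```

The sketch handles one advertiser at a time, which leans on the near-orthogonality claim above to treat brands independently. It also makes no attempt to choose prices that maximize platform revenue, which is the part the preprint's auction design is meant to handle.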
