Two-field framework unifies grokking, emergent capabilities, and prebiotic chemistry phase transitions
A new arXiv preprint proposes that deep learning phase transitions and non-equilibrium chemical selection share a common mathematical structure governed by entropy production and an information quasi-potential, and introduces two order parameters intended to yield falsifiable predictions.
A preprint posted to arXiv this week argues that abrupt learning transitions (grokking, emergent capabilities, sudden alignment shifts) and phase changes in prebiotic chemical networks share a single mathematical structure. The paper, "Phase Transitions in Driven Informational Systems," frames both as stochastic processes driven by two gradient fields: an entropy production rate Σ and an information quasi-potential Φ_I = -ln p, where p is the stationary probability density. The authors introduce two order parameters, an adversarial breakdown threshold α† and a self-referential coupling threshold κ_c, whose joint scaling defines a candidate universality class with exponents (γ₁, γ₂).
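To make the definition Φ_I = -ln p concrete, here is a minimal sketch that is not taken from the preprint. It simulates a one-dimensional Ornstein-Uhlenbeck process (an assumed toy system), estimates the stationary density p from samples with a histogram, and compares -ln p with the known closed form x²/(2D) + ½ ln(2πD). The function names, the diffusion constant D, and the bin count are illustrative choices.

```python
import numpy as np

def simulate_ou(n_walkers=5000, n_steps=1000, dt=0.01, D=0.5, seed=0):
    """Euler-Maruyama integration of dx = -x dt + sqrt(2 D) dW.

    Returns the final positions of all walkers, which approximate
    samples from the stationary density once n_steps * dt >> 1.
    """
    rng = np.random.default_rng(seed)
    x = np.zeros(n_walkers)
    noise_scale = np.sqrt(2.0 * D * dt)
    for _ in range(n_steps):
        x += -x * dt + noise_scale * rng.standard_normal(n_walkers)
    return x

def quasi_potential(samples, bins=30):
    """Histogram estimate of Phi_I = -ln p at bin centers (empty bins dropped)."""
    density, edges = np.histogram(samples, bins=bins, density=True)
    centers = 0.5 * (edges[:-1] + edges[1:])
    mask = density > 0
    return centers[mask], -np.log(density[mask])

D = 0.5  # assumed diffusion constant for the toy system
x = simulate_ou(D=D)
centers, phi_est = quasi_potential(x)

# Closed form for this toy case: the stationary density is Gaussian with variance D.
phi_exact = centers**2 / (2.0 * D) + 0.5 * np.log(2.0 * np.pi * D)

for c, est, exact in zip(centers, phi_est, phi_exact):
    print(f"x = {c:+.2f}   estimated Phi_I = {est:.3f}   exact = {exact:.3f}")
```

In this toy case the quasi-potential is simply the confining potential divided by D, up to an additive constant. The histogram estimate is least reliable in the tails, where few samples land, which hints at why extracting Φ_I from a high-dimensional training run would be much harder.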
The framework is explicitly designed to produce falsifiable predictions that single-field gradient accounts, meaning those that rely on loss alone or on representational compression metrics, cannot match. The paper claims consistency with 2024–2026 empirical findings on alignment transitions, adversarial breakdown scaling, and partial introspection in large language models, though its abstract does not name specific models or datasets. The two-field structure is borrowed from non-equilibrium statistical physics, where driven chemical reaction networks exhibit phase transitions that resist explanation by equilibrium thermodynamics or single-gradient dynamics.
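As background on the entropy production rate Σ, the sketch below computes its steady-state value for a small driven Markov network using the standard Schnakenberg formula from stochastic thermodynamics. It is a textbook illustration, not the preprint's construction: the three-state ring, the rates a and b, and the function names are assumptions made here.

```python
import numpy as np

def stationary_distribution(W):
    """Stationary distribution of a continuous-time Markov chain.

    W[i, j] is the transition rate from state i to state j (zero diagonal).
    """
    n = W.shape[0]
    Q = W - np.diag(W.sum(axis=1))          # generator matrix, rows sum to zero
    # Solve Q^T pi = 0, replacing one redundant equation with sum(pi) = 1.
    A = np.vstack([Q.T[:-1], np.ones(n)])
    b = np.zeros(n)
    b[-1] = 1.0
    return np.linalg.solve(A, b)

def entropy_production_rate(W):
    """Schnakenberg steady-state entropy production rate, in units of k_B per time.

    Sums (net flux) * ln(forward flux / backward flux) over each connected pair.
    """
    pi = stationary_distribution(W)
    n = W.shape[0]
    sigma = 0.0
    for i in range(n):
        for j in range(i + 1, n):
            if W[i, j] > 0 and W[j, i] > 0:
                forward = pi[i] * W[i, j]
                backward = pi[j] * W[j, i]
                sigma += (forward - backward) * np.log(forward / backward)
    return sigma

def ring_rates(a, b):
    """Hypothetical 3-state ring: clockwise rate a, counter-clockwise rate b."""
    W = np.zeros((3, 3))
    for i in range(3):
        W[i, (i + 1) % 3] = a
        W[i, (i - 1) % 3] = b
    return W

print("equilibrium ring (a = b):", entropy_production_rate(ring_rates(1.0, 1.0)))
print("driven ring (a = 2, b = 1):", entropy_production_rate(ring_rates(2.0, 1.0)))
```

With a = b the ring satisfies detailed balance and the rate vanishes. With a ≠ b a steady cycle current flows and the rate equals (a - b) ln(a/b), which is the sense in which a driven network sits outside equilibrium thermodynamics.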
The preprint outlines the geometric structure of the framework but stops short of closed-form solutions for the exponents or explicit recipes for measuring α† and κ_c in a trained network. It positions itself as a unifying lens rather than a finished theory, offering a candidate universality class that would, if validated, place deep learning phase transitions and prebiotic selection in the same mathematical family. The authors note that existing singular learning theory and information-theoretic progress measures are compatible with the two-field view but do not by themselves predict the joint scaling behavior.
What happens next depends on whether the order parameters can be extracted from real training runs and whether the predicted exponents hold across architectures. The paper does not yet provide code, benchmark tables, or a worked example on a public model, so the immediate test is whether other groups can operationalize α† and κ_c on GPT-scale checkpoints or on smaller grokking setups. If the joint scaling law survives contact with data, the framework could reshape how we think about sudden capability jumps and adversarial fragility; if it doesn't, it remains an elegant analogy between two fields that may not share deep structure after all.
