Pearl: Machine learning cannot solve causal inference from data alone
Judea Pearl argues that machine learning's data-only approaches face provable mathematical limits, citing causal inference problems that no amount of correlational data can solve.

Judea Pearl, the 2011 Turing Award recipient, argues that machine learning's dominant paradigms—learning everything from data and mimicking neural architecture—run into provable mathematical walls. In a 2024 interview, Pearl stated that certain causal questions cannot be answered by observing data alone, regardless of dataset size or model sophistication. "It's not a matter of opinion. It's a matter of mathematical proof," he said, pointing to his decades of work on causal inference.
Pearl's core example: no amount of observational data on aspirin use and headache frequency can establish whether aspirin causes or relieves headaches. The correlation is visible in the data, but the causal direction requires outside knowledge: a model of how the world works that data alone cannot supply. This distinction sits at the heart of Pearl's three-layer hierarchy: association (what the data shows), intervention (what happens when you act on the system), and counterfactuals (what would have happened under different conditions). Moving up that ladder requires assumptions and domain knowledge, which is exactly what the tabula rasa philosophy of learning everything from data rejects.
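The aspirin point can be made concrete with a small sketch. It is an illustration under stated assumptions, not Pearl's own formulation: the frequencies in JOINT are invented, and the two candidate models ("aspirin influences headache" and "headache influences aspirin use") are hypothetical stand-ins for the outside knowledge the data cannot supply. Both models reproduce the observed table exactly, yet they disagree about what happens when aspirin is assigned by intervention.

```python
from fractions import Fraction

# Hypothetical observed frequencies for (aspirin, headache); illustrative numbers only.
JOINT = {
    (1, 1): Fraction(4, 10),
    (1, 0): Fraction(1, 10),
    (0, 1): Fraction(1, 10),
    (0, 0): Fraction(4, 10),
}

def p_aspirin(a):
    """Marginal P(aspirin = a) read off the observed table."""
    return sum(p for (x, _), p in JOINT.items() if x == a)

def p_headache(h):
    """Marginal P(headache = h) read off the observed table."""
    return sum(p for (_, y), p in JOINT.items() if y == h)

def joint_if_aspirin_first():
    """Model A (aspirin -> headache): rebuild the table as P(a) * P(h | a)."""
    return {(a, h): p_aspirin(a) * (JOINT[(a, h)] / p_aspirin(a)) for (a, h) in JOINT}

def joint_if_headache_first():
    """Model B (headache -> aspirin): rebuild the table as P(h) * P(a | h)."""
    return {(a, h): p_headache(h) * (JOINT[(a, h)] / p_headache(h)) for (a, h) in JOINT}

def do_aspirin_model_a(a):
    """P(headache = 1 | do(aspirin = a)) under Model A.

    Intervening on the cause leaves the mechanism P(h | a) in place.
    """
    return JOINT[(a, 1)] / p_aspirin(a)

def do_aspirin_model_b(a):
    """P(headache = 1 | do(aspirin = a)) under Model B.

    Forcing aspirin cuts the arrow into it, so headache keeps its marginal.
    """
    return p_headache(1)

if __name__ == "__main__":
    # Both models fit the observational data perfectly...
    print(joint_if_aspirin_first() == JOINT, joint_if_headache_first() == JOINT)
    # ...but they give different answers to the interventional question.
    print(do_aspirin_model_a(1), do_aspirin_model_b(1))
```

With these made-up numbers, Model A predicts a headache rate of 4/5 when aspirin is assigned and Model B predicts 1/2, even though the two models are indistinguishable on the observed table. Collecting more rows from the same distribution would not change that.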
The machine learning community's resistance to these ideas, Pearl suggests, stems from two entrenched beliefs: that models should derive everything from data without prior knowledge, and that architectures should mimic biological neurons rather than encode symbolic rules. He notes that formal solutions to causal inference problems exist but remain underadopted, overshadowed by hype around scaling data and parameters. Pearl's critique does not dismiss neural networks outright—he acknowledges their pattern-matching power—but insists that certain reasoning tasks lie permanently out of reach for systems that refuse to encode causal structure.
Pearl's position is not new; his 2018 book The Book of Why made similar arguments to a general audience. What has changed is the scale of the gap: as foundation models grow to trillions of parameters and train on internet-scale corpora, the causal inference problems Pearl has highlighted for decades remain unsolved. The resurfacing of this interview in machine learning forums this week has reignited debate over whether the field's data-first orthodoxy is hitting a ceiling that more compute cannot raise.