

Causal graphs reveal how LLMs organize concepts during inference

Researchers map LLM reasoning with causal graphs and counterfactual chains, revealing class-discriminative concept dependencies across diagnosis, sentiment, and judge tasks.

By Alex Sokoloff · June 8, 2026


A new preprint describes a method for building causal graphs that expose how large language models organize high-level concepts during inference. The four-phase pipeline discovers interpretable concepts from text examples, maps inputs to LLM-perceived concept states, applies MCMC-inspired counterfactual augmentation to stabilize causal discovery, and then runs causal discovery on the augmented data. The resulting graphs show which concepts the model treats as causes and which as effects when producing a prediction.

The authors tested the approach on three LLMs across disease diagnosis, sentiment analysis, and LLM-as-a-judge classification tasks. They evaluated the learned graphs on two criteria: predictive fidelity, meaning how well the graph reproduces the model's outputs, and structural stability under resampling. The counterfactual augmentation procedure expands sparse observational data by generating chains of counterfactuals. The paper shows that these chains converge and that they improve downstream causal discovery with the σ-CG algorithm.
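The article does not give the exact formulas behind the two evaluation criteria, so the following is only a rough sketch of how they could be scored. It assumes fidelity is the share of examples where a graph-based prediction matches the LLM's own label, and it uses mean pairwise Jaccard similarity between edge sets as a stand-in for structural stability. Both function names and both metric choices are illustrative assumptions, not the paper's definitions.

```python
from itertools import combinations
from typing import List, Sequence, Set, Tuple

Edge = Tuple[str, str]  # (cause, effect)


def predictive_fidelity(graph_preds: Sequence[str], llm_preds: Sequence[str]) -> float:
    """Share of examples where the graph-based prediction matches the LLM's label.

    Assumed definition: plain agreement rate.
    """
    if len(graph_preds) != len(llm_preds) or not llm_preds:
        raise ValueError("prediction lists must be non-empty and of equal length")
    matches = sum(g == m for g, m in zip(graph_preds, llm_preds))
    return matches / len(llm_preds)


def structural_stability(edge_sets: List[Set[Edge]]) -> float:
    """Mean pairwise Jaccard similarity of edge sets learned on resampled data.

    Assumed stand-in for stability under resampling; 1.0 means identical graphs.
    """
    pairs = list(combinations(edge_sets, 2))
    if not pairs:
        raise ValueError("need at least two edge sets to compare")

    def jaccard(a: Set[Edge], b: Set[Edge]) -> float:
        union = a | b
        return len(a & b) / len(union) if union else 1.0

    return sum(jaccard(a, b) for a, b in pairs) / len(pairs)


# Hypothetical toy data: four predictions and three resampled graphs.
fidelity = predictive_fidelity(["pos", "neg", "pos", "neg"], ["pos", "neg", "neg", "neg"])
stability = structural_stability([
    {("a", "b"), ("b", "c")},
    {("a", "b")},
    {("a", "b"), ("b", "c")},
])
```

In this toy run the graph agrees with the model on three of four examples, and the three resampled graphs share most but not all of their edges.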

Building the causal graph

The pipeline starts by prompting the target LLM to propose class-discriminative concepts for a given task. It then maps each input example to binary concept states as perceived by the LLM. Because observational data alone is too sparse for reliable causal discovery, the method generates counterfactual examples by flipping one concept at a time and asking the LLM to rewrite the input accordingly. An MCMC-inspired procedure chains these flips, producing a richer dataset that captures how concept changes propagate. The final step applies σ-CG, a causal discovery algorithm, to the augmented data, yielding a directed acyclic graph of concept dependencies.
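The chained flips described above can be sketched as a simple loop. This is a minimal illustration, not the authors' implementation: `counterfactual_chain`, `toy_perceive`, and `toy_rewrite` are hypothetical names, the two toy functions stand in for the LLM calls that map text to concept states and rewrite text, and choosing the concept to flip uniformly at random is an assumption.

```python
import random
from typing import Callable, Dict, List

ConceptState = Dict[str, int]  # concept name -> 0 or 1


def counterfactual_chain(
    text: str,
    perceive: Callable[[str], ConceptState],
    rewrite: Callable[[str, str, int], str],
    steps: int,
    rng: random.Random,
) -> List[ConceptState]:
    """Flip one concept per step and record the concept state after each rewrite."""
    state = perceive(text)
    samples = [dict(state)]
    for _ in range(steps):
        concept = rng.choice(sorted(state))       # assumed: uniform choice of concept
        text = rewrite(text, concept, 1 - state[concept])
        state = perceive(text)                    # re-read every concept, not only the flipped one
        samples.append(dict(state))
    return samples


# Toy stand-ins for the LLM (hypothetical): a concept is "present" if its word is in the text.
CONCEPTS = ["cough", "fever"]


def toy_perceive(text: str) -> ConceptState:
    words = set(text.split())
    return {c: int(c in words) for c in CONCEPTS}


def toy_rewrite(text: str, concept: str, value: int) -> str:
    words = [w for w in text.split() if w != concept]
    if value == 1:
        words.append(concept)
    return " ".join(words)


chain = counterfactual_chain(
    "patient has fever", toy_perceive, toy_rewrite, steps=5, rng=random.Random(0)
)
```

Re-reading the full concept state after each rewrite is the important design choice. With a real LLM, rewriting the input to flip one concept may change how other concepts are perceived, and that propagation is the signal the augmented dataset is meant to capture. In the toy version each step changes exactly one concept.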

Unlike prior work that uses LLMs to recover causal graphs of external-world processes, this approach treats the LLM itself as the system under study. The authors report that the discovered graphs align with expected reasoning patterns in each domain (disease diagnosis, sentiment classification, and LLM-as-a-judge tasks), providing a foundation for concept-level explainability. The paper was authored by Nirit Nussbaum-Hoffer, Nitay Calderon, Liat Ein-Dor, and Roi Reichart.

