ThoughtTrace dataset pairs 17,058 conversation turns with user reasoning
Researchers released ThoughtTrace, a dataset linking 2,155 real conversations across 20 language models with users' self-reported thoughts, establishing a new data modality for studying cognitive dynamics in human-AI interaction.

ThoughtTrace pairs real-world multi-turn conversations with users' self-reported thoughts: their reasons for sending prompts and their reactions to responses. The dataset comprises 1,058 users, 2,155 conversations, 17,058 turns, and 10,174 thought annotations collected across 20 language models. Unlike existing conversational datasets that capture only what people say, ThoughtTrace also records what they think, providing a window into the cognitive dynamics behind human-AI interaction.

The researchers found that user thoughts are semantically distinct from the messages themselves and difficult for frontier LLMs to infer from context alone. Thoughts also vary in content and correlate with conversation stages, suggesting they carry information not present in the observable dialogue. This gap matters for alignment: if models cannot reliably infer what users actually want from what they type, training on message-response pairs alone leaves latent preferences unaddressed.

The dataset demonstrates practical utility in two ways. First, thoughts improve user-behavior prediction when provided as inference-time context. Second, thought-guided rewrites generate fine-grained alignment signals for training personalized assistants, optimizing not just on whether a response matches the user's next message but on whether it addresses the user's stated internal goal.

ThoughtTrace captures long-horizon, topically diverse interactions across multiple conversation stages, from initial queries through follow-ups and clarifications. The paper and dataset are available on Hugging Face.
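To make the first use case concrete, here is a minimal sketch of supplying thoughts as inference-time context for a next-message prediction prompt. It is an illustration only, not the authors' pipeline: the `Turn` record, its field names, the `build_prediction_prompt` helper, and the prompt wording are all hypothetical and do not reflect the dataset's actual schema.

```python
from dataclasses import dataclass
from typing import List, Optional


@dataclass
class Turn:
    """One conversation turn. Field names are hypothetical, not the dataset schema."""
    role: str                      # "user" or "assistant"
    text: str                      # the visible message
    thought: Optional[str] = None  # the user's self-reported thought, if annotated


def build_prediction_prompt(history: List[Turn], include_thoughts: bool) -> str:
    """Render a conversation as a prompt asking a model for the user's next message.

    When include_thoughts is True, each annotated thought is appended after the
    turn it belongs to, so the model sees it as additional context.
    """
    lines = []
    for turn in history:
        lines.append(f"{turn.role.upper()}: {turn.text}")
        if include_thoughts and turn.thought:
            lines.append(f"[user thought] {turn.thought}")
    lines.append("Predict the user's next message.")
    return "\n".join(lines)


if __name__ == "__main__":
    # Invented example conversation, for illustration only.
    history = [
        Turn("user", "Can you shorten this email?",
             thought="I want it to sound less formal, not just shorter."),
        Turn("assistant", "Here is a shorter version of your email."),
    ]
    print(build_prediction_prompt(history, include_thoughts=True))
```

Building the same prompt with `include_thoughts=False` gives a message-only baseline, so the two variants can be compared on how well a model predicts what the user does next.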