A new study from New York University and the University of Tübingen, led by Hanna M. Dettki, Brenden M. Lake, Charley M. Wu, and Bob Rehder, asks whether AI can reason about causes as humans do, or whether it merely relies on patterns instead. Their paper, “Do Large Language Models Reason Causally Like Us? Even Better?”, probes four popular models (GPT-3.5, GPT-4o, Claude-3, and Gemini-Pro) to see whether they grasp complex causal structures or merely mimic human language.
How the study tested causal reasoning in AI

The researchers compared human reasoning with four LLMs (GPT-3.5, GPT-4o, Claude-3, and Gemini-Pro) using collider graphs, a classic test in causal inference. Participants, both human and AI, were asked to evaluate the likelihood of an event given certain causal relationships. The core question: do LLMs reason causally in the same way humans do, or do they follow a different logic?
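To make the setup concrete, here is a minimal sketch of the kind of judgment involved: a collider graph in which two independent causes feed into a single effect. The parameters below (base rates, noisy-OR causal strengths, and a leak term) are hypothetical illustrations, not the study's actual materials; the sketch simply shows how one likelihood judgment, P(E=1 | C1=1), follows from the graph by enumerating the joint distribution.

```python
# Minimal sketch of a collider graph C1 -> E <- C2 with hypothetical
# noisy-OR parameters (illustrative numbers, not the study's stimuli).
from itertools import product

P_C1 = 0.3          # hypothetical base rate of cause 1
P_C2 = 0.3          # hypothetical base rate of cause 2
W1, W2 = 0.8, 0.8   # hypothetical causal strengths (noisy-OR weights)
LEAK = 0.1          # hypothetical background (leak) probability of the effect

def p_effect(c1: int, c2: int) -> float:
    """P(E=1 | C1=c1, C2=c2) under a noisy-OR parameterization."""
    return 1 - (1 - LEAK) * (1 - W1) ** c1 * (1 - W2) ** c2

def joint(c1: int, c2: int, e: int) -> float:
    """Joint probability P(C1=c1, C2=c2, E=e); the causes are independent."""
    pc1 = P_C1 if c1 else 1 - P_C1
    pc2 = P_C2 if c2 else 1 - P_C2
    pe = p_effect(c1, c2) if e else 1 - p_effect(c1, c2)
    return pc1 * pc2 * pe

# Example judgment: how likely is the effect given that cause 1 is present?
num = sum(joint(1, c2, 1) for c2 in (0, 1))                      # P(C1=1, E=1)
den = sum(joint(1, c2, e) for c2, e in product((0, 1), repeat=2))  # P(C1=1)
print(f"P(E=1 | C1=1) = {num / den:.3f}")
```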
Key findings: AI can reason, but not like humans

The results revealed a spectrum of causal reasoning among the AI models.
Interestingly, humans often apply heuristics that deviate from strict probability theory, such as the “explaining away” effect: once an effect is known to have occurred, confirming one of its causes reduces the inferred likelihood of the other. While the AI models recognized this effect, their responses varied significantly based on training data and context.
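For illustration, the sketch below (again with hypothetical parameters rather than the study's stimuli) shows the normative version of explaining away on the same kind of collider: after the effect is observed, learning that the alternative cause is present lowers the posterior probability of the first cause.

```python
# Minimal sketch of "explaining away" on a collider C1 -> E <- C2,
# with hypothetical noisy-OR parameters (illustrative only).
from itertools import product

P_C1, P_C2 = 0.3, 0.3          # hypothetical base rates of the two causes
W1, W2, LEAK = 0.8, 0.8, 0.1   # hypothetical causal strengths and leak term

def p_effect(c1: int, c2: int) -> float:
    """P(E=1 | C1=c1, C2=c2) under a noisy-OR parameterization."""
    return 1 - (1 - LEAK) * (1 - W1) ** c1 * (1 - W2) ** c2

def joint(c1: int, c2: int, e: int) -> float:
    """Joint probability P(C1=c1, C2=c2, E=e)."""
    pc1 = P_C1 if c1 else 1 - P_C1
    pc2 = P_C2 if c2 else 1 - P_C2
    pe = p_effect(c1, c2) if e else 1 - p_effect(c1, c2)
    return pc1 * pc2 * pe

def posterior_c1(evidence: dict) -> float:
    """P(C1=1 | evidence), where evidence maps 'c2'/'e' to observed values."""
    worlds = [dict(zip(("c1", "c2", "e"), w)) for w in product((0, 1), repeat=3)]
    consistent = [w for w in worlds if all(w[k] == v for k, v in evidence.items())]
    total = sum(joint(w["c1"], w["c2"], w["e"]) for w in consistent)
    c1_true = sum(joint(w["c1"], w["c2"], w["e"]) for w in consistent if w["c1"] == 1)
    return c1_true / total

print(f"P(C1=1 | E=1)       = {posterior_c1({'e': 1}):.3f}")
print(f"P(C1=1 | E=1, C2=1) = {posterior_c1({'e': 1, 'c2': 1}):.3f}  (lower: explained away)")
```

With these illustrative numbers, the posterior for C1 drops from roughly 0.54 to roughly 0.34 once the alternative cause is confirmed, which is the pattern the study probes in both humans and LLMs.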
AI vs. human reasoning: A fundamental difference

One of the most intriguing insights from the study is that LLMs don’t just mimic human reasoning; they approach causality differently. Unlike humans, whose judgments remained relatively stable across different contexts, the AI models adjusted their reasoning depending on domain knowledge (e.g., economics vs. sociology).
This suggests that while AI can be more precise in certain structured tasks, it lacks the flexibility of human thought when dealing with ambiguous or multi-causal situations.
Why this matters for AI in decision-making

The study reveals an important limitation: LLMs may not generalize causal knowledge beyond their training data without strong guidance. This has critical implications for deploying AI in real-world decision-making, from medical diagnoses to economic forecasting.
LLMs might outperform humans in probability-based inference, but their reasoning remains fundamentally different, often lacking the intuitive, adaptive logic humans use in everyday problem-solving.
In other words, AI can reason about causality—but not quite like us.