Tobias Gerstenberg

Education

  • Assistant Professor in Psychology, 2018-present

    Stanford University

  • Postdoc, 2013-2018

    Massachusetts Institute of Technology

  • PhD in Cognitive Science, 2013

    University College London

  • MSc in Cognitive Science, 2008

    University College London

I’m the PI of the Causality in Cognition Lab (CiCL). You can see me in action here.

Research interests

Here are some of the things I’m interested in:

You can find out more about what we do in the CiCL, what we value, and how to join us here. You can also take a look at my research statement.

Contact

Email: gerstenberg@stanford.edu
Office: Room 302, Building 420

Teaching

Publications

(2024). Imagining and building wise machines: The centrality of AI metacognition. arXiv.

Preprint

(2024). From Artifacts to Human Lives: Investigating the Domain-Generality of Judgments about Purposes. Journal of Experimental Psychology: General.

Preprint

(2024). Self-supervised alignment with mutual information: Learning to follow principles without preference labels. Advances in Neural Information Processing Systems.

Preprint PDF Github

(2024). MARPLE: A Benchmark for Long-Horizon Inference. Advances in Neural Information Processing Systems.

Preprint PDF Project website

(2024). Causation, Meaning, and Communication. PsyArXiv.

Preprint Github

(2024). Human-like Affective Cognition in Foundation Models. arXiv.

Preprint PDF

(2024). Whodunnit? Inferring what happened from multimodal evidence. Proceedings of the 46th Annual Conference of the Cognitive Science Society.

Preprint PDF Poster Github

(2024). Towards a computational model of responsibility judgments in sequential human-AI collaboration. Proceedings of the 46th Annual Conference of the Cognitive Science Society.

Preprint PDF Github

(2024). Without his cookies, he's just a monster: A counterfactual simulation model of social explanation. Proceedings of the 46th Annual Conference of the Cognitive Science Society.

Preprint PDF Poster Github

(2024). Chain versus common cause: Biased causal strength judgments in humans and large language models. Proceedings of the 46th Annual Conference of the Cognitive Science Society.

PDF Poster OSF

(2024). Do as I explain: Explanations communicate optimal interventions. Proceedings of the 46th Annual Conference of the Cognitive Science Society.

Preprint PDF Github

(2024). Biased causal strength judgments in humans and large language models. ICLR 2024 Workshop on Representational Alignment.

PDF Poster

(2024). Resource-rational moral judgment. Proceedings of the 46th Annual Conference of the Cognitive Science Society.

PDF Poster

(2024). Procedural dilemma generation for evaluating moral reasoning in humans and language models. Proceedings of the 46th Annual Conference of the Cognitive Science Society.

Preprint PDF Poster Github

(2024). STaR-GATE: Teaching Language Models to Ask Clarifying Questions. Conference on Language Modeling (COLM).

Preprint PDF Github

(2023). Anticipating the risks and benefits of counterfactual world simulation models. AI Meets Moral Philosophy and Moral Psychology Workshop (NeurIPS 2023).

Preprint PDF

(2023). Off The Rails: Procedural Dilemma Generation for Moral Reasoning. AI Meets Moral Philosophy and Moral Psychology Workshop (NeurIPS 2023).

PDF Github Project website

(2023). Social Contract AI: Aligning AI Assistants with Implicit Group Norms. Socially Responsible Language Modelling Research Workshop (NeurIPS 2023).
Oral at NeurIPS Workshop

Preprint PDF Github

(2023). MoCa: Measuring human-language model alignment on causal and moral judgment tasks. Advances in Neural Information Processing Systems.

Preprint PDF Github Project website

(2023). Probabilistic programs as a unifying language of thought. Reverse engineering the mind: The Bayesian approach to cognitive science.

PDF

(2023). Realism of Visual, Auditory, and Haptic Cues in Phenomenal Causality. IEEE World Haptics Conference (WHC).
Best Work-in-Progress Award

PDF Github

(2023). Understanding Social Reasoning in Language Models with Language Models. Advances in Neural Information Processing Systems.
Spotlight at NeurIPS 2023 Datasets and Benchmarks Track

Preprint PDF Github Project website

(2023). Making a positive difference: Criticality in groups. Cognition.

Preprint PDF Link Github

(2023). A computational model of responsibility judgments from counterfactual simulations and intention inferences. Proceedings of the 45th Annual Conference of the Cognitive Science Society.

Preprint PDF Poster Github

(2023). Father, don't forgive them, for they could have known what they're doing. Proceedings of the 45th Annual Conference of the Cognitive Science Society.

Preprint PDF Github

(2023). Show and tell: Learning causal structures from observations and explanations. Proceedings of the 45th Annual Conference of the Cognitive Science Society.

Preprint PDF Poster Github

(2023). You are what you're for: Essentialist categorization in large language models. Proceedings of the 45th Annual Conference of the Cognitive Science Society.

Preprint PDF Poster Github

(2023). A Semantics for Causing, Enabling, and Preventing Verbs Using Structural Causal Models. Proceedings of the 45th Annual Conference of the Cognitive Science Society.

PDF Poster Github

(2023). Causal Reasoning Across Agents and Objects. Proceedings of the 45th Annual Conference of the Cognitive Science Society.

PDF Poster Github

(2023). Learning what matters: Causal abstraction in human inference. Proceedings of the 45th Annual Conference of the Cognitive Science Society.

Preprint PDF Github

(2023). Teleology and generics. Proceedings of the 45th Annual Conference of the Cognitive Science Society.

Preprint PDF Poster Github

(2023). Mental Jenga: A counterfactual simulation model of causal judgments about physical support. Journal of Experimental Psychology: General.

Preprint PDF Link Github

(2023). Active causal structure learning in continuous time. Cognitive Psychology.

Preprint PDF Link Github

(2023). Explanations can reduce overreliance on AI systems during decision-making. Proceedings of the ACM on Human-Computer Interaction.
Best Paper Honorable Mention at CSCW 2023

Preprint PDF Press: HAI blog

(2023). Probabilistic models of physical reasoning. Reverse engineering the mind: The Bayesian approach to cognitive science.

PDF

(2022). What would have happened? Counterfactuals, hypotheticals, and causal judgments. Philosophical Transactions of the Royal Society B: Biological Sciences.

Preprint PDF Video Link Github

(2022). Stop, children what's that sound? Multi-modal inference through mental simulation. Proceedings of the 44th Annual Conference of the Cognitive Science Society.

Preprint PDF Poster Github

(2022). Looking into the past: Eye-tracking mental simulation in physical inference. Proceedings of the 44th Annual Conference of the Cognitive Science Society.

Preprint PDF Poster Github

(2022). Do humans trust advice more if it comes from AI? An analysis of human-AI interactions. Proceedings of the 2022 AAAI/ACM Conference on AI, Ethics, and Society.

Preprint PDF Github

(2022). Inference from explanation. Journal of Experimental Psychology: General.

Preprint PDF Poster Video Github OSF

(2022). Uncalibrated models can improve human-AI collaboration. Advances in Neural Information Processing Systems.

Preprint PDF Press: Stanford

(2021). Moral dynamics: Grounding moral judgment in intuitive physics and intuitive psychology. Cognition.

Preprint PDF Link Github OSF

(2021). Predicting responsibility judgments from dispositional inferences and causal attributions. Cognitive Psychology.

Preprint PDF Link Github OSF

(2021). A counterfactual simulation model of causal judgments for physical events. Psychological Review.

Preprint PDF Link Github OSF Press: HAI News

(2021). Who went fishing? Inferences from social evaluations. Proceedings of the 43rd Annual Conference of the Cognitive Science Society.

Preprint PDF Poster Video Github OSF

(2021). The trajectory of counterfactual simulation in development. Developmental Psychology.

Preprint PDF Link Github OSF

(2020). Expectations affect physical causation judgments. Journal of Experimental Psychology: General.

Preprint PDF Link Github OSF

(2020). The language of causation. Proceedings of the 42nd Annual Conference of the Cognitive Science Society.

PDF Poster Video Github

(2020). Whom will Granny thank? Thinking about what could have been informs children's inferences about relative helpfulness. Proceedings of the 42nd Annual Conference of the Cognitive Science Society.

PDF Poster Video

(2020). Causal responsibility and robust causation. Frontiers in Psychology.

PDF Link Github

(2020). Moral values pervade implicit and explicit causal attribution: Evidence from basic language processing. Cognitive Science.

PDF Link Github

(2019). Explaining intuitive difficulty judgments by modeling physical effort and risk. Proceedings of the 41st Annual Conference of the Cognitive Science Society.

Preprint PDF

(2019). Intervening in time. Time and Causality Across the Sciences.

PDF Link

(2019). Quantitative causal selection patterns in token causation. PLoS ONE.

PDF Github

(2019). The trajectory of counterfactual simulation in development. Proceedings of the 41st Annual Conference of the Cognitive Science Society.

PDF OSF

(2018). Time in causal structure learning. Journal of Experimental Psychology: Learning, Memory, and Cognition.

PDF

(2018). Intuitive experimentation in the physical world. Cognitive Psychology.

Preprint PDF Github Press: Medium

(2018). Lucky or clever? From expectations to responsibility judgments. Cognition.

PDF Github OSF

(2018). What's fair? How children assign reward to members of teams with differing causal structures. Cognition.

PDF Github OSF

(2018). Judgments of actual causation approximate the effectiveness of interventions. PsyArXiv.

Preprint

(2018). Tiptoeing around it: Inference from absence in potentially offensive speech. Proceedings of the 40th Annual Conference of the Cognitive Science Society.

PDF

(2018). What happened? Reconstructing the past from vision and sound. Proceedings of the 40th Annual Conference of the Cognitive Science Society.

Preprint PDF Poster Github

(2017). Eye-tracking causality. Psychological Science.

Preprint PDF Link Github OSF Press: MIT News Press: Seeker Press: MedicalResearch.com

(2017). Causal learning from interventions and dynamics in continuous time. Proceedings of the 39th Annual Conference of the Cognitive Science Society.

PDF

(2017). Causation in legal and moral reasoning. Oxford Handbook of Causal Reasoning.

PDF Link

(2017). Faulty towers: A hypothetical simulation model of physical support. Proceedings of the 39th Annual Conference of the Cognitive Science Society.

PDF Slides Github

(2017). Intuitive Theories. Oxford Handbook of Causal Reasoning.

PDF Link

(2017). Marbles in inaction: Counterfactual simulation and causation by omission. Proceedings of the 39th Annual Conference of the Cognitive Science Society.

PDF Slides

(2017). Physical problem solving: Joint planning with symbolic, geometric, and dynamic constraints. Proceedings of the 39th Annual Conference of the Cognitive Science Society.

PDF Poster Github

(2016). Implicit measurement of motivated causal attribution. Proceedings of the 38th Annual Conference of the Cognitive Science Society.

PDF

(2016). Natural science: Active learning in dynamic physical microworlds. Proceedings of the 38th Annual Conference of the Cognitive Science Society.

PDF

(2016). Plans, habits, and theory of mind. PLoS ONE.

PDF

(2016). Understanding "almost": Empirical and computational studies of near misses. Proceedings of the 38th Annual Conference of the Cognitive Science Society.

PDF Github

(2015). A difference-making framework for intuitive judgments of responsibility. Oxford Studies in Agency and Responsibility.

PDF Link

(2015). Causal conceptions in social explanation and moral evaluation: A historical tour. Perspectives on Psychological Science.

PDF

(2015). Causal superseding. Cognition.

PDF

(2015). Concepts in a probabilistic language of thought. The Conceptual Mind: New Directions in the Study of Concepts.

Preprint PDF Link

(2015). Go fishing! Responsibility judgments when cooperation breaks down. Proceedings of the 37th Annual Conference of the Cognitive Science Society.

PDF Poster

(2015). How, whether, why: Causal judgments as counterfactual contrasts. Proceedings of the 37th Annual Conference of the Cognitive Science Society.

PDF Dataset Poster

(2015). Inference of intention and permissibility in moral decision making. Proceedings of the 37th Annual Conference of the Cognitive Science Society.

PDF

(2015). Responsibility judgments in voting scenarios. Proceedings of the 37th Annual Conference of the Cognitive Science Society.

PDF Dataset

(2014). Attributing responsibility: Actual and counterfactual worlds. Oxford Studies in Experimental Philosophy.

PDF Link

(2014). Causal supersession. Proceedings of the 36th Annual Conference of the Cognitive Science Society.

Preprint PDF

(2014). From counterfactual simulation to causal judgment. Proceedings of the 36th Annual Conference of the Cognitive Science Society.

PDF Dataset Demo

(2014). The order of things: Inferring causal structure from temporal patterns. Proceedings of the 36th Annual Conference of the Cognitive Science Society.

PDF Demo

(2014). Wins above replacement: Responsibility attributions as counterfactual replacements. Proceedings of the 36th Annual Conference of the Cognitive Science Society.

PDF Dataset Poster Demo

(2013). Back on track: Backtracking in counterfactual reasoning. Proceedings of the 35th Annual Conference of the Cognitive Science Society.

PDF Dataset Poster Demo

(2013). Causal responsibility and counterfactuals. Cognitive Science.

PDF Dataset Demo

(2013). Making a difference: Responsibility, causality, and counterfactuals. Unpublished PhD thesis.

PDF

(2012). Finding fault: Counterfactuals and causality in group attributions. Cognition.

PDF Dataset Demo

(2012). Noisy Newtons: Unifying process and dependency accounts of causal attribution. Proceedings of the 34th Annual Conference of the Cognitive Science Society.

PDF Dataset Poster Demo

(2012). Ping Pong in Church: Productive use of concepts in human probabilistic inference. Proceedings of the 34th Annual Conference of the Cognitive Science Society.

PDF Dataset Poster Demo

(2012). Why blame Bob? Probabilistic generative models, counterfactual reasoning, and blame attribution. Proceedings of the 34th Annual Conference of the Cognitive Science Society.

PDF

(2011). Beyond outcomes: The influence of intentions and deception. Proceedings of the 33rd Annual Conference of the Cognitive Science Society.

PDF

(2011). Blame the skilled. Proceedings of the 33rd Annual Conference of the Cognitive Science Society.

PDF Dataset Poster Demo

(2011). Rational order effects in responsibility attributions. Proceedings of the 33rd Annual Conference of the Cognitive Science Society.

PDF Dataset Poster

(2010). Observing and Intervening: Rational and Heuristic Models of Causal Decision Making. Open Psychology Journal.

PDF

(2010). The dice are cast: The role of intended versus actual contributions in responsibility attribution. Proceedings of the 32nd Annual Conference of the Cognitive Science Society.

PDF Dataset Poster

(2009). The allocation of responsibility amongst multiple causes. Unpublished MSc thesis.

PDF