STaR-GATE: Teaching Language Models to Ask Clarifying Questions

When prompting language models to complete a task, users often leave important aspects unsaid. While asking questions could resolve this ambiguity (GATE; Li et al., 2023), models often struggle to ask good questions. We explore a language model's …

Anticipating the risks and benefits of counterfactual world simulation models

This paper examines the transformative potential of Counterfactual World Simulation Models (CWSMs). CWSMs use pieces of multi-modal evidence, such as CCTV footage or sound recordings of a road accident, to build a high-fidelity 3D reconstruction …

Off The Rails: Procedural Dilemma Generation for Moral Reasoning

As AI systems like language models are increasingly integrated into making decisions that affect people, it's critical to ensure that these systems have sound moral reasoning. To test whether they do, we need to develop systematic evaluations. Recent …

Social Contract AI: Aligning AI Assistants with Implicit Group Norms

We explore the idea of aligning an AI assistant by inverting a model of users' (unknown) preferences from observed interactions. To validate our proposal, we run proof-of-concept simulations in the economic ultimatum game, formalizing user …

MoCa: Measuring human-language model alignment on causal and moral judgment tasks

Human commonsense understanding of the physical and social world is organized around intuitive theories. These theories support making causal and moral judgments. When something bad happens, we naturally ask: who did what, and why? A rich literature …

Realism of Visual, Auditory, and Haptic Cues in Phenomenal Causality

Interacting in real environments, such as manipulating objects, involves multisensory information. However, little is known about how multisensory cue characteristics help us determine what has occurred in a scene, including whether two events were …

Understanding Social Reasoning in Language Models with Language Models

As Large Language Models (LLMs) become increasingly integrated into our everyday lives, understanding their ability to comprehend human mental states becomes critical for ensuring effective interactions. However, despite the recent attempts to assess …

A computational model of responsibility judgments from counterfactual simulations and intention inferences

How responsible someone is for an outcome depends on both the causal role of their actions and what those actions reveal about their moral character. Prior work has successfully modeled people's causal attributions and mental state inferences using …

Father, don't forgive them, for they could have known what they're doing

What someone knew matters for how we hold them responsible. In three studies, we explore people's responsibility judgments for negative outcomes to knowledgeable versus ignorant agents. We manipulate whether agents arrived at their knowledge state …

Show and tell: Learning causal structures from observations and explanations

There are at least three ways of learning how the world works: learning from observations, from interventions, and from explanations. Prior work on causal inference focused on how people learn causal structures through observation and intervention. …