PUTTING THE HUMAN INTO THE AI EQUATION

Anca Dragan

Anca Dragan is an assistant professor in the Department of Electrical Engineering and Computer Sciences at UC Berkeley. She co-founded and serves on the steering committee for the Berkeley AI Research (BAIR) Lab and is a co-principal investigator in Berkeley’s Center for Human-Compatible AI.

At the core of artificial intelligence is our mathematical definition of what an AI agent (a robot) is. When we define a robot, we define states, actions, and rewards. Think of a delivery robot, for instance. States are locations in the world, and actions are motions that the robot makes to get from one position to a nearby one. To enable the robot to decide which actions to take, we define a reward function (a mapping from states and actions to scores indicating how good that action was in that state) and have the robot choose actions that accumulate the most “reward.” The robot gets a high reward when it reaches its destination, and it incurs a small cost every time it moves; this reward function incentivizes the robot to get to the destination as quickly as possible. Similarly, an autonomous car might get a reward for making progress on its route and incur a cost for getting too close to other cars.

Given these definitions, a robot’s job is to figure out what actions it should take in order to get the highest cumulative reward. We’ve been working hard in AI on enabling robots to do just that. Implicitly, we’ve assumed that if we’re successful, that is, if robots can take any problem definition and turn it into a policy for how to act, we will get robots that are useful to people and to society.

We haven’t been too wrong so far. If you want an AI that classifies cells as either cancerous or benign, or a robot that vacuums the living room rug while you’re at work, we’ve got you covered. Some real-world problems can indeed be defined in isolation, with clear-cut states, actions, and rewards. But with increasing AI capability, the problems