AI systems with emotional intelligence could learn faster and be more helpful
Click here to read the full article on IEEE Spectrum.
AI systems that can predict and respond to human emotions are one thing, but what if an AI system could actually experience something akin to human emotions? If an agent were motivated by fear, curiosity, or delight, how would that change the technology and its capabilities? To explore this idea, we trained agents that had the basic emotional drives of fear and happy curiosity.
With this work, we’re trying to address a few problems in a field of AI called reinforcement learning, in which an AI agent learns how to do a task by relentless trial and error. Over millions of attempts, the agent figures out the best actions and strategies to use, and if it successfully completes its mission, it earns a reward. Reinforcement learning has been used to train AI agents to beat humans at the board game Go, the video game StarCraft II, and a type of poker known as Texas Hold’em.
While this type of machine learning works well with games, where winning offers a clear reward, it’s harder to apply in the real world. Consider the challenge of training a self-driving car, for example. If the reward is getting safely to the destination, the AI will spend a lot of time crashing into things as it tries different strategies, and will only rarely succeed. That’s the problem of sparse external rewards. It might also take a while for the AI to figure out which specific actions are most important—is it stopping for a red light or speeding up on an empty street? Because the reward comes only at the end of a long sequence of actions, researchers call this the credit-assignment problem.
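To make the credit-assignment problem concrete, here is a minimal sketch in Python. It assumes the common practice of discounting, in which a reward that arrives at the end of an episode is spread backward over earlier steps; the article does not say which method was used, and the function name and the `gamma` parameter are illustrative only.

```python
def discounted_returns(rewards, gamma=0.99):
    """Return the discounted sum of future rewards for each time step.

    `rewards` holds one reward per step; `gamma` (an assumed discount
    factor) controls how quickly credit fades with distance from the reward.
    """
    returns = []
    running = 0.0
    # Walk backward so each step can reuse the return of the step after it.
    for reward in reversed(rewards):
        running = reward + gamma * running
        returns.append(running)
    returns.reverse()
    return returns


# A sparse episode: four steps with no feedback, then success at the end.
sparse_episode = [0.0, 0.0, 0.0, 0.0, 1.0]
credit = discounted_returns(sparse_episode, gamma=0.5)
print(credit)  # [0.0625, 0.125, 0.25, 0.5, 1.0]
```

In this sketch, each earlier step receives credit based only on how far it is from the final reward, not on whether it actually mattered. That is why an agent with sparse rewards needs many episodes to work out which actions were the important ones.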
Now think about how a human behaves while driving. Reaching the destination safely is still the goal, but the person gets a lot of feedback along the way. In a stressful situation, such as speeding down the highway during a rainstorm, the person might feel his heart thumping faster in his chest as adrenaline and cortisol course through his body. These changes are part of the person’s fight-or-flight response, which influences decision making. The driver doesn’t have to actually crash into something to feel the difference between a safe maneuver and a risky move. And when he exits the highway and his pulse slows, there’s a clear correlation between the event and the response.
We wanted to capture those correlations and create an AI agent that in some sense experiences fear. So we asked people to steer a car through a maze in a simulated environment, measured their physiological responses in both calm and stressful moments, then used that data to train an AI driving agent. We programmed the agent to receive an extrinsic reward for exploring a good percentage of the maze, and also an intrinsic reward for minimizing the emotional state associated with dangerous situations.
We found that combining these two rewards created agents that learned much faster than agents that received only the typical extrinsic reward. These agents also crashed less often. What we found particularly interesting, though, is that an agent motivated primarily by the intrinsic reward didn’t perform very well: If we dialed down the extrinsic reward, the agent became so risk averse that it didn’t try very hard to accomplish its objective.
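One way to picture the trade-off between the two rewards is a weighted blend. The sketch below is an illustration under stated assumptions, not the implementation used in this work: the linear weighting, the `weight` parameter, the arousal scale from 0 to 1, and the two toy actions with their numbers are all hypothetical.

```python
def combined_reward(extrinsic, arousal, weight=0.5):
    """Blend a task reward with an intrinsic penalty for predicted arousal.

    `extrinsic` is the task reward (for example, new maze area covered).
    `arousal` is an assumed 0-to-1 estimate of the stress a human would feel.
    `weight` is the assumed share given to the extrinsic reward.
    """
    if not 0.0 <= weight <= 1.0:
        raise ValueError("weight must be between 0 and 1")
    intrinsic = -arousal  # calmer situations earn a higher intrinsic reward
    return weight * extrinsic + (1.0 - weight) * intrinsic


# Hypothetical choices: (extrinsic reward, predicted arousal) for each action.
ACTIONS = {
    "advance": (1.0, 0.8),  # explores new ground but feels risky
    "stay": (0.0, 0.1),     # safe, but accomplishes nothing
}


def best_action(weight):
    """Pick the action with the highest blended reward for a given weight."""
    return max(ACTIONS, key=lambda name: combined_reward(*ACTIONS[name], weight))


print(best_action(0.5))   # advance
print(best_action(0.05))  # stay
```

With a balanced weight, the toy agent still prefers to explore. When the extrinsic share is dialed down to 0.05, the arousal penalty dominates and the agent prefers to stay put, which mirrors the risk-averse behavior described above.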
During another effort to build intrinsic motivation into an AI agent, we thought about human curiosity and how people are driven to explore because they think they may discover things that make them feel good. In related AI research, other groups have captured something akin to basic curiosity, rewarding agents for seeking novelty as they explore a simulated environment. But we wanted to create a choosier agent that sought out not just novelty but novelty that was likely to make it “happy.”
To gather training data for such an agent, we asked people to drive a virtual car within a simulated maze of streets, telling them to explore but giving them no other objectives. As they drove, we used facial-expression analysis to track smiles that flitted across their faces as they navigated successfully through tricky parts or unexpectedly found the exit of the maze. We used that data as the basis for the intrinsic reward function, meaning that the agent was taught to seek out situations that would make a human smile. The agent received the extrinsic reward by covering as much territory as possible.
Again, we found that agents that incorporated intrinsic drive did better than typically trained agents—they drove in the maze for a longer period before crashing into a wall, and they explored more territory. We also found that such agents performed better on related visual-processing tasks, such as estimating depth in a 3D image and segmenting a scene into component parts.
We’re at the very beginning of mimicking human emotions in silico, and there will doubtless be philosophical debate over what it means for a machine to be able to imitate the emotional states associated with happiness or fear. But we think such approaches may not only make for more efficient learning, they may also give AI systems the crucial ability to generalize.
Today’s AI systems are typically trained to carry out a single task, one that they might get very good at, yet they can’t transfer their painstakingly acquired skills to any other domain. But human beings use their emotions to help navigate new situations every day; that’s what people mean when they talk about using their gut instincts.
We want to give AI systems similar abilities. If AI systems are driven by humanlike emotion, might they more closely approximate humanlike intelligence? Perhaps simulated emotions could spur AI systems to achieve much more than they would otherwise. We’re certainly curious to explore this question—in part because we know our discoveries will make us smile.
Credits: IEEE Spectrum and Microsoft Research