The Good, the Bad, and the Irrelevant: Neural Mechanisms of Learning Real and Hypothetical Rewards and Effort.
Scholl J., Kolling N., Nelissen N., Wittmann MK., Harmer CJ., Rushworth MFS.
Natural environments are complex, and a single choice can lead to multiple outcomes. Agents should learn which outcomes are due to their choices and therefore relevant for future decisions and which are stochastic in ways common to all choices and therefore irrelevant for future decisions between options. We designed an experiment in which human participants learned the varying reward and effort magnitudes of two options and repeatedly chose between them. The reward associated with a choice was randomly real or hypothetical (i.e., participants only sometimes received the reward magnitude associated with the chosen option). The real/hypothetical nature of the reward on any one trial was, however, irrelevant for learning the longer-term values of the choices, and participants ought to have focused only on the informational content of the outcome and disregarded whether it was a real or hypothetical reward. However, we found that participants showed an irrational choice bias, preferring choices that had led, by chance, to a real reward on the previous trial. Amygdala and ventromedial prefrontal activity was related to the way in which participants' choices were biased by real reward receipt. By contrast, activity in dorsal anterior cingulate cortex, frontal operculum/anterior insula, and especially lateral anterior prefrontal cortex was related to the degree to which participants resisted this bias and chose effectively in a manner guided by aspects of outcomes that had real and more sustained relationships with particular choices, suppressing irrelevant reward information for more optimal learning and decision making.
SIGNIFICANCE STATEMENT: In complex natural environments, a single choice can lead to multiple outcomes. Human agents should only learn from outcomes that are due to their choices, not from outcomes without such a relationship. We designed an experiment to measure learning about reward and effort magnitudes in an environment in which other features of the outcome were random and had no relationship with choice. We found that, although people could learn about reward magnitudes, they nevertheless were irrationally biased toward repeating certain choices as a function of the presence or absence of random reward features. Activity in different brain regions in the prefrontal cortex either reflected the bias or reflected resistance to the bias.
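To make the distinction between relevant and irrelevant outcome features concrete, the following is a minimal illustrative sketch, not the model used in the study. It assumes a simple delta-rule learner for magnitudes and a logistic choice rule in which utility is reward minus effort; the function names, the parameters (`learning_rate`, `real_bias`, `inverse_temp`), and the +1/-1 coding of real versus hypothetical reward are all hypothetical choices made for illustration. An unbiased chooser corresponds to `real_bias = 0`.

```python
import math


def update_value(value, outcome, learning_rate=0.3):
    """Delta-rule update: move the estimate toward the observed magnitude.

    The shown magnitude is informative whether the reward was real or
    hypothetical, so this update does not depend on reward receipt.
    """
    return value + learning_rate * (outcome - value)


def p_choose_a(reward_a, reward_b, effort_a, effort_b,
               last_choice=None, last_was_real=False,
               real_bias=0.0, inverse_temp=1.0):
    """Probability of choosing option A (logistic choice rule).

    Assumption: utility is learned reward minus learned effort.
    Assumption: real_bias > 0 pushes toward repeating the last choice
    after a real reward and away from it after a hypothetical one.
    """
    utility_diff = (reward_a - effort_a) - (reward_b - effort_b)
    if last_choice is not None:
        toward_a = 1.0 if last_choice == "A" else -1.0
        receipt = 1.0 if last_was_real else -1.0
        utility_diff += real_bias * toward_a * receipt
    return 1.0 / (1.0 + math.exp(-inverse_temp * utility_diff))


if __name__ == "__main__":
    # Learned magnitudes are identical for both options in this example.
    unbiased = p_choose_a(0.5, 0.5, 0.2, 0.2,
                          last_choice="A", last_was_real=True, real_bias=0.0)
    biased = p_choose_a(0.5, 0.5, 0.2, 0.2,
                        last_choice="A", last_was_real=True, real_bias=1.0)
    print(f"unbiased P(A) = {unbiased:.2f}, biased P(A) = {biased:.2f}")
```

With equal learned magnitudes, the unbiased chooser is indifferent between the options, whereas a positive `real_bias` favors repeating the option that happened to yield a real reward, which is the kind of irrelevant influence the abstract describes.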