Oxford University Centre for Integrative Neuroimaging

Insula and striatum mediate the default bias

Humans are creatures of routine and habit. When faced with situations in which a default option is available, people show a consistent tendency to stick with the default. Why this occurs is unclear. To elucidate its neural basis, we used a novel gambling task in conjunction with functional magnetic resonance imaging. Behavioral results revealed that participants were more likely to choose the default card and felt enhanced emotional responses to outcomes after making the decision to switch. We show that increased tendency to switch away from the default during the decision phase was associated with decreased activity in the anterior insula; activation in this same area in reaction to "switching away from the default and losing" was positively related with experienced frustration. In contrast, decisions to choose the default engaged the ventral striatum, the same reward area as seen in winning. Our findings highlight aversive processes in the insula as underlying the default bias and suggest that choosing the default may be rewarding in itself. Copyright © 2010 the authors.

Differential encoding of losses and gains in the human striatum

Studies on human monetary prediction and decision making emphasize the role of the striatum in encoding prediction errors for financial reward. However, less is known about how the brain encodes financial loss. Using Pavlovian conditioning of visual cues to outcomes that simultaneously incorporate the chance of financial reward and loss, we show that striatal activation reflects positively signed prediction errors for both. Furthermore, we show functional segregation within the striatum, with more anterior regions showing relative selectivity for rewards and more posterior regions for losses. These findings mirror the anteroposterior valence-specific gradient reported in rodents and endorse the role of the striatum in aversive motivational learning about financial losses, illustrating functional and anatomical consistencies with primary aversive outcomes such as pain. Copyright © 2007 Society for Neuroscience.

Pain and self-preservation in autonomous robots: From neurobiological models to psychiatric disease

© 2017 IEEE. The use of biologically realistic (brain-like) control systems in autonomous robots offers two potential benefits. For neuroscience, it may provide important insights into normal and abnormal control and decision-making in the brain, by testing whether the computational learning and decision rules proposed on the basis of simple laboratory experiments lead to effective and coherent behaviour in complex environments. For robotics, it may offer new insights into control system designs, for example in the context of threat avoidance and self-preservation. In the brain, learning and decision-making for rewards and punishments (such as pain) are thought to involve integrated systems for innate (Pavlovian) responding, habit-based learning, and goal-directed learning, and these systems have been shown to be well-described by RL models. Here, we simulated this 3-system control hierarchy (in which the innate system is derived from an evolutionary learning model), and show that it reliably achieves successful performance in a dynamic predator-avoidance task. Furthermore, we show situations in which a 3-system architecture provides clear advantages over single or dual system architectures. Finally, we show that simulating a computational model of obsessive compulsive disorder, an example of a disease thought to involve a specific deficit in the integration of habit-based and goal-directed systems, can reproduce the results of human clinical experiments. The results illustrate how robotics can provide a valuable platform to test the validity and utility of computational models of human behaviour, in both health and disease. They also illustrate how bio-inspired control systems might usefully inform self-preservative behaviour in autonomous robots, both in normal and malfunctioning situations.

Parallel reward and punishment control in humans and robots: Safe reinforcement learning using the MaxPain algorithm

© 2017 IEEE. An important issue in reinforcement learning systems for autonomous agents is whether it makes sense to have separate systems for predicting rewards and punishments. In robotics, learning and control are typically achieved by a single controller, with punishments coded as negative rewards. However, in biological systems, some evidence suggests that the brain has a separate system for punishment. Although this may in part be due to biological constraints of implementing negative quantities, it raises the question as to whether there is any computational rationale for keeping reward and punishment prediction operationally distinct. Here we outline a basic argument supporting this idea, based on the proposition that learning best-case predictions (as in Q-learning) does not always achieve the safest behaviour. We introduce a modified RL scheme involving a new algorithm, which we call 'MaxPain', which backs up worst-case predictions in parallel, and then scales the two predictions in a multiattribute RL policy, i.e. independently learning 'what to do' as well as 'what not to do' and then combining this information. We show how this scheme can improve performance in benchmark RL environments, including a grid-world experiment and a delayed version of the mountain car experiment. In particular, we demonstrate how early exploration and learning are substantially improved, leading to much 'safer' behaviour. In conclusion, the results illustrate the importance of independent punishment prediction in RL, and provide a testable framework for better understanding punishment (such as pain) and avoidance in humans, in both health and disease.
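
The idea of learning 'what to do' and 'what not to do' in parallel can be illustrated with a minimal tabular sketch. This is not the authors' implementation: the linear weighting `w`, the two-action toy problem, the learning parameters, and all function names are assumptions made for illustration, and the abstract does not specify how the two predictions are scaled.

```python
from collections import defaultdict

ACTIONS = ("left", "right")  # hypothetical action set for the toy problem


def update(q_reward, q_pain, s, a, reward, pain, s_next, alpha=0.5, gamma=0.9):
    """One tabular backup for both learners (pain is a non-negative magnitude)."""
    # Reward learner: standard Q-learning, backing up the best-case next value.
    best_next = max(q_reward[(s_next, b)] for b in ACTIONS)
    q_reward[(s, a)] += alpha * (reward + gamma * best_next - q_reward[(s, a)])
    # Punishment learner: backs up the worst-case (largest) predicted pain.
    worst_next = max(q_pain[(s_next, b)] for b in ACTIONS)
    q_pain[(s, a)] += alpha * (pain + gamma * worst_next - q_pain[(s, a)])


def greedy_action(q_reward, q_pain, s, w=0.5):
    """Choose the action with the highest weighted reward minus weighted pain."""
    return max(ACTIONS, key=lambda a: w * q_reward[(s, a)] - (1 - w) * q_pain[(s, a)])


# Toy experience: "right" pays more but also hurts; "left" is modest and safe.
q_reward = defaultdict(float)
q_pain = defaultdict(float)
update(q_reward, q_pain, "s0", "left", reward=1.0, pain=0.0, s_next="end")
update(q_reward, q_pain, "s0", "right", reward=2.0, pain=5.0, s_next="end")

print(greedy_action(q_reward, q_pain, "s0", w=0.5))  # balances reward against pain
print(greedy_action(q_reward, q_pain, "s0", w=1.0))  # ignores the pain prediction
```

With `w=1.0` the policy reduces to ordinary reward-only Q-learning; lowering `w` gives the punishment prediction more influence over choice.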

Primary functional brain connections associated with melancholic major depressive disorder and modulation by antidepressants

© 2020, The Author(s). The limited efficacy of available antidepressant therapies may be due to how they affect the underlying brain network. The purpose of this study was to develop a melancholic MDD biomarker to identify critically important functional connections (FCs), and explore their association with treatments. Resting state fMRI data of 130 individuals (65 melancholic major depressive disorder (MDD) patients, 65 healthy controls) were included to build a melancholic MDD classifier, and 10 FCs were selected by our sparse machine learning algorithm. This biomarker generalized to a drug-free independent cohort of melancholic MDD, and did not generalize to other MDD subtypes or other psychiatric disorders. Moreover, we found that antidepressants had a heterogeneous effect on the identified FCs of 25 melancholic MDDs. In particular, it did impact the FC between left dorsolateral prefrontal cortex (DLPFC)/inferior frontal gyrus (IFG) and posterior cingulate cortex (PCC)/precuneus, ranked as the second ‘most important’ FC based on the biomarker weights, whilst other eight FCs were normalized. Given that left DLPFC has been proposed as an explicit target of depression treatments, this suggests that the limited efficacy of antidepressants might be compensated by combining therapies with targeted treatment as an optimized approach in the future.

Towards prognostic functional brain biomarkers for cervical myelopathy: A resting-state fMRI study

© 2019, The Author(s). Recently, there has been increasing interest in strategies to predict neurological recovery in cervical myelopathy (CM) based on clinical images of the cervical spine. In this study, we aimed to explore potential preoperative brain biomarkers that can predict postoperative neurological recovery in CM patients by using resting-state functional magnetic resonance imaging (rs-fMRI) and functional connectivity (FC) analysis. Twenty-eight patients with CM and 28 age- and sex-matched healthy controls (HCs) underwent rs-fMRI (twice for CM patients, before and six months after surgery). A seed-to-voxel analysis was performed, and the following three statistical analyses were conducted: (i) FC comparisons between preoperative CM and HC; (ii) correlation analysis between preoperative FCs and clinical scores; and (iii) postoperative FC changes in CM. Our analyses identified three FCs between the visual cortex and the right superior frontal gyrus based on the conjunction of the first two analyses [(i) and (ii)]. These FCs may act as potential biomarkers for postoperative gain in the 10-second test and might be sufficient to provide a prediction formula for potential recovery. Our findings provide preliminary evidence supporting the possibility of novel predictive measures for neurological recovery in CM using rs-fMRI.

Values and actions in aversion

Model-based and model-free controllers can, in principle, learn arbitrary actions to optimize their behavior, at least those actions that can be expressed and explored. Indeed, these are often referred to as instrumental controllers because their choices are learned to be instrumental for the delivery of desired outcomes. Although this flexibility is very powerful, it comes with an attendant cost of learning. Evolution appears to have endowed everything from the simplest organisms to us with powerful, pre-specified, but inflexible alternatives. These responses are termed Pavlovian, after the famous Russian physiologist and psychologist Pavlov. The responses of the Pavlovian controller are determined by evolutionary (phylogenetic) considerations rather than (ontogenetic) aspects of the contingent development or learning of an individual. These responses directly interact with instrumental choices arising from goal-directed and habitual controllers. This interaction has been studied in a wealth of animal paradigms, and can be helpful, neutral, or harmful, according to circumstance. Although there has been less careful or analytical study of it in humans, it can be interpreted as underpinning a wealth of behavioral aberrations. © 2009 Elsevier Inc. All rights reserved.

Dread and the Disvalue of Future Pain

Standard theories of decision-making involving delayed outcomes predict that people should defer a punishment, whilst advancing a reward. In some cases, such as pain, people seem to prefer to expedite punishment, implying that its anticipation carries a cost, often conceptualized as 'dread'. Despite empirical support for the existence of dread, whether and how it depends on prospective delay is unknown. Furthermore, it is unclear whether dread represents a stable component of value, or is modulated by biases such as framing effects. Here, we examine choices made between different numbers of painful shocks to be delivered faithfully at different time points up to 15 minutes in the future, as well as choices between hypothetical painful dental appointments at time points of up to approximately eight months in the future, to test alternative models for how future pain is disvalued. We show that future pain initially becomes increasingly aversive with increasing delay, but does so at a decreasing rate. This is consistent with a value model in which moment-by-moment dread increases up to the time of expected pain, such that dread becomes equivalent to the discounted expectation of pain. For a minority of individuals pain has maximum negative value at intermediate delay, suggesting that the dread function may itself be prospectively discounted in time. Framing an outcome as relief reduces the overall preference to expedite pain, which can be parameterized by reducing the rate of the dread-discounting function. Our data support an account of disvaluation for primary punishments such as pain, which differs fundamentally from existing models applied to financial punishments, in which dread exerts a powerful but time-dependent influence over choice. © 2013 Story et al.
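
One way to read the value model described above is sketched below, purely as an illustration. It assumes that dread at each time step before the pain equals a weighted, exponentially discounted expectation of that pain, and that these moment-by-moment terms are summed over the waiting period. The discrete time steps, the parameter values, and the function name are assumptions, not the fitted model from the study.

```python
def cumulative_dread(pain, delay, dread_weight=0.1, discount=0.9):
    """Total dread accrued while waiting `delay` steps for a pain of size `pain`."""
    # At step t the pain is (delay - t) steps away, so momentary dread is the
    # discounted expectation of pain; it rises as the pain approaches.
    return sum(dread_weight * pain * discount ** (delay - t) for t in range(delay))


# Total dread grows with delay, but each extra step of waiting adds less.
totals = [cumulative_dread(1.0, d) for d in range(1, 6)]
increments = [later - earlier for earlier, later in zip(totals, totals[1:])]
print(totals)
print(increments)
```

Under these assumptions, longer delays are more aversive but at a decreasing rate, matching the qualitative pattern reported in the abstract.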

When is a loss a loss? Excitatory and inhibitory processes in loss-related decision-making

© 2015. One of the puzzles in neuroeconomics is the inconsistent pattern of brain response seen in the striatum during evaluation of losses. In some studies striatal responses appear to represent loss as a negative reward (BOLD deactivation), while in others as positive punishment (BOLD activation). We argue that these discrepancies can be explained by the existence of two fundamentally different types of loss: excitatory losses signaling the presence of substantive punishment, and inhibitory losses signaling cessation or omission of reward. We then map different theories of motivational opponency to loss-related decision-making, and highlight five distinct underlying computational processes. We suggest that this excitatory-inhibitory model of loss provides a neurobiological framework for understanding reference dependence in behavioral economics.

The misbehavior of value and the discipline of the will

Most reinforcement learning models of animal conditioning operate under the convenient, though fictive, assumption that Pavlovian conditioning concerns prediction learning whereas instrumental conditioning concerns action learning. However, it is only through Pavlovian responses that Pavlovian prediction learning is evident, and these responses can act against the instrumental interests of the subjects. This can be seen in both experimental and natural circumstances. In this paper we study the consequences of importing this competition into a reinforcement learning context, and demonstrate the resulting effects in an omission schedule and a maze navigation task. The misbehavior created by Pavlovian values can be quite debilitating; we discuss how it may be disciplined. © 2006 Elsevier Ltd. All rights reserved.

Can, and should, behavioural neuroscience influence public policy?

Recent years have seen enormous demand amongst policy makers for new insights from the behavioural sciences, especially neuroscience. This demand is matched by an increasing willingness on behalf of behavioural scientists to translate the policy implications of their work. But can neuroscience really help shape the governance of a nation? Or does this represent growing misuse of neuroscience to attach scientific authority to policy, plus a clutch of neuroscientists trying to overstate their findings for a taste of power? © 2012.

Choking on the money: Reward-based performance decrements are associated with midbrain activity

A pernicious paradox in human motivation is the occasional reduced performance associated with tasks and situations that involve larger-than-average rewards. Three broad explanations that might account for such performance decrements are attentional competition (distraction theories), inhibition by conscious processes (explicit-monitoring theories), and excessive drive and arousal (overmotivation theories). Here, we report incentive-dependent performance decrements in humans in a reward-pursuit task; subjects were less successful in capturing a more valuable reward in a computerized maze. Concurrent functional magnetic resonance imaging revealed that increased activity in ventral midbrain, a brain area associated with incentive motivation and basic reward responding, correlated with both reduced number of captures and increased number of near-misses associated with imminent high rewards. These data cast light on the neurobiological basis of choking under pressure and are consistent with overmotivation accounts. © 2009 Association for Psychological Science.

Choosing to make an effort: The role of striatum in signaling physical effort of a chosen action

The possibility that we will have to invest effort influences our future choice behavior. Indeed, deciding whether an action is actually worth taking is a key element in the expression of human apathy or inertia. There is a well-developed literature on brain activity related to the anticipation of effort, but how effort affects actual choice is less well understood. Furthermore, prior work is largely restricted to mental as opposed to physical effort or has confounded temporal with effortful costs. Here we investigated choice behavior and brain activity, using functional magnetic resonance imaging, in a study where healthy participants were required to make decisions between effortful gripping, where the factors of force (high and low) and reward (high and low) were varied, and a choice of merely holding a grip device for minimal monetary reward. Behaviorally, we show that force level influences the likelihood of choosing an effortful grip. We observed greater activity in the putamen when participants opted to grip an option with low effort compared with when they opted to grip an option with high effort. The results suggest that, over and above a nonspecific role in movement anticipation and salience, the putamen plays a crucial role in computations for choice that involves effort costs. Copyright © 2010 The American Physiological Society.

The role of human orbitofrontal cortex in value comparison for incommensurable objects

The human orbitofrontal cortex is strongly implicated in appetitive valuation. Whether its role extends to support comparative valuation necessary to explain probabilistic choice patterns for incommensurable goods is unknown. Using a binary choice paradigm, we derived the subjective values of different bundles of goods, under conditions of both gain and loss. We demonstrate that orbitofrontal activation reflects the difference in subjective value between available options, an effect evident across valuation for both gains and losses. In contrast, activation in dorsal striatum and supplementary motor areas reflects subjects' choice probabilities. These findings indicate that orbitofrontal cortex plays a pivotal role in valuation for incommensurable goods, a critical component process in human decision making. Copyright © 2009 Society for Neuroscience.

Decision-making in brains and robots — the case for an interdisciplinary approach

© 2019. Reinforcement learning describes a general method for trial-and-error learning, and it has emerged as a dominant framework both for optimal control in autonomous robots and for understanding decision-making in the brain. Despite their common roots, however, these two fields have evolved largely independently. In this perspective, we consider how each now faces problems that could potentially be addressed by insights from the other, and argue that an interdisciplinary approach could greatly accelerate progress in both.

Anchors, scales and the relative coding of value in the brain

People are alarmingly susceptible to manipulations that change both their expectations and experience of the value of goods. Recent studies in behavioral economics suggest such variability reflects more than mere caprice. People commonly judge options and prices in relative terms, rather than absolutely, and display strong sensitivity to exemplar and price anchors. We propose that these findings elucidate important principles about reward processing in the brain. In particular, relative valuation may be a natural consequence of adaptive coding of neuronal firing to optimise sensitivity across large ranges of value. Furthermore, the initial apparent arbitrariness of value may reflect the brain's attempts to optimally integrate diverse sources of value-relevant information in the face of perceived uncertainty. Recent findings in neuroscience support both accounts, and implicate regions in the orbitofrontal cortex, striatum, and ventromedial prefrontal cortex in the construction of value. © 2008 Elsevier Ltd. All rights reserved.

Modulation of pain ratings by expectation and uncertainty: Behavioral characteristics and anticipatory neural correlates

Expectations about the magnitude of impending pain exert a substantial effect on subsequent perception. However, the neural mechanisms that underlie the predictive processes that modulate pain are poorly understood. In a combined behavioral and high-density electrophysiological study we measured anticipatory neural responses to heat stimuli to determine how predictions of pain intensity, and certainty about those predictions, modulate brain activity and subjective pain ratings. Prior to receiving randomized laser heat stimuli at different intensities (low, medium or high) subjects (n = 15) viewed cues that either accurately informed them of forthcoming intensity (certain expectation) or not (uncertain expectation). Pain ratings were biased towards prior expectations of either high or low intensity. Anticipatory neural responses increased with expectations of painful vs. non-painful heat intensity, suggesting the presence of neural responses that represent predicted heat stimulus intensity. These anticipatory responses also correlated with the amplitude of the Laser-Evoked Potential (LEP) response to painful stimuli when the intensity was predictable. Source analysis (LORETA) revealed that uncertainty about expected heat intensity involves an anticipatory cortical network commonly associated with attention (left dorsolateral prefrontal, posterior cingulate and bilateral inferior parietal cortices). Relative certainty, however, involves cortical areas previously associated with semantic and prospective memory (left inferior frontal and inferior temporal cortex, and right anterior prefrontal cortex). This suggests that biasing of pain reports and LEPs by expectation involves temporally precise activity in specific cortical networks. © 2007 International Association for the Study of Pain.

The neurobiology of punishment

Animals, in particular humans, frequently punish other individuals who behave negatively or uncooperatively towards them. In animals, this usually serves to protect the personal interests of the individual concerned, and its kin. However, humans also punish altruistically, in which the act of punishing is personally costly. The propensity to do so has been proposed to reflect the cultural acquisition of norms of behaviour, which incorporates the desire to uphold equity and fairness, and promotes cooperation. Here, we review the proximate neurobiological basis of punishment, considering the motivational processes that underlie punishing actions.

The Effect of Motivation on Movement: A Study of Bradykinesia in Parkinson's Disease

The price of pain and the value of suffering

Estimating the financial value of pain informs issues as diverse as the market price of analgesics, the cost-effectiveness of clinical treatments, compensation for injury, and the response to public hazards. Such valuations are assumed to reflect a stable trade-off between relief of discomfort and money. Here, using an auction-based health-market experiment, we show that the price people pay for relief of pain is strongly determined by the local context of the market, that is, by recent intensities of pain or immediately disposable income (but not overall wealth). The absence of a stable valuation metric suggests that the dynamic behavior of health markets is not predictable from the static behavior of individuals. We conclude that the results follow the dynamics of habit-formation models of economic theory, and thus, this study provides the first scientific basis for this type of preference modeling. © 2009 Association for Psychological Science.
