A unique adolescent response to reward prediction errors (2010)

Nat Neurosci. 2010 Jun;13(6):669-71. Epub 2010 May 16.

 

Source

Department of Psychology, University of California Los Angeles, Los Angeles, California, USA.

Abstract

Previous work has shown that human adolescents may be hypersensitive to rewards, but it is not known which aspect of reward processing is responsible for this. We separated decision value and prediction error signals and found that neural prediction error signals in the striatum peaked in adolescence, whereas neural decision value signals varied depending on how value was modeled. This suggests that heightened dopaminergic prediction error responsivity contributes to adolescent reward seeking.

Adolescence is a unique period in psychological development, characterized by increased risky choices and actions as compared to children and adults. This may reflect the relatively early functional development of limbic affective and reward systems in comparison to prefrontal cortex [1], such that adolescents tend to make poor decisions and risky choices more often than both children (who are not yet fully sensitive to rewards) and adults (who are sensitive to rewards, but have the ability to exert control over reward-driven urges).

According to behavioral decision theories, choices are driven by the value assigned to each potential choice (decision value) [2]. Decision value is computed by a system in the medial prefrontal cortex that serves as a common pathway for value representation [3,4]. However, in order to behave adaptively in a changing or noisy world, these values must be updated based on experience. Reward prediction error signals reflect the difference between the expected value of an action and the actual outcome of the action [5], and are coded by phasic activity in the mesolimbic dopamine system [6]. In fMRI, they are usually observed in the ventral striatum, reflecting dopaminergic output (e.g., [7]). The nature of prediction error signals in children or adolescents is unknown. Adolescents may have a hypersensitive striatal response to reward [8], although this finding is somewhat inconsistent [9,10]. We examined whether adolescence is associated with unique changes in either decision value or prediction error signals, using a probabilistic learning paradigm [11] (Fig. 1; see Supplementary Methods online). We estimated both decision value and prediction error signals on each trial during learning using a simple learning model [5]. Using parametric fMRI analyses, we identified brain regions whose response was modulated in accordance with these signals, and examined how this response changed with age from childhood to adulthood. We examined both linear effects (which reflect general maturational or developmental trends) and quadratic effects (which reflect adolescent-specific effects) with age. This work represents the first examination of these subcomponents of decision-making across development.
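As an illustration of the kind of simple learning model referred to above, the following Python sketch applies a Rescorla-Wagner style update to the outcomes of a single stimulus. The function name, the learning rate `alpha`, the initial value `v0`, and the numeric outcome coding are assumptions made for this example; they are not taken from the study's Supplementary Methods.

```python
def rescorla_wagner(outcomes, alpha=0.1, v0=0.0):
    """Return per-trial decision values and prediction errors for one stimulus.

    outcomes: sequence of numeric outcomes (e.g., 1 for reward, 0 for none).
    alpha:    learning rate (hypothetical default, not the study's value).
    v0:       initial expected value (hypothetical default).
    """
    value = v0
    decision_values = []
    prediction_errors = []
    for outcome in outcomes:
        # Decision value is the expectation held before feedback arrives.
        decision_values.append(value)
        # Prediction error: actual outcome minus expected value.
        delta = outcome - value
        prediction_errors.append(delta)
        # Move the expectation toward the outcome by a fraction alpha.
        value += alpha * delta
    return decision_values, prediction_errors


if __name__ == "__main__":
    values, errors = rescorla_wagner([1, 1, 0], alpha=0.5)
    print(values)
    print(errors)
```

In a parametric fMRI analysis of this kind, the per-trial decision values would typically modulate the stimulus regressor and the per-trial prediction errors the feedback regressor.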

Figure 1

Experimental design. 45 healthy participants (18 children aged 8–12, 16 adolescents aged 14–19, and 11 adults aged 25–30) performed a probabilistic learning task during fMRI acquisition. Written informed consent was obtained. Participants

Behaviorally, all participants became more accurate and faster with training for predictable stimuli, but not for random stimuli (interaction F(5,210) = 9.85, P < 0.0001 for accuracy and F(5,210) = 6.60, P < 0.0001 for response times; Supplementary Table 1 and Fig. 1 online). Crucially, there was a reward × age interaction for response times (F(2,42) = 5.03, P = 0.01). Post-hoc tests showed that adolescents were the only age group to respond significantly more quickly to stimuli associated with large rewards as compared to small rewards (t(15) = 3.24, P = 0.006; for children t(17) = −0.32, P = 0.75 and for adults t(10) = 1.90, P = 0.09).

We modeled the fMRI data to allow separate estimation of the neural responses to stimulus and feedback (Supplementary Methods and Fig. 2 online; for whole-brain main effects of viewing the stimuli and receiving feedback about responses, see Supplementary Figs. 3–4 and Tables 2–3 online). We examined how neural correlations with model-based decision signals (decision value and prediction error) were related to age.

Figure 2

MRI results. (a) Regions showing correlations with age when correcting at the whole-brain level at z > 2.3, P < 0.05. The striatal and angular gyrus regions were negatively correlated with age²; because the mean age² was subtracted from

We analyzed quadratic trends in positive prediction error at feedback and identified two regions in which adolescents displayed a hypersensitive response as compared to the other age groups: the striatum and the angular gyrus. An area in the medial prefrontal cortex showed a negative linear effect of age on stimulus decision value, such that younger participants had a stronger decision value signal in this region as compared to older participants; this region has been strongly associated with goal-oriented stimulus value in previous work in adults (Fig. 2a) [12]. Thus, whereas response to unpredictable positive feedback peaked in adolescence, sensitivity to stimulus value decreased linearly with age (for plots between age and each of the above regions of interest [ROIs], see Supplementary Fig. 5 online).
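One conventional way to construct group-level regressors that separate linear from quadratic age effects is sketched below in Python. Mean-centering age before squaring it, and then demeaning the squared term, are assumptions of this sketch rather than a description of the exact procedure used in the study; the function name is hypothetical. Under this coding, a response that peaks in the middle of the age range shows up as a negative weight on the quadratic regressor.

```python
def age_regressors(ages):
    """Build demeaned linear and quadratic age regressors (illustrative only)."""
    n = len(ages)
    mean_age = sum(ages) / n
    # Linear term: age relative to the sample mean.
    linear = [a - mean_age for a in ages]
    # Quadratic term: squared centered age, then demeaned so that it
    # does not load on the group-mean (intercept) regressor.
    squared = [x * x for x in linear]
    mean_squared = sum(squared) / n
    quadratic = [s - mean_squared for s in squared]
    return linear, quadratic


if __name__ == "__main__":
    linear, quadratic = age_regressors([10, 15, 20])
    print(linear)
    print(quadratic)
```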

Given that decision value develops through error-driven learning in the model, it was surprising that decision value showed a different age-related trajectory than prediction error. However, due to the structure of the task, it is possible that choice was driven by other factors beyond reinforcement learning (e.g., explicit memory). To clarify the results we ran a second model that computed decision value in a more integrative fashion as the proportion of previous trials on which the optimal response was chosen for each stimulus (Lin, Adolphs & Rangel, unpublished; Supplementary Methods online). We analyzed prediction error values from this model and found that they mirrored the results of our initial analyses, showing regions in the striatum and parietal cortex, along with ventral lateral prefrontal regions, where neural response to prediction error peaked in adolescence. Analysis of decision value from this model showed both linear and nonlinear relationships between age and neural activity in a number of regions, including the lateral parietal cortex and striatum (Supplementary Fig. 6 and Table 5 online). Exploratory (non-independent) ROI analyses showed that the neural response to decision value in this model appeared to increase between childhood and adolescence, but then asymptoted between adolescence and adulthood (Supplementary Fig. 7 online). These results demonstrated that the peak prediction error response in adolescence was robust to different models, whereas age-related changes in decision value signals were sensitive to model specification.
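The integrative definition of decision value described above (the proportion of previous trials on which the optimal response was chosen for a stimulus) can be sketched in Python as follows. The function name and the `prior` value used on the first trial, when no history exists, are assumptions of this example; the source does not state how the first trial was handled.

```python
def proportion_optimal_value(chose_optimal, prior=0.5):
    """Per-trial decision value for one stimulus, as a running proportion.

    chose_optimal: sequence of booleans, True when the optimal response
                   was chosen on that trial.
    prior:         value assigned on the first trial, before any history
                   exists (hypothetical choice).
    """
    values = []
    n_optimal = 0
    for trial_index, hit in enumerate(chose_optimal):
        if trial_index == 0:
            values.append(prior)
        else:
            # Proportion of earlier trials with an optimal choice.
            values.append(n_optimal / trial_index)
        n_optimal += int(hit)
    return values


if __name__ == "__main__":
    print(proportion_optimal_value([True, False, True, True]))
```

Unlike an error-driven update, this running proportion weights all past trials equally, which is one sense in which it integrates over the choice history.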

Based on previous work showing that the ventral striatum is consistently sensitive to unexpected positive feedback, as reflected in model-based reward prediction error signals (e.g., [7]), we examined the localization of prediction error-related responses for each age group separately within an independent anatomical ROI including the bilateral caudate, putamen, and nucleus accumbens using the original reinforcement learning model (Fig. 2b). Striatal regions significantly related to positive prediction error did not overlap for adolescents and adults. While adults in this study had activity in the ventral striatal region consistently seen in studies examining prediction error in adults, adolescents had activity in a more dorsal region. Children had no activity in the striatum related to positive prediction error.

Our results extend previous findings of increased reward-related neural activity during adolescence [8] by demonstrating that this finding is specific to prediction error, as compared to valuation signals. The developmental differences in prediction error response likely reflect differences in phasic dopamine signaling [13]. If correct, this provides a direct explanation for the risky reward-seeking behavior often observed in adolescents. The increased risky behavior in adolescence could in theory reflect either a decreased sensitivity to potential negative outcomes or an increased sensitivity to potential positive outcomes. We believe that our data are consistent with the latter: that is, increased prediction error signals (putatively reflecting greater phasic dopamine signals) reflect greater impact of positive outcomes [14], which is proposed to result in an increased motivation to obtain positive outcomes (and thus greater risk-taking). Thus, an overactive dopaminergic prediction error response in adolescents could result in an increase in reward-seeking, particularly when coupled with an immature cognitive control system [1].

The present findings may shed light on why previous studies have yielded inconsistent effects of age on reward processing. First, not all studies compared adolescents to both children and adults, meaning that the possibility of nonlinear relationships with age could not be detected. Further, the definition of “adolescent” has not been consistent across studies. Second, it is important to note that the probabilistic learning task used here was not a risky decision-making task per se, and thus differs from other tasks used in the reward and risk-taking literature. Third, our results suggest that a proper understanding of developmental changes in reward processing requires the use of model-based approaches along with decomposition of individual trial components (stimulus, choice, and feedback).

It is increasingly realized that adolescence is a unique period in psychological development, and that the risky, reward-seeking behavior that occurs during this period can result in significant morbidity and mortality, including accidental death and the onset of drug addiction. Thus, understanding the neural basis of adolescent decision-making is a critical challenge. The present work suggests that one contributor to adolescent reward-seeking may be the presence of enhanced prediction error signals, which provides a novel target for future studies of this important period in development.

Acknowledgments

This research was supported by the National Institute of Mental Health (5R24 MH072697), the National Institute on Drug Abuse (5F31 DA024534), the McDonnell Foundation, and the Della Martin Foundation.

Footnotes

Author Contributions: J.R.C. helped design the experiments, conducted data acquisition and analyses, and wrote the manuscript. R.F.A., R.M.B., and S.Y.B. designed the experiments. F.W.S. contributed to data acquisition. B.J.K. and R.A.P. designed the experiments and helped write the manuscript.

Competing Interests: The authors declare that they have no competing financial interests.

References

1. Casey BJ, Getz S, Galvan A. Dev Rev. 2008;28:62–77.
2. Kahneman D, Tversky A. Econometrica. 1979;47:263–91.
3. Chib VS, Rangel A, Shimojo S, O’Doherty JP. J Neurosci. 2009;29:12315–20.
4. Tom SM, Fox CR, Trepel C, Poldrack RA. Science. 2007;315:515–8.
5. Rescorla RA, Wagner AR. In: Classical Conditioning II: Current Research and Theory. Black A, Prokasy WF, editors. Appleton Century Crofts; New York, NY: 1972. pp. 64–99.
6. Schultz W, Dayan P, Montague PR. Science. 1997;275:1593–9.
7. Pagnoni G, Zink CF, Montague PR, Berns GS. Nat Neurosci. 2002;5:97–8.
8. Galvan A, et al. J Neurosci. 2006;26:6885–92.
9. Bjork JM, et al. J Neurosci. 2004;24:1793–802.
10. May JC, et al. Biol Psychiatry. 2004;55:359–66.
11. Knowlton BJ, Mangels JA, Squire LR. Science. 1996;273:1399–402.
12. Hare TA, O’Doherty J, Camerer CF, Schultz W, Rangel A. J Neurosci. 2008;28:5623–30.
13. D’Ardenne K, McClure SM, Nystrom LE, Cohen JD. Science. 2008;319:1264–7.
14. Berridge KC, Robinson TE. Brain Res Rev. 1998;28:309–69.