Published online 2010 Dec 24. doi: 10.1016/j.neubiorev.2010.12.012
To better understand the reward circuitry in the human brain, we conducted activation likelihood estimation (ALE) and parametric voxel-based meta-analyses (PVM) on 142 neuroimaging studies that examined brain activation in reward-related tasks in healthy adults. We observed several core brain areas that participated in reward-related decision making, including the nucleus accumbens (NAcc), caudate, putamen, thalamus, orbitofrontal cortex (OFC), bilateral anterior insula, anterior (ACC) and posterior (PCC) cingulate cortex, as well as cognitive control regions in the inferior parietal lobule and prefrontal cortex (PFC). The NAcc was commonly activated by both positive and negative rewards across various stages of reward processing (e.g., anticipation, outcome, and evaluation). In addition, the medial OFC and PCC preferentially responded to positive rewards, whereas the ACC, bilateral anterior insula, and lateral PFC selectively responded to negative rewards. Reward anticipation activated the ACC, bilateral anterior insula, and brain stem, whereas reward outcome more significantly activated the NAcc, medial OFC, and amygdala. Neurobiological theories of reward-related decision making should therefore take distributed and interrelated representations of reward valuation and valence assessment into account.
People face countless reward-related decision making opportunities every day. Our physical, mental, and socio-economic well-being critically depends on the consequences of the choices we make. It is thus crucial to understand what underlies the normal functioning of reward-related decision making. Studying this normal functioning also helps us better understand the behavioral and mental disorders that arise when such function is disrupted, such as depression (Drevets, 2001), substance abuse (Bechara, 2005; Garavan and Stout, 2005; Volkow et al., 2003), and eating disorders (Kringelbach et al., 2003; Volkow and Wise, 2005).
Functional neuroimaging research on reward has become a rapidly growing field, with dozens of relevant articles appearing in the PubMed database every month. On the one hand, this is exciting because the mounting results are paramount to formalizing behavioral and neural mechanisms of reward-related decision making (Fellows, 2004; Trepel et al., 2005). On the other hand, the heterogeneity of the results, together with occasionally opposing patterns, makes it difficult to obtain a clear picture of the reward circuitry in the human brain. The mixture of results is partly due to the diverse experimental paradigms developed by different research groups to address different aspects of reward-related decision making, such as the distinction between reward anticipation and outcome (Breiter et al., 2001; Knutson et al., 2001b; McClure et al., 2003; Rogers et al., 2004), valuation of positive and negative rewards (Liu et al., 2007; Nieuwenhuis et al., 2005; O’Doherty et al., 2003a; O’Doherty et al., 2001; Ullsperger and von Cramon, 2003), and assessment of risk (Bach et al., 2009; d’Acremont and Bossaerts, 2008; Hsu et al., 2009; Huettel, 2006).
It is therefore crucial to pool existing studies and examine the core reward networks in the human brain, using both data-driven and theory-driven approaches to test the commonality and distinctiveness of different aspects of reward-related decision making. To achieve this goal, we employed and compared two coordinate-based meta-analysis (CBMA) methods (Salimi-Khorshidi et al., 2009), activation likelihood estimation (ALE) (Laird et al., 2005; Turkeltaub et al., 2002) and parametric voxel-based meta-analysis (PVM) (Costafreda et al., 2009), to reveal the concordance across a large number of neuroimaging studies on reward-related decision making. We anticipated that the ventral striatum and orbitofrontal cortex (OFC), two major dopaminergic projection areas that have been associated with reward processing, would be consistently activated.
In addition, from a theory-driven perspective, we aimed to elucidate whether distinct brain networks are responsible for processing positive and negative reward information, and whether different networks are preferentially involved in different stages of reward processing, such as reward anticipation, outcome monitoring, and decision evaluation. Decision making involves encoding and representing the alternative options and comparing the values or utilities associated with them. Across these processes, decision making is usually accompanied by positive or negative valence arising from either the outcomes or the emotional responses toward the choices made. Positive reward valence refers to the positive subjective states we experience (e.g., happiness or satisfaction) when the outcome is positive (e.g., winning a lottery) or better than we anticipated (e.g., losing less than projected). Negative reward valence refers to the negative feelings we experience (e.g., frustration or regret) when the outcome is negative (e.g., losing a gamble) or worse than we expected (e.g., a stock gaining less value than projected). Although previous studies have attempted to distinguish reward networks that are sensitive to positive or negative information (Kringelbach, 2005; Liu et al., 2007), as well as those involved in reward anticipation or outcome (Knutson et al., 2003; Ramnani et al., 2004), empirical results have been mixed. We aimed to extract consistent patterns by pooling over a large number of studies examining these distinctions.
2.1 Literature search and organization
2.1.1 Study identification
Two independent researchers conducted a thorough search of the literature for fMRI studies examining reward-based decision making in humans. The terms used to search the online citation indexing service PUBMED (through June 2009) were “fMRI”, “reward”, and “decision” (by the first researcher), and “reward decision making task”, “fMRI”, and “human” (by the second researcher). These initial search results were merged to yield a total of 182 articles. Another 90 articles were identified from a reference database of a third researcher, accumulated through June 2009 using “reward” and “MRI” as filtering criteria. We also searched the BrainMap database using Sleuth, with “reward task” and “fMRI” as search terms, and found 59 articles. All of these articles were pooled into a database and redundant entries were eliminated. We then applied several exclusion criteria to eliminate articles not directly relevant to the current study: 1) non-first-hand empirical studies (e.g., review articles); 2) studies that did not report results in standard stereotactic coordinate space (either Talairach or Montreal Neurological Institute, MNI); 3) studies using tasks unrelated to reward or value-based decision making; 4) studies of structural brain analyses (e.g., voxel-based morphometry or diffusion tensor imaging); 5) studies based purely on region of interest (ROI) analysis (e.g., using anatomical masks or coordinates from other studies); and 6) studies of special populations whose brain functions may deviate from those of normal healthy adults (e.g., children, aging adults, or substance-dependent individuals), although coordinates reported in these studies for the healthy adult group alone were included. Variability in how subjects reported decisions during the tasks (e.g., verbal report or button press) was accepted. This resulted in 142 articles in the final database (listed in the Appendix).
During the data extraction stage, studies were grouped by spatial normalization scheme according to the coordinate transformations implemented in the GingerALE toolbox (http://brainmap.org, Research Imaging Center of the University of Texas Health Science Center, San Antonio, Texas): using FSL to report MNI coordinates, using SPM to report MNI coordinates, using other programs to report MNI coordinates, using the Brett method to convert MNI coordinates into Talairach space, or using a native Talairach template. Lists of coordinates in Talairach space were converted into MNI space according to their original normalization schemes. For the Brett-Talairach list, we converted the coordinates back into MNI space using Brett’s reverse transformation (i.e., tal2mni) (Brett et al., 2002). For the native Talairach list, we used BrainMap’s Talairach-MNI transformation (i.e., tal2icbm_other). A master list of all studies was created by combining all coordinates in MNI space in preparation for the ALE meta-analyses in GingerALE.
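Concretely, the Brett conversion is a piecewise affine map, so the reverse transform amounts to inverting whichever affine branch applies. The sketch below is an illustration using commonly cited mni2tal coefficients, not the exact GingerALE implementation; choosing the branch by the sign of the z coordinate follows the usual convention.

```python
import numpy as np

# Commonly cited coefficients for Brett's mni2tal (approximate, for illustration)
UP = np.array([[0.9900, 0.0000, 0.0000],
               [0.0000, 0.9688, 0.0460],
               [0.0000, -0.0485, 0.9189]])   # applied when z >= 0
DOWN = np.array([[0.9900, 0.0000, 0.0000],
                 [0.0000, 0.9688, 0.0420],
                 [0.0000, -0.0528, 0.8390]])  # applied when z < 0

def mni2tal(xyz):
    """Forward Brett transform: MNI -> Talairach."""
    xyz = np.asarray(xyz, dtype=float)
    return (UP if xyz[2] >= 0 else DOWN) @ xyz

def tal2mni(xyz):
    """Reverse Brett transform: invert the affine branch selected by the z sign."""
    xyz = np.asarray(xyz, dtype=float)
    return np.linalg.solve(UP if xyz[2] >= 0 else DOWN, xyz)
```

For foci away from the z = 0 plane, the round trip tal2mni(mni2tal(p)) recovers the original MNI point exactly.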
2.1.2 Experiment categorization
To test hypotheses with regard to the common and distinct reward pathways recruited by different aspects of reward-related decision making, we categorized coordinates according to two types of classification: reward valence and decision stage. We adopted the term “experiments,” used by the BrainMap database, to refer to the individual regressors or contrasts typically reported in fMRI studies. For reward valence, we organized the experiments into positive and negative rewards. For decision stages, we separated the experiments into reward anticipation, outcome, and evaluation. Coordinates in the master list that fit these categories were put into sub-lists; those that were difficult to interpret or not clearly defined were omitted. Below we list some examples that were put into each of these categories.
The following contrasts were classified as processing of positive rewards: those in which subjects won money or points (Elliott et al., 2000)(reward during run of success); avoided losing money or points (Kim et al., 2006)(direct comparison between avoidance of an aversive outcome and reward receipt); won the larger of two sums of money or points (Knutson et al., 2001a)(large vs. small reward anticipation); lost the smaller of two sums of money or points (Ernst et al., 2005)(no-win $0.50 > no-win $4); received encouraging words or graphics on the screen (Zalla et al., 2000)(increase for “win”); received a sweet taste in their mouths (O’Doherty et al., 2002)(glucose > neutral taste); positively evaluated the choice (Liu et al., 2007)(right > wrong); or received any other type of positive reward as a result of successful completion of the task.
Experiments classified for negative rewards included those in which subjects lost money or points (Elliott et al., 2000)(penalty during run of failure); did not win money or points (Ernst et al., 2005)(dissatisfaction of no-win); won the smaller of two sums of money or points (Knutson et al., 2001a)($1 vs. $50 reward); lost the larger of two sums of money or points (Knutson et al., 2001a)(large vs. small punishment anticipation); negatively evaluated the choice (Liu et al., 2007)(wrong > right); or received any other negative rewards such as the administration of a bitter taste in their mouths (O’Doherty et al., 2002)(salt > neutral taste) or discouraging words or images (Zalla et al., 2000)(increase for “lose” and decrease for “win”).
Reward anticipation was defined as the time period when the subject was pondering potential options before making a decision. For example, placing a bet and expecting to win money on that bet would be classified as anticipation (Cohen and Ranganath, 2005)(high-risk vs. low-risk decision). Reward outcome/delivery was classified as the period when the subject received feedback on the chosen option, such as a screen with the words “win x$” or “lose x$” (Bjork et al., 2004)(gain vs. non-gain outcome). When the feedback influenced the subject’s decision and behavior in a subsequent trial or was used as a learning signal, the contrast was classified as reward evaluation. For example, a risky decision that is rewarded in the initial trial may lead a subject to take another, perhaps bigger, risk in the next trial (Cohen and Ranganath, 2005)(low-risk rewards followed by high-risk vs. low-risk decisions). Loss aversion, the tendency for people to strongly prefer avoiding losses to acquiring gains, is another example of evaluation (Tom et al., 2007)(relation between lambda and neural loss aversion).
2.2 Activation likelihood estimation (ALE)
The ALE algorithm is based on Eickhoff et al. (2009). ALE models the activation foci as 3D Gaussian distributions centered at the reported coordinates and then calculates the overlap of these distributions across different experiments (ALE treats each contrast in a study as a separate experiment). The spatial uncertainty associated with activation foci is estimated with respect to the number of subjects in each study (i.e., a larger sample produces more reliable activation patterns and localization; therefore the coordinates are convolved with a tighter Gaussian kernel). The convergence of activation patterns across experiments is calculated by taking the union of the modeled activation maps. A null distribution representing the ALE scores generated by random spatial overlap across studies is estimated through a permutation procedure. Finally, the ALE map computed from the real activation coordinates is tested against the ALE scores from the null distribution, producing a statistical map representing the p values of the ALE scores. The nonparametric p values are then transformed into z scores and thresholded at a cluster-level corrected p<0.05.
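As a rough illustration of this pipeline (not the GingerALE implementation), the following toy sketch uses a tiny grid, a single fixed kernel width instead of one scaled by sample size, unnormalized Gaussian blobs capped at 0.5, and far fewer permutations than a real analysis:

```python
import numpy as np

rng = np.random.default_rng(0)
GRID = (20, 20, 20)   # toy volume, in voxels
SIGMA = 2.0           # fixed kernel width; real ALE tightens it for larger samples

def ma_map(foci):
    """Modeled-activation map of one experiment: per-voxel max over
    Gaussian blobs centered at its reported foci (peak value 0.5)."""
    vox = np.indices(GRID).reshape(3, -1).T
    ma = np.zeros(len(vox))
    for f in foci:
        d2 = ((vox - np.asarray(f)) ** 2).sum(axis=1)
        ma = np.maximum(ma, 0.5 * np.exp(-d2 / (2 * SIGMA ** 2)))
    return ma

def ale_map(experiments):
    """Union across experiments: ALE = 1 - prod_i (1 - MA_i)."""
    prod = np.ones(np.prod(GRID))
    for foci in experiments:
        prod *= 1.0 - ma_map(foci)
    return 1.0 - prod

# Two toy experiments reporting nearby foci
experiments = [[(10, 10, 10)], [(11, 10, 9)]]
ale = ale_map(experiments)

# Permutation null: same number of foci per experiment, placed at random
null_max = [ale_map([[tuple(rng.integers(0, 20, 3))] for _ in experiments]).max()
            for _ in range(50)]
p_val = (np.sum(np.array(null_max) >= ale.max()) + 1) / (len(null_max) + 1)
```

The observed ALE maximum sits where the two experiments' blobs overlap, and it typically exceeds the permuted maxima, so p_val comes out small.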
Six different ALE analyses were conducted using GingerALE 2.0 (Eickhoff et al., 2009), one for the main analysis of all studies, and one for each of the five sub-lists characterizing brain activation by positive or negative rewards as well as anticipation, outcome, and evaluation. Two subtraction ALE analyses were conducted using GingerALE 1.2 (Turkeltaub et al., 2002), one for the contrast between positive and negative rewards, and the other for the contrast between anticipation and outcome.
2.2.1 Main analysis of all studies
All 142 studies were included in the main analysis, which comprised 5214 foci from 655 experiments (contrasts). We used the algorithm implemented in GingerALE 2.0, which models the ALE based on the spatial uncertainty of each focus using an estimation of the inter-subject and inter-experiment variability. The estimation was constrained by a gray matter mask and estimated the above-chance clustering with the experiments as a random-effects factor, rather than using a fixed-effects analysis on foci (Eickhoff et al., 2009). The resulting ALE map was thresholded using the false discovery rate (FDR) method with p<0.05 and a minimum cluster size of 60 voxels of 2×2×2 mm (for a total of 480 mm³) to protect against false positives from multiple comparisons.
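As a sketch of this kind of thresholding, the snippet below applies the standard Benjamini-Hochberg FDR procedure to a p-value volume and then discards clusters smaller than the extent cutoff; GingerALE's internal implementation may differ in detail.

```python
import numpy as np
from scipy import ndimage

def fdr_cutoff(pvals, q=0.05):
    """Benjamini-Hochberg: largest p(k) with p(k) <= (k/m) * q, else 0."""
    p = np.sort(pvals.ravel())
    m = p.size
    ok = p <= np.arange(1, m + 1) / m * q
    return p[ok].max() if ok.any() else 0.0

def threshold_volume(pvals, q=0.05, min_cluster=60):
    """FDR-threshold a p-value volume, then keep only clusters of
    at least min_cluster connected voxels."""
    mask = pvals <= fdr_cutoff(pvals, q)
    labels, n = ndimage.label(mask)  # 6-connected components in 3D
    sizes = ndimage.sum(mask, labels, index=np.arange(1, n + 1))
    good = np.flatnonzero(sizes >= min_cluster) + 1  # label ids to keep
    return np.isin(labels, good)
```

For example, a volume with p = 0.5 everywhere except a 5×5×5 block and a 3×3×3 block of very small p-values would retain only the 125-voxel cluster.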
2.2.2 Individual analyses of sub-lists
Five other ALE analyses were conducted based on the sub-lists that categorize experiments into positive and negative rewards, as well as reward anticipation, reward delivery (outcome), and choice evaluation. The positive reward analysis included 2167 foci from 283 experiments, and the negative reward analysis included 935 foci from 140 experiments. The analyses for anticipation, outcome, and choice evaluation included 1553 foci (185 experiments), 1977 (253), and 520 (97), respectively. We applied the same analysis and threshold approaches as in the main analysis above.
2.2.3 Subtraction analyses
We were also interested in contrasting the brain areas that were selectively or preferentially activated by positive versus negative rewards, and by reward anticipation versus reward delivery. GingerALE 1.2 was used to conduct these two analyses. ALE maps were smoothed with a kernel with a FWHM of 10 mm. A permutation test of randomly distributed foci with 10000 simulations was run to determine statistical significance of the ALE maps. To correct for multiple comparisons, the resulting ALE maps were thresholded using the FDR method with p<0.05 and a minimum cluster size of 60 voxels.
2.3 Parametric voxel-based meta-analysis (PVM)
We also analyzed the same coordinate lists using another meta-analysis approach, PVM. In contrast to the ALE analysis, which treats different contrasts within a study as distinct experiments, the PVM analysis pools the peaks from all contrasts within a study and creates a single coordinate map for that study (Costafreda et al., 2009). The random-effects factor in the PVM analysis is therefore the study, rather than the individual experiment/contrast as in the ALE analysis. This further reduces the estimation bias caused by studies with multiple contrasts that report similar activation patterns. Similar to the ALE approach, we conducted six different PVM analyses using the algorithms implemented in R statistical software (http://www.R-project.org) from a previous study (Costafreda et al., 2009), one for the main analysis of all studies and one for each of the five sub-lists characterizing brain activation by different aspects of reward processing. Two additional PVM analyses were conducted using the same code base to compare positive versus negative rewards and reward anticipation versus outcome.
2.3.1 Main analysis of all studies
MNI coordinates (5214) from the same 142 studies used in the ALE analysis were transformed into a text table, with each study identified by a unique study identification label. Computations on the peak map were constrained within a mask in MNI space. The peak map was first smoothed with a uniform kernel (ρ = 10 mm) to generate the summary map, which represents the number of studies reporting overlapping activation peaks within a neighborhood of 10 mm radius. Next, random-effects PVM analysis was run to estimate statistical significance associated with each voxel in the summary map. The number of studies in the summary map was converted into the proportion of studies that reported concordant activation. We used the same threshold as used in ALE analysis to identify significant clusters for the proportion map (using the FDR method with p<0.05 and a minimum cluster size of 60 voxels).
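The summary-map logic reduces to counting, at each voxel, the number of studies with at least one peak inside the 10 mm neighborhood, with each study contributing at most once. A minimal sketch on toy coordinates (the real analysis runs over a full MNI-space mask):

```python
import numpy as np

RADIUS = 10.0  # mm, the uniform kernel's neighborhood

def study_map(peaks, grid):
    """Binary map: 1 where a voxel lies within RADIUS of any peak of one study.
    A study with several nearby peaks still counts only once per voxel."""
    d = np.linalg.norm(grid[:, None, :] - np.asarray(peaks, float)[None, :, :], axis=2)
    return (d.min(axis=1) <= RADIUS).astype(int)

def pvm_summary(studies, grid):
    """Summary map: number of studies reporting a peak within RADIUS of each voxel."""
    return sum(study_map(p, grid) for p in studies)

# Toy example: 3 studies, voxel coordinates on a coarse strip along x
grid = np.array([[x, 0.0, 0.0] for x in range(-20, 21, 4)])
studies = [[(0, 0, 0)], [(4, 0, 0), (40, 0, 0)], [(-30, 0, 0)]]
summary = pvm_summary(studies, grid)
proportion = summary / len(studies)   # proportion of concordant studies
```

At the voxel at x = 0, studies 1 and 2 are concordant (their peaks lie within 10 mm), so the proportion map reads 2/3 there.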
2.3.2 Individual analyses of sub-lists
Five other PVM analyses were conducted on the sub-lists for positive and negative rewards, as well as reward anticipation, outcome, and evaluation. The positive reward analysis included 2167 foci from 111 studies, whereas the negative reward analysis included 935 foci from 67 studies. The analyses for anticipation, outcome, and choice evaluation included 1553 foci (65 studies), 1977 (86), and 520 (39), respectively. We applied the same analysis and threshold approaches as in the main analysis above.
2.3.3 Comparison analyses
We also conducted two PVM analyses to compare the activation patterns between positive and negative rewards, and between reward anticipation and outcome. Two peak maps (e.g., one for positive and the other for negative rewards) were first smoothed with a uniform kernel (ρ = 10 mm) to generate the summary maps, each representing the number of studies with overlapping activation peaks within a neighborhood of 10 mm radius. These two summary maps were entered into a Fisher test to estimate the odds ratio and the statistical significance (p value) for each contributing voxel within the MNI-space mask. Since the Fisher test was not specifically developed for fMRI data analysis and is empirically less sensitive than the other methods, we applied a relatively lenient threshold for the direct-comparison PVM analyses, an uncorrected p<0.01 with a minimum cluster size of 60 voxels (Xiong et al., 1995), to balance sensitivity against Type I error from multiple comparisons.
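At each voxel, this comparison reduces to a 2×2 contingency table (concordant vs. non-concordant studies in each list). A sketch with hypothetical counts, using SciPy's Fisher exact test:

```python
from scipy.stats import fisher_exact

def voxel_fisher(k_a, n_a, k_b, n_b):
    """Fisher exact test at one voxel.
    k_a of n_a studies in list A are concordant there; k_b of n_b in list B."""
    table = [[k_a, n_a - k_a],
             [k_b, n_b - k_b]]
    return fisher_exact(table)   # (sample odds ratio, two-sided p value)

# Hypothetical counts (not taken from this paper's data): 40 of 111
# positive-reward studies vs. 8 of 67 negative-reward studies concordant
odds, p_val = voxel_fisher(40, 111, 8, 67)
```

Voxels whose p value survives the threshold are then assigned to whichever list has the higher concordance proportion, as indicated by the odds ratio.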
3.1 ALE results
The all-inclusive analysis of 142 studies showed significant activation of a large cluster that encompassed the bilateral nucleus accumbens (NAcc), pallidum, anterior insula, lateral/medial OFC, anterior cingulate cortex (ACC), supplementary motor area (SMA), lateral prefrontal cortex (PFC), right amygdala, left hippocampus, thalamus, and brain stem (Figure 1A). Other smaller clusters included the right middle frontal gyrus and left middle/inferior frontal gyrus, bilateral inferior/superior parietal lobule, and posterior cingulate cortex (PCC) (Table 1).
Positive rewards activated a subset of the above-mentioned networks, including the bilateral pallidum, anterior insula, thalamus, brain stem, medial OFC, ACC, SMA, PCC, and other frontal and parietal areas (Figure 1B and Table 2, also see Supplementary Materials – Figure S1A). Negative rewards showed activation in the bilateral NAcc, caudate, pallidum, anterior insula, amygdala, thalamus, brain stem, rostral ACC, dorsomedial PFC, lateral OFC, and right middle and inferior frontal gyrus (Figure 1B and Table 2, also see Supplementary Materials – Figure S1B). Contrasting activation by positive versus negative rewards, we found that positive rewards activated the following regions to a greater degree: bilateral NAcc, anterior insula, medial OFC, hippocampus, left putamen, and thalamus (Figure 1D and Table 4). No region showed greater activation by negative than by positive rewards.
Different reward processing stages shared similar brain activation patterns in the above-mentioned core networks, including the bilateral NAcc, anterior insula, thalamus, medial OFC, ACC, and dorsomedial PFC (Figure 1C and Table 3, also see Supplementary Materials – Figures S1C–E). Reward anticipation, as compared to reward outcome, revealed greater activation in the bilateral anterior insula, ACC, SMA, left inferior parietal lobule and middle frontal gyrus (Figure 1E and Table 5). Outcome preferential activation included bilateral NAcc, caudate, thalamus, and medial/lateral OFC (Table 5).
3.2 PVM results
The main analysis of 142 studies showed significant activation in the bilateral NAcc, anterior insula, lateral/medial OFC, ACC, PCC, inferior parietal lobule, and middle frontal gyrus (Figure 2A and Table 6).
Positive rewards activated the bilateral NAcc, pallidum, putamen, thalamus, medial OFC, pregenual cingulate cortex, SMA, and PCC (Figure 2B and Table 7, also see Supplementary Materials – Figure S2A). Activation by negative rewards was found in the bilateral NAcc and anterior insula, pallidum, ACC, SMA, and middle/inferior frontal gyrus (Figure 2B and Table 7, also see Supplementary Materials – Figure S2B). Direct contrast between positive and negative rewards revealed preferential activation by positive rewards in the NAcc, pallidum, medial OFC, and PCC, and greater activation by negative rewards in ACC and middle/inferior frontal gyrus (Figure 2D and Table 9).
Different reward processing stages similarly activated the NAcc and ACC whereas they differentially recruited other brain areas such as medial OFC, anterior insula, and amygdala (Figure 2C and Table 8, also see Supplementary Materials – Figure S2C–E). Reward anticipation, as compared to reward outcome, revealed significant activation in the bilateral anterior insula, thalamus, precentral gyrus, and inferior parietal lobule (Figure 2E and Table 10). No brain area showed greater activation by reward outcome in comparison to anticipation.
3.3 Comparison of ALE and PVM results
The current study also showed that although the ALE and PVM methods treat coordinate-based data differently and adopt distinct estimation algorithms, the results for a single list of coordinates from the two meta-analysis approaches were very similar and comparable (Figures 1A–C and 2A–C, Table 11, also see Figures S1 and S2 in the Supplementary Materials). The improved ALE algorithm implemented in GingerALE 2.0, by design, treats experiments (or contrasts) as the random-effects factor, which significantly reduces the bias toward experiments reporting more foci over those reporting fewer. Different studies, however, include different numbers of experiments/contrasts. Therefore, the results of GingerALE 2.0 may still be affected by a bias that weighs studies reporting more contrasts more heavily, potentially overestimating cross-study concordance. By choice, however, users can combine coordinates from different contrasts so that GingerALE 2.0 treats each study as a single experiment. This is what PVM implements: pooling coordinates from all contrasts within a study into a single activation map, thus weighing all studies equally when estimating activation overlap across studies.
In contrast, the comparison of two lists of coordinates differed significantly between the ALE and PVM approaches (Table 11), as a result of their differences in sensitivity to within-study and cross-study convergence. Since the improved ALE algorithm has not been implemented for subtraction analyses, we used an earlier version, GingerALE 1.2, which treats the coordinates as the random-effects factor and experiments as the fixed-effects variable. Differences in both the numbers of coordinates and experiments between two lists may therefore affect the subtraction results. The subtractive ALE analysis was biased toward the list with more experiments (Figure 1D/E). Positive reward studies (2167 foci from 283 experiments) clearly predominated over negative reward studies (935 foci from 140 experiments). The difference between reward anticipation (1553 foci from 185 experiments) and outcome (1977 foci from 253 experiments) was smaller, but could also have caused the bias toward the outcome phase. On the other hand, the use of the Fisher test by PVM to estimate the odds ratio and assign voxels to one of the two lists seemed to be less sensitive in detecting activation differences between the two lists (Figure 2D/E).
We are constantly making decisions in our everyday life. Some decisions involve no apparent positive or negative outcome values, whereas others have significant impacts on the valence of the results and our emotional responses toward the choices we make. We may feel happy and satisfied when the outcome is positive or our expectation is fulfilled, or feel frustrated when the outcome is negative or lower than what we anticipated. Moreover, many decisions must be made without advance knowledge of their consequences. We therefore need to be able to make predictions about future rewards, and to evaluate the reward value and the potential risk of obtaining it or being penalized. This requires us to evaluate the choices we make based on the presence of prediction errors and to use these signals to guide our learning and future behaviors. Many neuroimaging studies have examined reward-related decision making. However, given the complex and heterogeneous psychological processes involved in value-based decision making, it is no trivial task to examine the neural networks that subserve the representation and processing of reward-related information. We have observed rapid growth in the number of empirical studies in the field of neuroeconomics, yet thus far it has been hard to see how these studies converge to clearly delineate the reward circuitry in the human brain. In the current meta-analysis study, we have shown concordance across a large number of studies and revealed the common and distinct patterns of brain activation by different aspects of reward processing. In a data-driven fashion, we pooled all coordinates from the different contrasts/experiments of 142 studies and observed a core reward network consisting of the NAcc, lateral/medial OFC, ACC, anterior insula, dorsomedial PFC, and lateral frontoparietal areas.
A recent meta-analysis study focusing on risk assessment in decision making reported a similar reward circuitry (Mohr et al., 2010). In addition, from a theory-driven perspective, we contrasted neural networks that were involved in positive and negative valence across anticipation and outcome stages of reward processing, and elucidated distinct neural substrates subserving valence-related assessment as well as their preferential involvement in anticipation and outcome.
4.1 Core reward areas: NAcc and OFC
The NAcc and OFC have long been conceived as the major players in reward processing because they are the main projection areas of two distinct dopaminergic pathways, the mesolimbic and mesocortical pathways, respectively. However, it remains unknown how dopamine neurons distinctively modulate activity in these limbic and cortical areas. Previous studies have tried to differentiate the roles of these two structures in terms of temporal stages, associating the NAcc with reward anticipation and relating the medial OFC to receipt of reward (Knutson et al., 2001b; Knutson et al., 2003; Ramnani et al., 2004). Results from other studies questioned such a distinction (Breiter et al., 2001; Delgado et al., 2005; Rogers et al., 2004). Many studies also implied that the NAcc was responsible for detecting prediction error, a crucial signal in incentive learning and reward association (McClure et al., 2003; O’Doherty et al., 2003b; Pagnoni et al., 2002). Studies also found that the NAcc showed a biphasic response, such that activity in the NAcc would decrease and drop below the baseline in response to negative prediction errors (Knutson et al., 2001b; McClure et al., 2003; O’Doherty et al., 2003b). Although the OFC usually displays similar patterns of activity as the NAcc, previous neuroimaging studies in humans have suggested that the OFC serves to convert a variety of stimuli into a common currency in terms of their reward values (Arana et al., 2003; Cox et al., 2005; Elliott et al., 2010; FitzGerald et al., 2009; Gottfried et al., 2003; Kringelbach et al., 2003; O’Doherty et al., 2001; Plassmann et al., 2007). These findings paralleled those obtained from single cell recording and lesion studies in animals (Schoenbaum and Roesch, 2005; Schoenbaum et al., 2009; Schoenbaum et al., 2003; Schultz et al., 2000; Tremblay and Schultz, 1999, 2000; Wallis, 2007).
Our overall analyses showed that the NAcc and OFC responded to general reward processing (Figure 1A and Figure 2A). Activation in the NAcc largely overlapped across different stages, whereas the medial OFC was more tuned to reward receipt (Figure 1C/E and Figure 2C). These findings highlighted that the NAcc may be responsible for tracking both positive and negative signals of reward and using them to modulate learning of reward association, whereas the OFC mostly monitors and evaluates reward outcomes. Further investigation is needed to better differentiate the roles of the NAcc and OFC in reward-related decision making (Frank and Claus, 2006; Hare et al., 2008).
4.2 Valence-related assessment
In addition to converting various reward options into a common currency and representing their reward values, distinct brain regions in the reward circuitry may separately encode the positive and negative valences of reward. Direct comparisons across reward valence revealed that both the NAcc and medial OFC were more active in response to positive than to negative rewards (Figure 1B/D and Figure 2B/D). In contrast, the anterior insular cortex was involved in the processing of negative reward information (Figure 1B and Figure 2B). These results confirmed the medial-lateral distinction for positive versus negative rewards (Kringelbach, 2005; Kringelbach and Rolls, 2004), and were consistent with what we observed in our previous study on a reward task (Liu et al., 2007). Sub-regions of the ACC responded differentially to positive and negative rewards: the pregenual and rostral ACC, close to the medial OFC, were activated by positive rewards, whereas the caudal ACC responded to negative rewards (Figure 1B and Figure 2B). ALE and PVM meta-analyses also revealed that the PCC was consistently activated by positive rewards (Figure 1B and Figure 2B).
Interestingly, the separate networks encoding positive and negative valences resemble the distinction between two anti-correlated networks, the default-mode network and the task-related network (Fox et al., 2005; Raichle et al., 2001; Raichle and Snyder, 2007). Recent meta-analyses found that the default-mode network mainly involved the medial prefrontal regions (including the medial OFC) and the medial posterior cortex (including the PCC and precuneus), whereas the task-related network included the ACC, insula, and lateral frontoparietal regions (Laird et al., 2009; Toro et al., 2008). Activation of the medial OFC and PCC by positive rewards mirrored the default-mode network commonly observed during the resting state, whereas activation of the ACC, insula, and lateral prefrontal cortex by negative rewards paralleled the task-related network. This intrinsic functional organization of the brain has been found to influence reward and risky decision making and to account for individual differences in risk-taking traits (Cox et al., 2010).
4.3 Anticipation versus outcome
The bilateral anterior insula, ACC/SMA, inferior parietal lobule, and brain stem showed more consistent activation during anticipation than during the outcome phase (Figure 1C/E and Figure 2C/E). The anterior insula and ACC have previously been implicated in interoception, emotion, and empathy (Craig, 2002, 2009; Gu et al., 2010; Phan et al., 2002), as well as in risk and uncertainty assessment (Critchley et al., 2001; Kuhnen and Knutson, 2005; Paulus et al., 2003), consistent with their role in anticipation. The anterior insula was consistently involved in risk processing, especially in anticipation of loss, as revealed by a recent meta-analysis (Mohr et al., 2010). Similar to the role of the OFC, the parietal lobule has been associated with valuation of different options (Sugrue et al., 2005), numerical representation (Cohen Kadosh et al., 2005; Hubbard et al., 2005), and information integration (Gold and Shadlen, 2007; Yang and Shadlen, 2007). The involvement of the parietal lobule in the anticipation stage of reward processing may thus serve to plan and prepare for an informed action (Andersen and Cui, 2009; Lau et al., 2004a; Lau et al., 2004b).
On the other hand, the ventral striatum, medial OFC, and amygdala showed preferential activation during reward outcome in comparison to the anticipation stage (Figure 1C/E and Figure 2C). These patterns were consistent with what we and other investigators have found previously (Breiter et al., 2001; Delgado et al., 2005; Liu et al., 2007; Rogers et al., 2004), arguing against the functional dissociation between the ventral striatum and medial OFC in terms of their respective roles in reward anticipation and reward outcome (Knutson et al., 2001a; Knutson et al., 2001b; Knutson et al., 2003).
4.4 A schematic illustration of reward processing
Based on the findings of common and distinct networks involved in various aspects of reward decision making, we propose a schematic illustration to summarize the distributed representations of valuation and valence in reward processing (Figure 3). We tentatively group different brain regions based on their roles in different processes, although each region may serve multiple functions and interact with other brain areas in a far more complex way. When facing alternative choices, each of which has distinctive characteristics such as magnitude and probability, these properties need to be converted into comparable value-based information, a “common currency”. Not only do we compare the values of these alternative choices, but we also compare the factual and projected values, as well as the fictive values associated with the un-chosen option (e.g., the prediction error signal). The ventral striatum and medial OFC have been implicated in this value-based representation. The inferior parietal lobule has also been found to be involved in representing and comparing numerical information. In addition, value-based decision making inevitably results in evaluation of the choices, based on the valence of the outcomes and the associated emotional responses. While the ventral striatum and medial OFC are also involved in detecting positive reward valence, the lateral OFC, anterior insula, ACC, and amygdala are mostly implicated in processing negative reward valence, most likely linked to their evaluative roles in negative emotional responses. Because of the negative affect usually associated with risk, the anterior insula and ACC are also involved in reward anticipation of risky decisions, especially for uncertainty-averse responses in anticipation of loss. Finally, the frontoparietal regions serve to integrate and act upon these signals in order to produce optimal decisions (e.g., win-stay/lose-switch).
A couple of methodological caveats should be noted. The first relates to reporting bias across studies. Purely ROI-based studies were excluded from the current analyses; still, other studies singled out or placed more emphasis on a priori regions by reporting more coordinates or contrasts related to those regions, which could bias the results toward confirming the “hotspots”. Secondly, we want to caution against an overly rigid conceptual distinction between different aspects of reward processing. We classified various contrasts into different categories of theoretical interest. However, in real-life decisions or in many experimental tasks, these aspects do not necessarily have clear divisions. For example, evaluation of the previous choice and reward outcome may intermingle with upcoming reward anticipation and decision making. There is no clear boundary between different stages of reward processing, leaving our current classification open for discussion. Nonetheless, this hypothesis-driven approach is greatly needed (Caspers et al., 2010; Mohr et al., 2010; Richlan et al., 2009), as it complements the data-driven nature of meta-analysis. Many factors related to reward decision making, such as risk assessment and types of reward (e.g., primary vs. secondary, monetary vs. social), call for additional meta-analyses.
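The core logic of coordinate-based meta-analysis such as ALE can be conveyed in a short sketch. The following toy implementation is illustrative only: the grid size, kernel width, and foci are hypothetical, and a real analysis (e.g., with GingerALE) works in standardized brain space, uses empirically derived sample-size-dependent kernels, and assesses significance by permutation testing. Each study's reported foci are modeled as Gaussian kernels, each study contributes one modeled-activation (MA) map, and the ALE value at a voxel is the probabilistic union of the MA maps across studies:

```python
import numpy as np

def ale_map(studies, shape=(20, 20, 20), sigma=2.0):
    """Toy activation likelihood estimation (ALE) over a small voxel grid.

    studies : list of (n_foci, 3) arrays of voxel coordinates, one per study.
    Each study yields a modeled-activation (MA) map: the voxel-wise maximum
    of Gaussian kernels centered on its foci. The ALE score is the
    probability that at least one study activates a voxel, assuming
    independence across studies: ALE = 1 - prod_i(1 - MA_i).
    """
    # Coordinate grid of shape (x, y, z, 3)
    grid = np.stack(
        np.meshgrid(*[np.arange(s) for s in shape], indexing="ij"), axis=-1
    )
    prod_term = np.ones(shape)
    for foci in studies:
        ma = np.zeros(shape)
        for focus in foci:
            d2 = ((grid - np.asarray(focus)) ** 2).sum(axis=-1)
            # Gaussian kernel; take the max so overlapping foci
            # within one study are not double-counted
            ma = np.maximum(ma, np.exp(-d2 / (2 * sigma ** 2)))
        prod_term *= 1.0 - ma
    return 1.0 - prod_term

# Two hypothetical studies reporting nearby foci (voxel coordinates)
studies = [np.array([[10, 10, 10]]),
           np.array([[11, 10, 10], [3, 3, 3]])]
ale = ale_map(studies)
```

Voxels near foci reported by multiple studies receive ALE scores approaching 1, while isolated voxels stay near 0, which is why converging results across the 142 studies surface as the "hotspots" discussed above.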
We conducted two sets of coordinate-based meta-analyses on 142 fMRI studies of reward.
The core reward circuitry included the nucleus accumbens, insula, orbitofrontal, cingulate, and frontoparietal regions.
The nucleus accumbens was activated by both positive and negative rewards across various reward processing stages.
Other regions showed preferential responses toward positive or negative rewards, or during anticipation or outcome.
This study was supported by the Hundred-Talent Project of the Chinese Academy of Sciences, a NARSAD Young Investigator Award (XL), and NIH Grant R21MH083164 (JF). The authors wish to thank the development team of BrainMap and Sergi G. Costafreda for providing excellent tools for this study.
List of articles included in the meta-analyses of the current study.
Author contributions: XL designed and supervised the whole study. JH and MS made equal contributions to this study, performing literature search, data extraction and organization. JF participated in discussion and manuscript preparation.