Dissociable Contributions by Prefrontal D1 and D2 Receptors to Risk-Based Decision Making (2011)

J Neurosci. 2011 Jun 8;31(23):8625-33.

St Onge JR, Abhari H, Floresco SB.

Source

Department of Psychology and Brain Research Center, University of British Columbia, Vancouver, British Columbia V6T 1Z4, Canada

Abstract

Choices between certain and uncertain rewards of different magnitudes have been proposed to be mediated by both the frontal lobes and the mesocorticolimbic dopamine (DA) system. In rats, systemic manipulations of DA activity or inactivation of the medial prefrontal cortex (PFC) disrupt decision making about risks and rewards. However, it is unclear how PFC DA transmission contributes to these processes. We addressed this issue by examining the effects of pharmacological manipulations of D1 and D2 receptors in the medial (prelimbic) PFC on choice between small, certain and large, yet probabilistic rewards. Rats were trained on a probabilistic discounting task where one lever delivered one pellet with 100% probability, and the other delivered four pellets, but the probability of receiving reward decreased across blocks of trials (100, 50, 25, 12.5%). D1 blockade (SCH23390) in the medial PFC decreased preference for the large/risky option. In contrast, D2 blockade (eticlopride) reduced probabilistic discounting and increased risky choice. The D1 agonist SKF81297 caused a slight, nonsignificant increase in preference for the large/risky lever. However, D2 receptor stimulation (quinpirole) induced a true impairment in decision making, flattening the discounting curve and biasing choice away from or toward the risky option when it was more or less advantageous, respectively. These findings suggest that PFC D1 and D2 receptors make dissociable, yet complementary, contributions to risk/reward judgments. By striking a fine balance between D1/D2 receptor activity, DA may help refine these judgments, promoting either exploitation of current favorable circumstances or exploration of more profitable ones when conditions change.

Introduction

Aberrations of the mesocorticolimbic dopamine (DA) system have been linked to profound deficits in decision making associated with certain psychiatric diseases. These include schizophrenia (Hutton et al., 2002), Parkinson’s disease (Pagonabarraga et al., 2007), and stimulant addiction (Rogers et al., 1999). Animal models of decision making have revealed that manipulations of DA transmission can profoundly alter choices between small, easy-to-obtain rewards and large, yet more costly rewards. Systemic blockade of D1 or D2 receptors reduces the preference to wait longer or work harder to obtain a larger reward, whereas increasing DA transmission exerts differential effects on effort- or delay-based decision making, increasing or decreasing preference for larger rewards that come with a greater cost (Cousins et al., 1994; Cardinal et al., 2000; Denk et al., 2005; van Gaalen et al., 2006; Floresco et al., 2008a; Bardgett et al., 2009). Similarly, when rats choose between small, certain and large, yet risky rewards on a probabilistic discounting task, systemic administration of D1 or D2 antagonists reduces preference for large, risky options (St. Onge and Floresco, 2009). Conversely, D1 or D2 agonists bias choice toward large, risky options. However, given that numerous brain regions have been implicated in risk/reward judgments (e.g., frontal lobes, ventral striatum, amygdala) (Floresco et al., 2008b), the terminal regions on which DA may be acting to influence these processes remain unclear.

DA modulates multiple cognitive functions mediated by different regions of the prefrontal cortex (PFC), such as behavioral flexibility, working memory, and attentional processes (Williams and Goldman-Rakic, 1995; Granon et al., 2000; Chudasama and Robbins, 2004; Floresco et al., 2006), often in an “inverted U”-shaped curve, where too little or too much DA activity impairs certain executive functions. However, there have been comparatively few studies investigating the contribution of PFC DA transmission to different forms of cost/benefit decision making. Reducing DA activity in the anterior cingulate alters effort-based decisions (Schweimer et al., 2005; Schweimer and Hauber, 2006), whereas blockade or stimulation of medial PFC D1 receptors reduces preference for larger, delayed rewards (Loos et al., 2010). Notably, there have been no studies investigating the contribution of different PFC DA receptors to risk-based decision making.

Recent work has identified the prelimbic medial PFC as a critical region in the mediation of probabilistic discounting, whereas activity in other subregions (anterior cingulate, orbitofrontal, insular) does not appear to contribute to this behavior (St. Onge and Floresco, 2010). Inactivation of the medial PFC increased preference for larger, probabilistic rewards when the odds of obtaining them decreased over a session, but decreased choice when reward probabilities increased over a session. The results of this study led us to conclude that the medial PFC serves to integrate information about changing reward probabilities to update value representations that facilitate more efficient decision making. Given the critical role that mesocortical DA plays in other forms of cognition (Floresco and Magyar, 2006), the present study investigated the contribution of prefrontal D1/D2 receptor activity to risk-based decision making using a probabilistic discounting task.

Materials and Methods

Animals.

Male Long–Evans rats (Charles River Laboratories) weighing 275–300 g at the beginning of behavioral training were used for the experiment. Upon arrival, rats were given 1 week to acclimatize to the colony and food was restricted to 85–90% of their free-feeding weight for an additional week before behavioral training. Rats were given ad libitum access to water for the duration of the experiment. Feeding occurred in the rats’ home cages at the end of the experimental day, and body weights were monitored daily to ensure a steady weight loss during food restriction and maintenance or weight gain for the rest of the experiment. All testing was in accordance with the Canadian Council of Animal Care and the Animal Care Committee of the University of British Columbia.

Apparatus.

Behavioral testing was conducted in 12 operant chambers (30.5 × 24 × 21 cm; Med Associates) enclosed in sound-attenuating boxes, each equipped with a fan to provide ventilation and to mask extraneous noise. Each chamber was fitted with two retractable levers, one located on each side of a central food receptacle where food reinforcement (45 mg; Bio-Serv) was delivered via a pellet dispenser. The chambers were illuminated by a single 100 mA house light located in the top-center of the wall opposite the levers. Four infrared photobeams were mounted on the sides of each chamber. Locomotor activity was indexed by the number of photobeam breaks that occurred during a session. All experimental data were recorded by an IBM personal computer connected to the chambers via an interface.

Lever-pressing training.

Our initial training protocols were identical to those of St. Onge and Floresco (2009), as adapted from Cardinal et al. (2000). On the day before their first exposure to the chambers, rats were given ∼25 sugar reward pellets in their home cage. On the first day of training, 2–3 pellets were delivered into the food cup and crushed pellets were placed on a lever before the animal was placed in the chamber. Rats were first trained under a fixed-ratio 1 schedule to a criterion of 60 presses in 30 min, first for one lever, and then repeated for the other lever (counterbalanced left/right between subjects). Rats were then trained on a simplified version of the full task. These 90-trial sessions began with the levers retracted and the operant chamber in darkness. Every 40 s, a trial was initiated with the illumination of the house light and the insertion of one of the two levers into the chamber. If the rat failed to respond on the lever within 10 s, the lever was retracted, the chamber darkened, and the trial was scored as an omission. If the rat responded within 10 s, the lever retracted and a single pellet was delivered with 50% probability. This procedure was used to familiarize the rats with the probabilistic nature of the full task. In every pair of trials, the left or right lever was presented once, and the order within the pair of trials was random. Rats were trained for ∼5–6 d to a criterion of 80 or more successful trials (i.e., ≤10 omissions).

Probabilistic discounting task.

The primary task used in these studies has been described previously (Floresco and Whelan, 2009; Ghods-Sharifi et al., 2009; St. Onge and Floresco, 2009, 2010; St. Onge et al., 2010), and was originally modified from that described by Cardinal and Howes (2005) (Fig. 1). Briefly, rats received daily sessions consisting of 72 trials, separated into 4 blocks of 18 trials. The entire session took 48 min to complete, and animals were trained 6–7 d per week. A session began in darkness with both levers retracted (the intertrial state). A trial began every 40 s with the illumination of the house light and, 3 s later, insertion of one or both levers into the chamber (the format of a single trial is shown in Fig. 1). One lever was designated the large/risky lever, the other the small/certain lever, which remained consistent throughout training (counterbalanced left/right). If the rat did not respond by pressing a lever within 10 s of lever presentation, the chamber was reset to the intertrial state until the next trial (omission). When a lever was chosen, both levers retracted. Choice of the small/certain lever always delivered one pellet with 100% probability; choice of the large/risky lever delivered 4 pellets but with a particular probability. When food was delivered, the house light remained on for another 4 s after a response was made, after which the chamber reverted back to the intertrial state. Multiple pellets were delivered 0.5 s apart. Each of the 4 blocks comprised 8 forced-choice trials in which only one lever was presented (4 trials for each lever, randomized in pairs), permitting animals to learn about the relative likelihood of receiving the larger or smaller reward in each block. This was followed by 10 free-choice trials, in which both levers were presented and the animal chose either the small/certain or the large/risky lever.
The probability of obtaining 4 pellets after pressing the large/risky lever varied across blocks: it was initially 100%, then 50%, 25%, and 12.5%, respectively, for each successive block. The probability of receiving the large reward on each trial was drawn from a set probability distribution. Using these probabilities, selection of the large/risky lever would be advantageous in the first two blocks, and disadvantageous in the last block, whereas rats could obtain an equivalent number of food pellets after responding on either lever during the 25% block. Therefore, in the last three trial blocks of this task, selection of the larger reward option carries with it an inherent “risk” of not obtaining any reward on a given trial. Latencies to initiate a choice and overall locomotor activity (photobeam breaks) were also recorded. Rats were trained on the task until, as a group, they (1) chose the large/risky lever during the first trial block (100% probability) on at least 80% of successful trials, and (2) demonstrated stable baseline levels of choice, assessed using a procedure similar to that described by Winstanley et al. (2005) and St. Onge and Floresco (2009). In brief, data from three consecutive sessions were analyzed with repeated-measures ANOVA with two within-subject factors (day and trial block). If the effect of block was significant at the p < 0.05 level but there was no main effect of day or day × trial block interaction (at p > 0.1 level), animals were judged to have achieved stable baseline levels of choice behavior.
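The relative utility of the two levers in each block follows from simple expected-value arithmetic. The sketch below is illustrative only: the reward sizes and probabilities come from the task description above, whereas the function and variable names are hypothetical and are not part of the original methods.

```python
# Expected pellets per choice of each lever, by probability block.
# Reward sizes and probabilities are from the task description;
# all names here are illustrative, not taken from the original study.
from fractions import Fraction

SMALL_PELLETS = 1   # small/certain lever: 1 pellet on every choice
LARGE_PELLETS = 4   # large/risky lever: 4 pellets, delivered probabilistically
BLOCK_PROBABILITIES = [Fraction(1), Fraction(1, 2), Fraction(1, 4), Fraction(1, 8)]


def expected_pellets(pellets, probability):
    """Long-run average number of pellets earned per choice of a lever."""
    return pellets * probability


def classify_block(probability):
    """Compare the risky lever's expected payoff with that of the certain lever."""
    risky = expected_pellets(LARGE_PELLETS, probability)
    certain = expected_pellets(SMALL_PELLETS, Fraction(1))
    if risky > certain:
        return "risky advantageous"
    if risky < certain:
        return "risky disadvantageous"
    return "equivalent"


for p in BLOCK_PROBABILITIES:
    risky_value = float(expected_pellets(LARGE_PELLETS, p))
    print(f"{float(p) * 100:g}% block: risky = {risky_value:g} pellets, "
          f"certain = {SMALL_PELLETS} pellet -> {classify_block(p)}")
```

Under these contingencies the large/risky lever yields 4, 2, 1, and 0.5 pellets per choice on average across the four blocks, which matches the description of the 25% block as the point of equivalence.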

Figure 1.

Task design. Cost/benefit contingencies associated with responding on either lever (A) and format of a single free-choice trial (B) on the probabilistic discounting task.

Reward magnitude discrimination task.

As we have done previously (Ghods-Sharifi et al., 2009; Stopper and Floresco, 2011), we determined a priori that if a particular treatment specifically decreased preference for the large/risky lever on the probabilistic discounting task, separate groups of animals would be trained and tested on a reward magnitude discrimination task to determine whether this effect was due to an impairment in discriminating between reward magnitudes associated with the two levers. In these experiments, rats were trained to press retractable levers as in the probabilistic discounting task, after which they were trained on the discrimination task. Here, rats chose between one lever that delivered one pellet and another that delivered four pellets. Both the small and large rewards were delivered immediately after a single response with 100% probability. A session consisted of four blocks of trials, with each block consisting of 2 forced-choice followed by 10 free-choice trials.

Surgery.

Rats were subjected to surgery once the group displayed stable patterns of choice for 3 consecutive days. After the stability criterion was achieved, rats were provided food ad libitum and, 2 d later, underwent stereotaxic surgery. Rats were anesthetized with 100 mg/kg ketamine hydrochloride and 7 mg/kg xylazine and subsequently implanted with bilateral 23 gauge stainless steel guide cannulae into the prelimbic region of the medial PFC (flat skull; anteroposterior, +3.4 mm; medial-lateral, ±0.7 mm from bregma; and dorsoventral, −2.8 mm from dura). Thirty gauge obturators, flush with the end of guide cannulae, remained in place until the infusions were made. Rats were given at least 7 d to recover from surgery before testing. During this recovery period, animals were handled for at least 5 min each day and food was restricted to 85% of their free-feeding weight. Body weights were monitored daily to ensure a steady weight loss during this recovery period.

Microinfusion protocol.

Following recovery from surgery, rats were retrained on either the probabilistic discounting or reward magnitude discrimination task for at least 5 d and until, as a group, they displayed stable levels of choice behavior. For 3 d before the first microinfusion test day, obturators were removed and a mock infusion procedure was administered. Stainless steel injectors were placed in the guide cannulae for 2 min, but no infusion took place. This procedure habituated rats to the routine of infusions to reduce stress on subsequent test days. The day after displaying stable discounting, the group received its first microinfusion test day.

A within-subjects design was used for all experiments. The following drugs were used: the D1 antagonist R-(+)-SCH23390 hydrochloride (1.0 μg, 0.1 μg; Sigma-Aldrich), the D2 antagonist eticlopride hydrochloride (1.0 μg, 0.1 μg; Sigma-Aldrich), the D1 receptor agonist SKF81297 (0.4 μg, 0.1 μg; Tocris Bioscience), and the D2 agonist quinpirole (10 μg, 1 μg; Sigma-Aldrich). All drugs were dissolved in physiological 0.9% saline, sonicated until dissolved, and protected from light. The selected doses have all been well documented by both our group and others to be behaviorally active when given intracerebrally (Seamans et al., 1998; Ragozzino, 2002; Chudasama and Robbins, 2004; Floresco and Magyar, 2006; Floresco et al., 2006; Haluk and Floresco, 2009; Loos et al., 2010).

Infusions of the D1 and D2 antagonists, agonists, and saline were administered bilaterally into the medial PFC via a microsyringe pump connected to PE tubing and 30 gauge cannulae that protruded 0.8 mm past the end of the guide, at a rate of 0.5 μl/75 s. Injection cannulae were left in place for an additional 1 min to allow for diffusion. Each rat remained in its home cage for another 10 min period before behavioral testing.

Four separate groups of rats were used to test the effects of each of the four compounds (D1 antagonist, D2 antagonist, D1 agonist, D2 agonist). The order of treatments (saline, low dose, high dose) was counterbalanced across rats within a particular treatment group. Following the first infusion test day, rats received a baseline training day (no infusion). If, for any individual rat, choice of the large/risky lever on this day deviated by >15% from its preinfusion baseline, the rat received an additional day of training before the second infusion test. On the next day, rats received a second counterbalanced infusion, followed by another baseline day, and finally the last infusion.
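The baseline check applied between infusion test days can be illustrated with a brief sketch. It assumes that the 15% criterion refers to an absolute difference, in percentage points, between large/risky choice on the post-infusion baseline day and the preinfusion baseline; the text does not specify this, and the function name and example values are hypothetical.

```python
# Hypothetical helper illustrating the between-infusion baseline check.
# Assumption (not stated in the text): ">15%" means an absolute difference
# in percentage points of large/risky choice, not a relative change.
DEVIATION_THRESHOLD = 15.0


def needs_extra_training_day(preinfusion_pct, baseline_day_pct,
                             threshold=DEVIATION_THRESHOLD):
    """Return True when baseline-day choice strays beyond the threshold."""
    return abs(baseline_day_pct - preinfusion_pct) > threshold


# Made-up example values for two rats.
print(needs_extra_training_day(62.5, 80.0))  # deviation of 17.5 points
print(needs_extra_training_day(62.5, 70.0))  # deviation of 7.5 points
```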

Histology.

After completion of all behavioral testing, rats were killed in a carbon dioxide chamber. Brains were removed and fixed in a 4% formalin solution. The brains were frozen and sliced in 50 μm sections before being mounted and stained with cresyl violet. Placements were verified with reference to the neuroanatomical atlas of Paxinos and Watson (1998). The locations of acceptable infusions in the medial PFC are presented in the right panels of Figure 2.

Figure 2.

Histology. Schematic of coronal sections of the rat brain showing the range of acceptable locations of infusions through the rostral-caudal extent of the medial PFC for all rats.

Data analysis.

The primary dependent measure of interest was the percentage of choices directed toward the large/risky lever for each block of free-choice trials, factoring in trial omissions. For each block, this was calculated by dividing the number of choices of the large/risky lever by the total number of successful trials. The choice data for each drug group were analyzed using two-way within-subject ANOVAs, with treatment (saline, low dose, high dose) and trial block (100, 50, 25, 12.5%) as the within-subject factors. The main effect of block for the choice data was significant in all discounting experiments (p < 0.05), indicating that rats discounted choice of the large/risky lever as the probability of the large reward changed across the four blocks. This effect will not be mentioned further. Response latencies, locomotor activity (photobeam breaks), and the number of trial omissions were analyzed with one-way ANOVAs.
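As a minimal sketch of the primary dependent measure described above, the percentage of large/risky choices in a block can be computed from the completed (non-omitted) free-choice trials only. The trial labels and function name below are illustrative assumptions, not the authors' analysis code.

```python
# Percentage choice of the large/risky lever for one block of free-choice trials.
# Omitted trials are excluded from the denominator, as described in the text.
# The string labels used for trial outcomes are hypothetical.
def percent_risky_choice(choices):
    """choices: list of 'risky', 'certain', or 'omission', one entry per trial.

    Returns the percentage of completed trials on which the large/risky lever
    was chosen, or None when every trial in the block was omitted.
    """
    risky = choices.count("risky")
    completed = risky + choices.count("certain")
    if completed == 0:
        return None  # no successful trials, so the percentage is undefined
    return 100.0 * risky / completed


# Made-up block of 10 free-choice trials: 6 risky, 3 certain, 1 omission.
example_block = ["risky"] * 6 + ["certain"] * 3 + ["omission"]
print(round(percent_risky_choice(example_block), 1))
```

In this made-up block, the one omission is dropped and the measure is 6 of 9 completed trials, or about 66.7%.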


Results

Four groups of animals were initially trained in separate experiments and allocated to one of the four drug groups. The first two groups of 16 rats each, designated for the D1 and D2 antagonist experiments, required an average of 28 d of training before reaching stable choice performance and receiving counterbalanced microinfusion tests. The other two groups of 14 rats each, designated for the D1 and D2 agonist experiments, required an average of 34 d of training before reaching stable choice performance. Response latency, locomotor, and trial omission data obtained on test days for all four groups are presented in Table 1.

Table 1.

Locomotion, trial omission, and response latency data obtained following saline or drug infusions into the medial PFC

D1 and D2 receptor antagonism and probabilistic discounting

D1 blockade

Initially, 16 rats were trained for this experiment. One animal died during surgery and the data from three others were eliminated due to inaccurate placements, resulting in a final n = 12. Analysis of the choice data revealed that intra-PFC infusions of the D1 antagonist SCH23390 resulted in a significant main effect of treatment (F(2,22) = 3.26, p = 0.05) but no treatment × block interaction (F(6,66) = 0.92, n.s.). The high dose of SCH23390 (1 μg) significantly decreased preference for the large/risky lever in the latter three blocks (p < 0.05; Fig. 3A), whereas the low dose (0.1 μg) produced no reliable change in choice behavior. D1 blockade had no effect on response latencies (F(2,22) = 0.18, n.s.), trial omissions (F(2,22) = 0.54, n.s.), or locomotor counts (F(2,22) = 1.66, n.s.).

Figure 3.

Effects of DA receptor manipulations in the medial PFC on probabilistic discounting. Data are plotted in terms of percentage choice of the large/risky lever during free-choice trials by probability block (x-axis). Symbols represent mean + SEM. Gray stars denote a significant main effect (saline vs high dose, p < 0.05). Black stars denote a significant difference (p < 0.05) between treatment conditions during a particular probability block. A, Infusions of the 1.0 μg dose of D1 antagonist SCH23390 accelerated probabilistic discounting, reducing risky choice. B, In contrast, infusions of the 1.0 μg dose of the D2 antagonist eticlopride retarded discounting and increased risky choice. C, The D1 agonist SKF81297 induced a slight, nonsignificant increase in risky choice. D, Infusions of the 10 μg dose of the D2 agonist quinpirole abolished discounting, decreasing risky choice during the initial block and increasing choice during the final block.

D2 blockade

Initially, 16 rats were trained for this experiment. One animal died during surgery and the data from three others were eliminated due to inaccurate placements, resulting in a final n = 12. Analysis of the choice data also revealed a significant main effect of treatment (F(2,22) = 3.76, p < 0.05) but no treatment × block interaction (F(6,66) = 0.84, n.s.). However, in contrast to the effects of D1 receptor blockade, the high dose of eticlopride (1 μg) significantly increased preference for the large/risky lever across all blocks (p < 0.05; Fig. 3B), with the low dose (0.1 μg) producing a slight, but nonsignificant increase in choice. Eticlopride had no effect on response latencies (F(2,22) = 0.63, n.s.), trial omissions (F(2,22) = 1.45, n.s.), or locomotor counts (F(2,22) = 0.99, n.s.). Thus, blockade of D1 or D2 receptors in the medial PFC had qualitatively opposite effects on probabilistic discounting. Reducing D1 receptor activity increased discounting of larger, uncertain rewards, whereas D2 receptor antagonism reduced discounting, reflected as apparent decreases and increases in risky choice, respectively.

D1 and D2 receptor stimulation and probabilistic discounting

D1 stimulation

Initially, 14 rats were trained for this experiment. One animal died during surgery and the data from one rat were excluded because his baseline choice data were 2 SDs below the mean of the rest of the group, resulting in a final n = 12. Following administration of the D1 agonist SKF81297 into the medial PFC, rats tended to show an effect opposite to that induced by the D1 antagonist, displaying a moderate increase in preference for the large/risky lever, with this effect being numerically greater after treatment with the lower, 0.1 μg dose. Despite this tendency, analysis of the choice data did not reveal a significant effect of treatment (F(2,22) = 2.05, n.s.) or treatment × block interaction (F(6,66) = 0.10, n.s.; Fig. 3C), although a direct comparison between the low-dose and saline treatment conditions did show a trend toward statistical significance (p = 0.086). The D1 agonist also had no effect on response latencies (F(2,22) = 0.67, n.s.), trial omissions (F(2,22) = 0.06, n.s.), or locomotor counts (F(2,22) = 0.36, n.s.).

D2 stimulation

Again, 14 rats were trained for this experiment. The data from one rat were excluded because his baseline choice data showed no prominent discounting after the 34 d of training, while the data pertaining to another rat were eliminated due to an inaccurate placement, resulting in a final n = 12 in this group. Treatment with the D2 agonist quinpirole induced an effect on choice that was unique when compared with that induced by either DA receptor antagonist or the D1 agonist. Analysis of the choice data revealed no significant main effect of treatment (F(2,22) = 0.05, n.s.), but there was a significant treatment × block interaction (F(6,66) = 2.33, p < 0.05, Dunnett’s p < 0.05). Simple main effects analyses further showed that, whereas the low dose (1 μg) of quinpirole had no effect on choice, the high dose (10 μg) produced a pronounced “flattening” of the discounting curve. Specifically, this dose significantly (p < 0.05) decreased choice of the large/risky lever in the initial 100% block, but significantly increased risky choice during the last block (12.5%) relative to saline infusions (Fig. 3D). Moreover, following infusions of either saline or the 1.0 μg dose of quinpirole, rats showed significant discounting of the large/risky option as the odds of obtaining the larger reward decreased over a session (p < 0.005). In contrast, the proportion of choice of this option did not significantly change across the four blocks after treatment with 10 μg of quinpirole (p > 0.25). Quinpirole had no effect on trial omissions (F(2,22) = 0.84, n.s.) or locomotor counts (F(2,22) = 1.72, n.s.), although the high dose significantly increased choice latencies across the four blocks (F(2,22) = 3.54, p < 0.05 and Dunnett’s, p < 0.05; Table 1).

Win-stay/lose-shift analysis

Infusions of selective D1 or D2 receptor agonists or antagonists into the medial PFC each induced distinct effects on decision making. To obtain further insight into how these treatments affected patterns of choice and resulting alterations in discounting, we conducted a supplementary analysis of the choice data. Specifically, we conducted a choice-by-choice analysis to identify whether changes in behavior were due to alterations in the likelihood of choosing the risky lever after obtaining the larger reward (win-stay performance) or alterations in negative feedback sensitivity (lose-shift performance) (Bari et al., 2009; Stopper and Floresco, 2011). Animals’ choices during the task were analyzed according to the outcome of each preceding free-choice trial (reward or non-reward) and expressed as a ratio. The proportion of win-stay trials was calculated from the number of times the rat chose the large/risky lever after choosing the risky option on the preceding trial and obtaining the large reward (a win), divided by the total number of free-choice trials in which the rat obtained the larger reward. Conversely, lose-shift performance was calculated from the number of times rats shifted choice to the small/certain lever after choosing the risky option on the preceding trial and were not rewarded (a loss), divided by the total number of free-choice trials resulting in a loss.

Because of the probabilistic nature of the task, across the four experiments there were at least 4–5 instances where an individual animal either did not select the large/risky lever (and therefore, could not “stay” or “shift” after a win or loss) or did not obtain the large reward at all during a certain probability block (particularly the latter two blocks). Thus, in either of these cases, the denominator in the equation used to compute these ratios would be zero for at least one of the blocks, which precluded us from conducting a block-by-block analysis of these data. To overcome this, an analysis was conducted for all trials across the four blocks, as we have done previously (Stopper and Floresco, 2011). Changes in win-stay performance were used as a general index of the impact that obtaining the large, risky reward had on subsequent choice behavior, whereas changes in lose-shift performance served as an index of negative feedback sensitivity over the entire duration of the test session.
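The win-stay and lose-shift ratios described above might be computed along the following lines. This is a sketch under stated assumptions rather than the authors' procedure: it assumes that omitted trials have already been removed, that free-choice trials from all four blocks are pooled into one ordered sequence, and that only wins or losses followed by a subsequent trial enter the denominators. The data structure and function name are hypothetical.

```python
# Session-wide win-stay and lose-shift ratios from pooled free-choice trials.
# Assumptions (not specified in the text): omissions are removed beforehand,
# trials from all blocks form one ordered list, and a win or loss counts
# toward a denominator only if another free-choice trial follows it.
def win_stay_lose_shift(trials):
    """trials: ordered list of (choice, rewarded) tuples, where choice is
    'risky' or 'certain' and rewarded is True or False.

    Returns (win_stay, lose_shift); either is None if its denominator is zero.
    """
    wins = stays = losses = shifts = 0
    for (prev_choice, prev_rewarded), (next_choice, _) in zip(trials, trials[1:]):
        if prev_choice != "risky":
            continue  # only risky choices can produce a "win" or a "loss"
        if prev_rewarded:
            wins += 1
            stays += next_choice == "risky"
        else:
            losses += 1
            shifts += next_choice == "certain"
    win_stay = stays / wins if wins else None
    lose_shift = shifts / losses if losses else None
    return win_stay, lose_shift


# Made-up sequence of seven free-choice trials.
example_trials = [
    ("risky", True), ("risky", True), ("risky", False), ("certain", True),
    ("risky", False), ("risky", True), ("certain", True),
]
print(win_stay_lose_shift(example_trials))
```

For this made-up sequence, two of three wins are followed by another risky choice (win-stay = 2/3) and one of two losses is followed by a shift to the certain lever (lose-shift = 1/2). Returning None for an empty denominator mirrors the problem that motivated pooling across blocks.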

Given that each of the four compounds induced distinct effects on choice behavior, we were particularly interested in directly comparing the effects of each compound relative to saline treatment. For this analysis, we used data obtained following treatment with the most effective doses of each drug and corresponding vehicle injections (for SKF81297, we used data obtained after treatment with the lower, 0.1 μg dose). Analysis of win-stay and lose-shift trials revealed a significant four-way interaction of trial type (win-stay vs lose-shift) × treatment (saline vs drug) × receptor (D1 vs D2) × drug type (antagonist vs agonist) (F(1,44) = 11.92, p < 0.05; Fig. 4, Table 2). As was observed with analysis of overall choice behavior, this four-way interaction was driven by the fact that each drug induced a distinct effect on win-stay/lose-shift tendencies. With respect to win-stay performance, under control conditions, rats displayed a strong tendency (between 80 and 90%) to select the risky lever after selecting this lever on the preceding trial and receiving reward, as we have observed previously (Stopper and Floresco, 2011). Conversely, animals tended to shift to the small/certain lever following a “loss” after choosing the large/risky lever on ∼25–30% of these trials under control conditions.

Figure 4.

Effects of PFC DA receptor manipulations on win-stay (gray bars) and lose-shift (white bars) tendencies. For clarity and comparative purposes, the data are presented here in terms of a difference score between the ratios obtained on drug versus saline treatments (positive values indicate an increased ratio, negative values a decrease after drug treatment relative to control infusions). Raw data used in the overall analysis from which these values were obtained are presented in Table 2. Win-stay ratios index the proportion of trials for which rats chose the large/risky lever after receiving the larger reward on the previous trial. Lose-shift ratios index the proportion of trials for which rats shifted choice to the small/certain lever following unrewarded choice of the large/risky lever. Stars denote a significant difference from saline at the 0.05 level. n.s., not significant.

Table 2.

Win-stay/lose-shift ratios for rats performing the probabilistic discounting task following infusion of saline and the highest or most effective dose of D1 and D2 antagonist or agonists

Simple main effects analysis of the four-way interaction revealed that the D1 antagonist SCH23390 did not affect win-stay performance but did significantly increase lose-shift tendencies (Dunnett’s, p < 0.05), suggesting that the decrease in risky choice induced by these treatments may be attributable in part to increased sensitivity to negative feedback (i.e., reward omission). In contrast, D2 blockade with eticlopride (1 μg) significantly increased the probability of choosing the risky option following a “win” (p < 0.05), while causing a nonsignificant decrease in lose-shift tendencies. Thus, the increase in risky choice induced by D2 blockade appears to be attributable primarily to an enhanced impact of obtaining a large reward on subsequent choice.

The D1 agonist SKF81297 (0.1 μg) significantly increased win-stay performance versus saline (p < 0.05), but also had the opposite effect of SCH23390, reducing the tendency to shift after a loss from the large/risky lever (p < 0.05). In contrast, quinpirole (10 μg) had the opposite effect of the D1 agonist on win-stay tendencies, significantly decreasing the probability of choosing the large/risky lever after a “win” (p < 0.05), suggesting a reduced sensitivity to receipt of larger, yet uncertain rewards. This treatment had no significant effect on lose-shift ratios. These findings indicate that D1 vs D2 receptor modulation induces differential changes in choice performance that appear to be characterized by distinct changes in the impact of either obtaining the larger reward or negative feedback sensitivity.

Reward magnitude discrimination

Blockade of D1 receptors or stimulation of D2 receptors reduced preference for the larger, uncertain reward during certain trial blocks of the discounting task. To assess whether these effects were attributable to a general disruption in discriminating between rewards of different magnitudes, we conducted another experiment, wherein two separate groups of rats were trained on a simpler task. Rats chose between two levers that delivered either one or four pellets, both with 100% probability. Fifteen rats were trained for 11 d on this task before receiving counterbalanced microinfusions of the high dose of SCH23390 (1 μg) or quinpirole (10 μg) and saline. The data for one animal were removed due to an inaccurate placement, leaving a final n of 6 in the SCH23390 group and 8 in the quinpirole group.

D1 blockade

Following saline infusions, rats displayed a very strong bias toward the larger reward, selecting this option on nearly 100% of the trials (Fig. 5A). Following infusions of SCH23390 (1 μg), there was no change in preference toward the four-pellet option (F(1,5) = 1.72, n.s.). In contrast to choice, we did see a slight increase in response latencies following D1 blockade (saline = 0.81 ± 0.1 s, SCH23390 = 0.98 ± 0.1 s; F(1,5) = 7.18, p < 0.05). Locomotor activity (F(1,5) = 4.86, n.s.) and trial omissions (F(1,5) = 1.0, n.s.) were unaffected by SCH23390. Thus, even though infusions of this dose of SCH23390 reduced choice of the larger reward option during the probabilistic discounting task, this effect does not appear to be attributable to a general reduction in the subjective value of larger rewards.
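The F(1,5) statistics above have a single numerator degree of freedom because each rat was tested under two conditions (saline and drug). For a within-subject comparison with only two levels, the repeated-measures F statistic equals the square of the paired t statistic, with n - 1 denominator degrees of freedom. The Python sketch below illustrates that relationship under the assumption of a one-way design with treatment as the only factor. The latencies are invented values for six hypothetical rats, not data from this study, and `paired_f` is an illustrative helper name.

```python
import math
from statistics import mean, stdev
from typing import List, Tuple


def paired_f(cond_a: List[float], cond_b: List[float]) -> Tuple[float, Tuple[int, int]]:
    """F statistic and degrees of freedom for a two-level within-subject factor.

    Computed as the squared paired t statistic on the per-subject differences.
    """
    if len(cond_a) != len(cond_b):
        raise ValueError("each subject needs one value per condition")
    diffs = [b - a for a, b in zip(cond_a, cond_b)]
    n = len(diffs)
    t = mean(diffs) / (stdev(diffs) / math.sqrt(n))  # paired t statistic
    return t * t, (1, n - 1)


# Invented response latencies in seconds (hypothetical, for illustration only).
saline_latency = [0.70, 0.80, 0.90, 0.80, 0.75, 0.85]
drug_latency = [0.90, 0.95, 1.00, 1.05, 0.90, 1.10]

f_value, df = paired_f(saline_latency, drug_latency)
print(f"F({df[0]},{df[1]}) = {f_value:.2f}")
```

With six subjects the degrees of freedom come out as (1, 5), matching the form of the statistics reported for the SCH23390 group.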

Figure 5.

Effects of DA receptor modulation in the medial PFC on reward magnitude discrimination. Rats were trained to choose between two levers that delivered either a four- or one-pellet reward immediately after a single press with 100% probability. A, D1 blockade (SCH23390, 1 μg) did not significantly disrupt the preference for the larger four-pellet reward during free-choice trials relative to saline treatment. B, D2 receptor stimulation (quinpirole, 10 μg) also did not alter preference for the large reward.

D2 receptor stimulation

A similar profile of choice was observed for the rats receiving the high dose (10 μg) of quinpirole into the medial PFC. Again, rats selected the four-pellet option on almost all of the free-choice trials after saline infusions. This preference was not altered by stimulation of D2 receptors (F(1,6) = 0.53, n.s.; Fig. 5B). Quinpirole also had no significant effect on latencies, locomotion, or omissions (all F values <1.76, n.s.). Note that similar treatments did reduce choice of the larger reward on the probabilistic discounting task during the first, 100% probability block (Fig. 3B). A possible explanation for this difference is that, unlike rats trained on the reward magnitude discrimination, those trained on the discounting task had learned that the relative utility of the large/risky option decreases over a session. Thus, their representation of the relative value of the large reward option would be expected to be more labile than that of rats trained on the simpler task and, therefore, more susceptible to disruption. Collectively, the results of this experiment show that even though blockade of D1 receptors and stimulation of D2 receptors substantially alter choices between small, certain and large, probabilistic rewards, these effects do not appear to be attributable to more fundamental impairments in the ability to discriminate between larger and smaller rewards.

Discussion

Here we report that D1 and D2 receptors in the medial PFC exert a critical influence over choices between probabilistic versus certain rewards. Furthermore, decreasing or increasing activity of each of these receptors produced differing, and sometimes opposite, changes in choice, suggesting that they each exert distinct, yet complementary modulatory control over these decision-making processes.

Effects of D1/D2 receptor blockade

To our knowledge, this is the first demonstration that blockade of D1 or D2 receptors in the medial PFC induces opposing effects on behavior. Previous studies of this kind have revealed either that D1, but not D2, antagonism disrupts functions such as attention or working memory (Williams and Goldman-Rakic, 1995; Seamans et al., 1998; Granon et al., 2000) or that both receptors act cooperatively to facilitate set-shifting or bias behavior away from conditioned punishers (Ragozzino, 2002; Floresco and Magyar, 2006). Our findings that SCH23390 and eticlopride induced opposite effects on choice suggest that normal decision making is dependent on a critical balance of frontal lobe D1 and D2 receptor activity, and that altering this balance induces dissociable changes in choice of certain/uncertain rewards.

PFC D1 blockade decreased preference for the large/risky option in a dose-dependent manner, most prominently during the last three probability blocks. SCH23390 increased probabilistic discounting, resembling the effects of this compound when administered systemically (St. Onge and Floresco, 2009). Interestingly, reducing DA transmission in human subjects via tyrosine depletion also leads to more conservative and poorer quality decision making on the Cambridge Gambling Task (McLean et al., 2004). Our results suggest that these effects may be mediated in part by reduced prefrontal D1 activation. Choice-by-choice analysis further revealed that this reduced preference for the risky option was linked to an increased tendency to choose the small/certain option following a non-rewarded risky choice, suggesting that the effects on decision making may be the result of increased sensitivity to negative feedback. In a similar vein, blockade of D1 receptors in the prelimbic or anterior cingulate reduces preference for larger rewards when they are either delayed (Loos et al., 2010) or associated with a greater effort cost (Schweimer and Hauber, 2006). Collectively, these findings suggest that PFC D1 signaling exerts a profound influence on cost/benefit evaluations, facilitating the ability to overcome costs that may be associated with larger rewards in an effort to maximize long-term gains.

In stark contrast, PFC D2 receptor blockade increased preference for the large/risky option, slowing the shift in choice bias as reward probabilities decreased over a session. Notably, this effect resembles that induced by PFC inactivation under similar task conditions (St. Onge and Floresco, 2010). However, we do not believe this reflects a general increase in “risky” behavior per se. Rather, our previous findings led us to conclude that the medial PFC plays a critical role in monitoring changes in reward probabilities to adjust behavior accordingly. The present results expand on this, revealing that D2 receptors make an essential contribution to PFC regulation of this aspect of decision making. This apparent increase in risky choice was driven more prominently by an increased tendency to select the risky option after obtaining a large reward on the preceding trial. Thus, rather than integrating information about the likelihood of obtaining the larger reward across multiple trials, D2 blockade caused receipt of the larger reward to exert a greater and more immediate impact on the direction of subsequent choice. This is in keeping with a recent study in humans, in which D2 antagonism increased both choice of options associated with higher reward probabilities and corresponding changes in ventromedial PFC activity (Jocham et al., 2011). Collectively, these findings show that PFC D1 and D2 receptors make distinct, yet complementary contributions to decision making. D1 receptor activity promotes choice of larger, yet uncertain or more costly rewards, whereas D2 receptors mitigate the immediate impact that larger, probabilistic rewards exert over choice bias, facilitating the ability to adjust behavior over the long-term when the likelihood of obtaining these rewards changes.

Effects of D1/D2 receptor stimulation

Intra-PFC infusions of the D1 receptor agonist SKF81297, within dose ranges that have been shown to exert differential effects on other forms of cognition (attention, working memory), did not significantly alter risky choice, although these treatments slightly increased preference for the large/risky lever, most prominently with the low dose. Interpretation of this null effect should be approached with caution, as these non-monotonic dose/response effects suggest that SKF81297 may have an effective dose range that is narrower than it may be for other cognitive functions. Moreover, the 0.1 μg dose did significantly alter choice patterns, increasing win-stay performance and decreasing lose-shift tendencies, such that rats were more likely to choose the large/risky lever following both rewards and reward omissions. Nevertheless, the fact that increasing doses of SKF81297 did not significantly alter choice suggests that supranormal stimulation of PFC D1 receptors does not substantially interfere with decision making about risks and rewards. In contrast, similar treatments decrease choice of larger, delayed rewards (Loos et al., 2010), providing further support that different types of cost/benefit decision making can be dissociated pharmacologically.

The D2 agonist quinpirole induced a true “impairment” in decision making, markedly flattening the discounting curve, with rats displaying no discernable discounting upon changes in reward probabilities. Choice of the four-pellet option was reduced in the 100% block (when it was most advantageous), but increased in the 12.5% block (when it was least advantageous). Following D2 stimulation, the overall proportion of large/risky choices did not change relative to saline (∼73%), but animals were completely insensitive to changes in these probabilities. Thus, excessive D2 receptor activation severely interfered with the ability to adjust choice, apparently causing rats to use a simpler alternation strategy across blocks while maintaining a bias toward the large/risky lever. This finding, in combination with the effects of eticlopride, suggests that the relative levels of D2 (rather than D1) receptor tone in the medial PFC have a critical impact on this aspect of decision making, and either increasing or decreasing this activity can interfere with performance.
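To make concrete why the large/risky lever is most advantageous in the 100% block and least advantageous in the 12.5% block, the Python sketch below works through the expected pellet yield per choice, using the payoffs and block probabilities of the task (four pellets at 100, 50, 25, or 12.5% versus one certain pellet). It also estimates the yield of a hypothetical animal that picks the risky lever on a fixed 73% of trials in every block. Applying the overall ∼73% figure uniformly to each block is a simplifying assumption for illustration, and the function names are not from the study.

```python
LARGE_PELLETS = 4   # payout of the large/risky lever when it is rewarded
SMALL_PELLETS = 1   # payout of the small/certain lever (100% probability)
BLOCK_PROBABILITIES = [1.0, 0.5, 0.25, 0.125]  # large-reward odds per block


def expected_large(probability: float) -> float:
    """Expected pellets from one press of the large/risky lever."""
    return probability * LARGE_PELLETS


def better_option(probability: float) -> str:
    """Which lever yields more pellets on average in a given block."""
    ev = expected_large(probability)
    if ev > SMALL_PELLETS:
        return "risky"
    if ev < SMALL_PELLETS:
        return "certain"
    return "equal"


def pellets_per_trial(p_choose_risky: float, probability: float) -> float:
    """Average pellets per free-choice trial for a fixed choice proportion."""
    return (p_choose_risky * expected_large(probability)
            + (1 - p_choose_risky) * SMALL_PELLETS)


FLAT_CHOICE = 0.73  # assumed constant risky-choice proportion (illustrative)
for p in BLOCK_PROBABILITIES:
    print(f"{p:.1%} block: risky EV = {expected_large(p):.2f} pellets, "
          f"better lever = {better_option(p)}, "
          f"flat chooser = {pellets_per_trial(FLAT_CHOICE, p):.2f} pellets/trial")
```

The expected values cross at the 25% block, where both levers yield one pellet on average, so a choice pattern that does not track the block probabilities gives up reward at both ends of the session.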

The disadvantageous choice pattern produced by quinpirole bears a striking resemblance to that induced by reducing motivation for food through long-term free-feeding (St. Onge and Floresco, 2009). These complementary findings make it tempting to speculate that they may be related phenomena. Indeed, changes in medial PFC DA efflux have been proposed to reflect a generalized food reward or incentive motivational signal (Ahn and Phillips, 1999; Winstanley et al., 2006). Thus, changes in the amount of reward obtained over time may be signaled to the PFC by corresponding fluctuations in mesocortical DA levels that, via actions on D2 receptors, may be used to detect changes in the amount of reward obtained over time and facilitate alterations in choice bias. It follows that flooding D2 receptors may disrupt this dynamic signal, which could ultimately produce more static patterns of choice.

Dissociable contributions of PFC D1 and D2 receptors to risk-based decision making

The question remains as to why blockade of D1 or D2 receptors should exert opposing effects on risky choice, given that endogenous DA activates both receptors. Contemporary theory on how these receptors differentially affect PFC neural network activity may provide insight into this issue (Durstewitz et al., 2000; Seamans and Yang, 2004). D1 receptors have been proposed to decrease the influence of weak inputs, stabilizing network activity so that a single representation dominates PFC output. Conversely, D2 activity attenuates inhibitory influences, allowing PFC neural ensembles to process multiple stimuli/representations, placing these networks in a more labile state that may permit changes in representations.

During different phases of the probabilistic discounting task used here, animals at some points must either maintain (within a probability block) or modify (across blocks) their representation of the relative value of the large/risky option. Thus, the opposing effects of D1/D2 antagonism described here may reflect differential contributions of these receptors during distinct phases of the task. D1 activity may stabilize the representation of the relative long-term value of the risky option within a particular block, maintaining choice bias even when a risky choice leads to reward omission (keeping the “eye on the prize”). Blocking these receptors would make animals more sensitive to reward omissions (i.e., increasing lose-shift tendencies), and reduce risky choice. Conversely, as the large/risky option yields fewer rewards across blocks, D2 receptors (possibly on a different neuronal population) may facilitate modifications in value representations. As such, reducing their activity would disrupt the updating of these representations and corresponding changes in choice bias. This model may also partially account for the effects of increasing D1 and D2 receptor activity, which would be expected to lead to either more persistent choice of the large/risky option or induce a “hyperflexible” state, respectively. Thus, our findings suggest that PFC DA tone makes a critical and complex contribution to risk/reward judgments. By striking a fine balance between D1/D2 receptor activity, mesocortical DA may help refine cost/benefit decisions between options of varying magnitude and uncertainty, promoting either exploitation of current favorable circumstances or exploration of more profitable ones when conditions change.

Footnotes

This work was supported by a grant from the Canadian Institutes of Health Research (MOP 89861) to S.B.F. S.B.F. is a Michael Smith Foundation for Health Research Senior Scholar and J.R.S.O. is the recipient of scholarships from the Natural Sciences and Engineering Research Council of Canada and the Michael Smith Foundation for Health Research.

Correspondence should be addressed to Dr. Stan B. Floresco, Department of Psychology and Brain Research Center, University of British Columbia, 2136 West Mall, Vancouver, BC V6T 1Z4.

Copyright © 2011 the authors 0270-6474/11/318625-09$15.00/0

References

Ahn S, Phillips AG (1999) Dopaminergic correlates of sensory-specific satiety in the medial prefrontal cortex and nucleus accumbens of the rat. J Neurosci 19:RC29, (1–6).

Bardgett ME, Depenbrock M, Downs N, Points M, Green L (2009) Dopamine modulates effort-based decision-making in rats. Behav Neurosci 123:242–251.

Bari A, Eagle DM, Mar AC, Robinson ES, Robbins TW (2009) Dissociable effects of noradrenaline, dopamine, and serotonin uptake blockade on stop task performance in rats. Psychopharmacology 205:273–283.

Cardinal RN, Howes NJ (2005) Effects of lesions of the nucleus accumbens core on choice between small certain rewards and large uncertain rewards in rats. BMC Neurosci 6:37.

Cardinal RN, Robbins TW, Everitt BJ (2000) The effects of d-amphetamine, chlordiazepoxide, alpha-flupenthixol and behavioural manipulations on choice of signalled and unsignalled delayed reinforcement in rats. Psychopharmacology 152:362–375.

Chudasama Y, Robbins TW (2004) Dopaminergic modulation of visual attention and working memory in the rodent prefrontal cortex. Neuropsychopharmacology 29:1628–1636.

Cousins MS, Wei W, Salamone JD (1994) Pharmacological characterization of performance on a concurrent lever pressing/feeding choice procedure: effects of dopamine antagonist, cholinomimetic, sedative and stimulant drugs. Psychopharmacology 116:529–537.

Denk F, Walton ME, Jennings KA, Sharp T, Rushworth MF, Bannerman DM (2005) Differential involvement of serotonin and dopamine systems in cost-benefit decisions about delay or effort. Psychopharmacology 179:587–596.

Durstewitz D, Seamans JK, Sejnowski TJ (2000) Neurocomputational models of working memory. Nat Neurosci 3(Suppl):1184–1191.

Floresco SB, Magyar O (2006) Mesocortical dopamine modulation of executive functions: beyond working memory. Psychopharmacology 188:567–585.

Floresco SB, Whelan JM (2009) Perturbations in different forms of cost/benefit decision making induced by repeated amphetamine exposure. Psychopharmacology 205:189–201.

Floresco SB, Magyar O, Ghods-Sharifi S, Vexelman C, Tse MT (2006) Multiple dopamine receptor subtypes in the medial prefrontal cortex of the rat regulate set-shifting. Neuropsychopharmacology 31:297–309.

Floresco SB, Tse MT, Ghods-Sharifi S (2008a) Dopaminergic and glutamatergic regulation of effort and delay-based decision making. Neuropsychopharmacology 33:1966–1979.

Floresco SB, St Onge JR, Ghods-Sharifi S, Winstanley CA (2008b) Cortico-limbic-striatal circuits subserving different forms of cost-benefit decision making. Cogn Affect Behav Neurosci 8:375–389.

Ghods-Sharifi S, St Onge JR, Floresco SB (2009) Fundamental contribution by the basolateral amygdala to different forms of decision making. J Neurosci 29:5251–5259.

Granon S, Passetti F, Thomas KL, Dalley JW, Everitt BJ, Robbins TW (2000) Enhanced and impaired attentional performance after infusion of D1 dopaminergic receptor agents into rat prefrontal cortex. J Neurosci 20:1208–1215.

Haluk DM, Floresco SB (2009) Ventral striatal dopamine modulation of different forms of behavioral flexibility. Neuropsychopharmacology 34:2041–2052.

Hutton SB, Murphy FC, Joyce EM, Rogers RD, Cuthbert I, Barnes TR, McKenna PJ, Sahakian BJ, Robbins TW (2002) Decision making deficits in patients with first-episode and chronic schizophrenia. Schizophr Res 55:249–257.

Jocham G, Klein TA, Ullsperger M (2011) Dopamine-mediated reinforcement learning signals in the striatum and ventromedial prefrontal cortex underlie value-based choices. J Neurosci 31:1606–1613.

Loos M, Pattij T, Janssen MC, Counotte DS, Schoffelmeer AN, Smit AB, Spijker S, van Gaalen MM (2010) Dopamine receptor D1/D5 gene expression in the medial prefrontal cortex predicts impulsive choice in rats. Cereb Cortex 20:1064–1070.

McLean A, Rubinsztein JS, Robbins TW, Sahakian BJ (2004) The effects of tyrosine depletion in normal healthy volunteers: implications for unipolar depression. Psychopharmacology 171:286–297.

Pagonabarraga J, García-Sánchez C, Llebaria G, Pascual-Sedano B, Gironell A, Kulisevsky J (2007) Controlled study of decision-making and cognitive impairment in Parkinson’s disease. Mov Disord 22:1430–1435.

Paxinos G, Watson C (1998) The rat brain in stereotaxic coordinates (Academic, San Diego), Ed 4.

Ragozzino ME (2002) The effects of dopamine D(1) receptor blockade in the prelimbic-infralimbic areas on behavioral flexibility. Learn Mem 9:18–28.

Rogers RD, Everitt BJ, Baldacchino A, Blackshaw AJ, Swainson R, Wynne K, Baker NB, Hunter J, Carthy T, Booker E, London M, Deakin JF, Sahakian BJ, Robbins TW (1999) Dissociable deficits in the decision-making cognition of chronic amphetamine abusers, opiate abusers, patients with focal damage to prefrontal cortex, and tryptophan-depleted normal volunteers: evidence for monoaminergic mechanisms. Neuropsychopharmacology 20:322–339.

Schweimer J, Hauber W (2006) Dopamine D1 receptors in the anterior cingulate cortex regulate effort-based decision making. Learn Mem 13:777–782.

Schweimer J, Saft S, Hauber W (2005) Involvement of catecholamine neurotransmission in the rat anterior cingulate in effort-related decision making. Behav Neurosci 119:1687–1692.

Seamans JK, Yang CR (2004) The principal features and mechanisms of dopamine modulation in the prefrontal cortex. Prog Neurobiol 74:1–58.

Seamans JK, Floresco SB, Phillips AG (1998) D1 receptor modulation of hippocampal-prefrontal cortical circuits integrating spatial memory with executive functions in the rat. J Neurosci 18:1613–1621.

St Onge JR, Floresco SB (2009) Dopaminergic modulation of risk-based decision making. Neuropsychopharmacology 34:681–697.

St Onge JR, Chiu YC, Floresco SB (2010) Differential effects of dopaminergic manipulations on risky choice. Psychopharmacology 211:209–221.

St Onge JR, Floresco SB (2010) Prefrontal cortical contribution to risk-based decision making. Cereb Cortex 20:1816–1828.

Stopper CM, Floresco SB (2011) Contributions of the nucleus accumbens and its subregions to different aspects of risk-based decision making. Cogn Affect Behav Neurosci 11:97–112.

van Gaalen MM, van Koten R, Schoffelmeer AN, Vanderschuren LJ (2006) Critical involvement of dopaminergic neurotransmission in impulsive decision making. Biol Psychiatry 60:66–73.

Williams GV, Goldman-Rakic PS (1995) Modulation of memory fields by dopamine D1 receptors in prefrontal cortex. Nature 376:572–575.

Winstanley CA, Theobald DE, Dalley JW, Robbins TW (2005) Interactions between serotonin and dopamine in the control of impulsive choice in rats: therapeutic implications for impulse control disorders. Neuropsychopharmacology 30:669–682.

Winstanley CA, Theobald DE, Dalley JW, Cardinal RN, Robbins TW (2006) Double dissociation between serotonergic and dopaminergic modulation of medial prefrontal and orbitofrontal cortex of impulsive choice. Cereb Cortex 16:106–114.