The question of whether (or to what degree) obesity reflects addiction to high energy foods often narrows to the question of whether the overeating of these foods causes the same long-term neuroadaptations as are identified with the late stages of addiction. Of equal or perhaps greater interest is the question of whether common brain mechanisms mediate the acquisition and development of eating and drug-taking habits. The earliest evidence on this question is rooted in early studies of brain stimulation reward. Lateral hypothalamic electrical stimulation can be reinforcing in some conditions and can motivate feeding in others. That stimulation of the same brain region should be both reinforcing and drive-inducing is paradoxical; why should an animal work to induce a drive-like state such as hunger? This is known as the “drive-reward paradox.” Insights into the substrates of the drive-reward paradox suggest an answer to the controversial question of whether the dopamine system—a system “downstream” from the stimulated fibers of the lateral hypothalamus—is more critically involved in “wanting” or in “liking” of various rewards including food and addictive drugs. That the same brain circuitry is implicated in the motivation for and the reinforcement by both food and addictive drugs extends the argument for a common mechanism underlying compulsive overeating and compulsive drug-taking.
In recent years, discussions of addiction have tended to focus on its terminal stages, when repeated exposure to a drug has altered the brain in ways that can be measured by cellular biologists, electrophysiologists, and neuroimagers. In earlier years, the attention was on the habit-forming effects of addictive drugs; how did addictive drugs hijack the brain mechanisms of motivation and reward? The question of whether obesity results from food addiction brings us back to the earlier question of what brain mechanisms are responsible for the development of compulsive foraging for addictive foods and drugs, and this, in turn, brings us back to the problem of parsing the contributions to reward-seeking behaviors of motivation and reinforcement (1).
In large part, the evidence suggesting a common basis for obesity and addiction is evidence implicating brain dopamine in the habit-forming effects of food (2) and of addictive drugs (3). While the dopamine system is activated by food (4) and by most addictive drugs (5), debate remains as to whether the role of dopamine is primarily a role in the reinforcing effects of food and drugs or a role in the motivation to obtain them (6–8); in colloquial terms, is dopamine more essential to the “liking” of a reward or the “wanting” of the reward (9)? A line of relevant evidence not widely considered in recent years is evidence of a phenomenon termed the “drive-reward paradox.” Here I describe the paradox and relate it to the evidence that dopamine has common roles in compulsive food-seeking and compulsive drug-seeking and to the question of which of the roles—motivation or reinforcement—depends on the dopamine system.
Lateral hypothalamic electrical stimulation
In the 1950s, the lateral hypothalamus was labeled a pleasure center by some (10) and a hunger center by others (11). Electrical stimulation of this region was rewarding; within minutes, such stimulation could establish compulsive lever-pressing at rates reaching several thousand responses per hour (12). Experience earning such stimulation also established conditioned motivation to approach the lever, and this motivation could be sufficient to overcome painful footshock (12). Thus this stimulation served as an unconditioned reinforcer, “stamping in” response habits as well as stimulus associations that established the response lever as a conditioned incentive stimulus that elicited approach and manipulation. From the earliest studies it was inferred that the rats liked the stimulation and that liking it made them want more (10); studies of stimulation in human patients confirmed that such stimulation was pleasurable (13).
Stimulation of this region could also motivate behavior. Early work of Hess had revealed that electrical brain stimulation could induce compulsive feeding, characterized as “bulimia” (14). Following the discovery of brain stimulation reward (15), it was soon discovered that stimulation in the lateral hypothalamus could induce such feeding as well as reward (16). Indeed, stimulation at reward sites can induce a variety of species-typical, biologically primitive behaviors such as eating, drinking, predatory attack, and copulation (17). In many ways, the effects of stimulation are similar to the effects of natural drive states (18), and the effects of stimulation and food deprivation are known to summate (19). This, then, was the drive-reward paradox (20): why should a rat press a lever to induce a state like hunger?
Medial forebrain bundle fibers of passage
Historically, the first question prompted by the drive-reward paradox was whether the same or different lateral hypothalamic substrates are involved in the two effects of stimulation. The possibility of different substrates was not easy to rule out because electrical stimulation activates different neurotransmitter systems rather indiscriminately. The effective zone of stimulation is perhaps a millimeter in diameter (21, 22) and within this zone the stimulation tends to activate whatever fibers surround the electrode tip. However, fibers of different size and myelination have different excitability characteristics, and the stimulation parameters used for the two behaviors were somewhat different (23, 24). While the bed nucleus of the lateral hypothalamus was initially thought to be the primary source of hunger and reward, fibers of passage have much lower activation thresholds than cell bodies, and the bed nucleus of the lateral hypothalamus is traversed by over 50 fiber systems comprising the medial forebrain bundle (25, 26). The origin, immediate target, and neurotransmitter of the directly activated pathway (or pathways) for brain stimulation reward and stimulation-induced feeding remain unidentified, but fibers of passage are clearly implicated and several of their characteristics have been determined. The substrates of the drive-like and the rewarding effects of lateral hypothalamic stimulation have very similar characteristics.
First, anatomical mapping has revealed that the lateral hypothalamic substrates for brain stimulation reward and for stimulation-induced eating have very similar medial-lateral and dorsal-ventral boundaries and are homogeneous within those boundaries (27, 28). Moreover, whereas only the lateral hypothalamic portion of the medial forebrain bundle was initially identified with feeding and reward, stimulation of more caudal projections of the bundle, in the ventral tegmental area, can also both be rewarding (29–31) and induce feeding (32–34). Within the ventral tegmental area, the boundaries of the effective stimulation sites match closely the boundaries of the dopamine cell groups that form the mesocorticolimbic and nigrostriatal dopamine systems (30). Stimulation of the cerebellar peduncle (an even more caudal branch of the medial forebrain bundle) can also support both self-stimulation and feeding (35, 36). Thus if separate substrates mediate the two behaviors, those substrates have remarkably similar anatomical trajectories and perhaps similar subcomponents.
While not allowing differentiation of neurotransmitter content, psychophysical methods—assessing the behavioral effects of systematic variations of the stimulation input—allow a significant degree of differentiation between axonal characteristics. The methods are not widely discussed in the addiction or feeding literatures.
First, “paired-pulse” stimulation has been used to estimate the refractory periods and conduction velocities of the “first stage” fibers (the reward- and feeding-relevant fiber populations that are directly activated by the applied current at the tip of the electrode). The method for estimating refractory periods—the time required for the neuronal membrane to recharge after the depolarization of an action potential—is based on the method used by electrophysiologists studying single neurons. While there are some subtleties to be considered in practice, the method is very straightforward in principle. When studying single neurons, one simply stimulates the neuron twice, varying the interval between the first and second stimulations in order to find the minimum interval that still allows the cell to respond to the second stimulation. If the second stimulation follows the first too quickly, the neuron will not have recovered from the effects of the first in time to respond to the second. If the second pulse comes late enough, the neuron will have recovered sufficiently from the firing caused by the first pulse to fire again in response to the second. The minimum inter-pulse interval for obtaining responses to both pulses defines the “refractory period” of the stimulated axon.
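The single-neuron version of the procedure can be sketched in a few lines. What follows is a minimal illustration, not a description of any published protocol: the all-or-none neuron model, the function names, the 630-microsecond refractory period, and the 0.05-msec step size are all hypothetical. Intervals are expressed in whole microseconds to avoid rounding problems.

```python
from typing import Callable, Iterable, Optional


def make_model_neuron(refractory_us: int) -> Callable[[int], bool]:
    """Return a toy neuron that reports whether it fires to the second pulse.

    Hypothetical all-or-none model: the second pulse is effective only if the
    inter-pulse interval is at least the neuron's refractory period.
    """
    def fires_to_second_pulse(interval_us: int) -> bool:
        return interval_us >= refractory_us

    return fires_to_second_pulse


def estimate_refractory_period(
    fires_to_second_pulse: Callable[[int], bool],
    intervals_us: Iterable[int],
) -> Optional[int]:
    """Return the shortest tested interval at which both pulses are effective."""
    for interval in sorted(intervals_us):
        if fires_to_second_pulse(interval):
            return interval
    return None  # no tested interval was long enough


# Test intervals from 0.10 to 2.00 msec in 0.05-msec steps (assumed values).
tested_intervals = range(100, 2001, 50)
neuron = make_model_neuron(refractory_us=630)
print(estimate_refractory_period(neuron, tested_intervals))
```

With these assumed values the estimate is 650 microseconds: the procedure can only bracket the true refractory period to within the spacing of the tested intervals.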
In order to obtain behavioral responses to moderate levels of electrical stimulation, more than one fiber must be stimulated and more than one stimulation pulse must be given; higher levels of stimulation are given to reach many fibers around the electrode, and “trains” of repeated stimulation pulses are needed to activate these fibers several times. In self-stimulation studies, stimulation trains of 0.5 seconds are traditionally given; in stimulation-induced feeding studies, stimulation trains of 20 or 30 seconds are given. Each pulse within a train typically lasts only 0.1 msec: long enough to activate nearby neurons once but not long enough for them to recover and fire a second time during the same pulse. The pulses are usually given at frequencies of 25–100 Hz, so that even in a half-second stimulation train there are dozens of repeated pulses. A simple train of stimulation pulses is diagrammed in Figure 1A.
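The arithmetic behind these train parameters is simple enough to check directly. The sketch below is illustrative only; the function names are hypothetical, and it assumes that the first pulse begins at the start of the train and that pulses are counted if they begin before the train ends.

```python
def pulses_per_train(frequency_hz: int, train_ms: int) -> int:
    """Count pulse onsets at t = 0, 1/f, 2/f, ... that fall before the train ends.

    Integer arithmetic gives ceil(frequency * duration) without float error.
    """
    return (frequency_hz * train_ms + 999) // 1000


def current_on_time_us(frequency_hz: int, train_ms: int, pulse_width_us: int = 100) -> int:
    """Total time, in microseconds, during which current is actually flowing."""
    return pulses_per_train(frequency_hz, train_ms) * pulse_width_us


# Half-second self-stimulation train at the two ends of the usual frequency range.
print(pulses_per_train(25, 500), pulses_per_train(100, 500))
# A 20-second feeding train at 50 Hz.
print(pulses_per_train(50, 20_000))
# Current-on time within a half-second, 100 Hz train of 0.1-msec pulses.
print(current_on_time_us(100, 500))
```

Under these assumptions a half-second train contains 13 pulses at 25 Hz and 50 pulses at 100 Hz, and at 100 Hz the current is on for only 5 msec of the 500-msec train.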
To determine refractory periods of the first-stage neurons, trains of paired pulses (Fig. 1B), rather than trains of single pulses (Fig. 1A), are given. The first pulse in each pair is termed a “C” or “conditioning” pulse; the second pulse in each pair is termed a “T” or “test” pulse (Fig. 1C). If the C-pulses are followed too closely by their respective T-pulses, the T-pulses will be ineffective and the animal will respond as if it received only the C-pulses. If the interval between the C- and T-pulses is extended sufficiently, the T-pulse will become effective and the animal, receiving more reward, will respond more vigorously. Because the population of first-stage neurons has a range of refractory periods, the behavioral response to the T-pulses begins to appear as the C-T interval reaches the refractory period of the fastest relevant fibers, and improves as the C-T intervals are extended until they exceed the refractory period of the slowest fibers (Fig. 1D). Thus the method gives us the refractory period characteristics of the population or populations of first-stage neurons for the behavior in question.
As shown by such methods, the absolute refractory periods for the fibers mediating lateral hypothalamic brain stimulation reward range from about 0.4 to about 1.2 msec (37–40). The absolute refractory periods for stimulation-induced feeding are also in this range (38, 40). Not only are the refractory period ranges for the two populations similar; the two distributions have a similar anomaly: in each case, they show no behavioral improvement when C-T intervals are increased between 0.6 and 0.7 msec (39, 40). This suggests that there are two sub-populations of fibers contributing to each behavior: a small sub-population of very fast fibers (refractory periods ranging from 0.4 to 0.6 msec) and a larger sub-population of slower fibers (refractory periods ranging from 0.7 to 1.2 msec or perhaps a bit longer). It is difficult to imagine that different populations mediate the rewarding and the drive-like effects of stimulation when the refractory period profiles are so similar, each with a discontinuity between 0.6 and 0.7 msec.
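The inference from a flat segment in the recovery curve to two sub-populations can be illustrated with a toy model. This is a sketch under stated assumptions: T-pulse effectiveness is taken to be the fraction of fibers whose refractory period has elapsed, and the individual refractory periods listed are hypothetical values chosen only to fall within the ranges described above, not measured data.

```python
from typing import List, Sequence, Tuple


def recovered_count(ct_us: int, refractory_us: Sequence[int]) -> int:
    """Number of fibers able to fire again to a T-pulse at this C-T interval."""
    return sum(1 for r in refractory_us if r <= ct_us)


def fraction_recovered(ct_us: int, refractory_us: Sequence[int]) -> float:
    """Toy index of T-pulse effectiveness: the fraction of fibers recovered."""
    return recovered_count(ct_us, refractory_us) / len(refractory_us)


def flat_segments(ct_intervals_us: Sequence[int],
                  refractory_us: Sequence[int]) -> List[Tuple[int, int]]:
    """Consecutive C-T steps over which effectiveness fails to rise.

    Steps at the floor (no fibers recovered) and ceiling (all recovered)
    are ignored, since flatness there is uninformative.
    """
    segments = []
    for lo, hi in zip(ct_intervals_us, ct_intervals_us[1:]):
        n_lo = recovered_count(lo, refractory_us)
        n_hi = recovered_count(hi, refractory_us)
        if 0 < n_lo < len(refractory_us) and n_hi == n_lo:
            segments.append((lo, hi))
    return segments


# Hypothetical refractory periods in microseconds.
fast_fibers = [400, 450, 500, 550, 600]
slow_fibers = list(range(750, 1201, 50))
population = fast_fibers + slow_fibers
ct_intervals = list(range(300, 1301, 50))

print(flat_segments(ct_intervals, population))
```

In this toy population the only flat steps lie between 600 and 700 microseconds, which is the signature expected when a gap separates a fast from a slow sub-population.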
Additional evidence for a common substrate for the drive and reward effects of stimulation is that stimulation at sites elsewhere along the medial forebrain bundle can also induce both feeding (32–34, 40, 41) and reward (29, 42–44). The refractory period distributions for reward and stimulation-induced feeding are the same whether the stimulating electrodes are at the ventral tegmental or the lateral hypothalamic level of the medial forebrain bundle (40). This strongly suggests that the same two sub-populations of fibers of passage are responsible for both behaviors.
Further, once the trajectory of the fibers mediating a stimulation effect has been partly identified, the conduction velocities of the first stage fibers for the two behaviors can be determined and compared (43). The method for estimating the conduction velocities is similar to that for estimating refractory periods, but in this case the C-pulses are delivered at one stimulation site along the fiber path (e.g., the lateral hypothalamus) and the T-pulses are delivered at another (e.g., the ventral tegmental area). This requires stimulating electrodes that are aligned to depolarize the same axons at two points along their length (45). When a pair of electrodes is found to be optimally aligned along the fibers for reward, they turn out also to be optimally aligned along the fibers for stimulation-induced feeding (33). Here, when paired pulses are given, a longer interval between the C-pulses and the T-pulses must be allowed before the T-pulses will be effective. This is because, in addition to the time for recovery from refractoriness, time must be allowed for conduction of the action potential from one electrode tip to the other (43, 45). By subtracting the refractory period (determined by single electrode stimulation) from the critical C-T interval for pulses given at the different electrodes, we can estimate the range of conduction times and derive the range of conduction velocities for the population of first-stage fibers. Studies using this method have shown that the fibers for stimulation-induced reward have the same or very similar conduction velocities as the fibers for stimulation-induced feeding (33). 
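The subtraction described here can be written out as a short calculation. The sketch below is illustrative; the function name and the numerical values (a 3-mm inter-electrode distance, a 1.5-msec critical two-site C-T interval, and a 0.5-msec refractory period) are assumptions chosen for easy arithmetic and are not taken from the cited studies.

```python
def conduction_velocity_m_per_s(
    inter_electrode_mm: float,
    two_site_ct_ms: float,
    refractory_ms: float,
) -> float:
    """Estimate conduction velocity from two-electrode paired-pulse data.

    Conduction time is the critical C-T interval found with pulses at two
    sites minus the refractory period found with a single electrode.
    Note that mm per msec is numerically equal to m per s.
    """
    conduction_ms = two_site_ct_ms - refractory_ms
    if conduction_ms <= 0:
        raise ValueError("two-site C-T interval must exceed the refractory period")
    return inter_electrode_mm / conduction_ms


# Hypothetical example values.
velocity = conduction_velocity_m_per_s(
    inter_electrode_mm=3.0, two_site_ct_ms=1.5, refractory_ms=0.5
)
print(velocity)
```

Applying the calculation to the fastest and slowest fibers in turn would give the range of conduction velocities for the population.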
Thus the drive-reward paradox is not easily resolved on the basis of the boundaries, refractory periods, conduction velocities, or path of conduction of the substrates for the rewarding and drive-inducing effects of lateral hypothalamic electrical stimulation; rather, it appears that the mechanism for the drive effects triggered by medial forebrain bundle stimulation is either the same or remarkably similar to the mechanism for the reinforcing effects of stimulation.
Pharmacological evidence further suggests a common substrate for brain stimulation reward and stimulation-induced feeding; this evidence suggests the common involvement of dopamine neurons, neurons that do not have the refractory period and conduction velocity characteristics of the first-stage fibers of the medial forebrain bundle but are presumably second-stage or third-stage fibers downstream from the directly activated fibers. First, stimulation-induced feeding and lateral hypothalamic brain stimulation reward are each attenuated by dopamine antagonists (46–51). In addition, each is facilitated by ventral tegmental injections of morphine (52, 53) and mu and delta opioid agonists (54, 55) that activate the dopamine system (56). Similarly, both are facilitated by delta-9 tetrahydrocannabinol (57–59). While amphetamine is an anorexigenic drug, even it potentiates aspects of stimulation-induced feeding (60) as well as brain stimulation reward (61), particularly when it is microinjected into nucleus accumbens (62, 63).
Interactions with the dopamine system
How do the first-stage fibers of brain stimulation reward interact with the dopamine system? Another two-electrode stimulation study suggests that the first-stage fibers project caudally from somewhere rostral to the lateral hypothalamic area, toward or through the ventral tegmental area where the dopamine system originates. Stimulation is again applied using two electrodes aligned to influence the same fibers at different points along their length, but in this case one of the electrodes is used as a cathode (injecting positive cations) to locally depolarize axons at the electrode tip and the other is used as the anode (collecting the cations) to hyperpolarize the same axons at a different point along their length. Since the nerve impulse involves the movement down the axon of a zone of phasic depolarization, the impulse fails if it enters a zone of hyperpolarization. When the anodal stimulation blocks the behavioral effects of cathodal stimulation it means the anode is between the cathode and the nerve terminal. By switching the cathodal stimulation and anodal blockade between the two electrode sites and determining which configuration is behaviorally effective, we can determine the direction of conduction of the first-stage fibers. This test indicates that the bulk of the stimulated fibers conduct reward messages in the rostral-caudal direction, toward the ventral tegmental area (64). While the origin or origins of the system remain to be determined, one hypothesis is that the descending first-stage fibers terminate in the ventral tegmental area, synapsing on the dopaminergic cells there (65); another hypothesis is that the first stage fibers pass through the ventral tegmental area and terminate in the pedunculopontine tegmental nucleus, which relays back to the dopamine cells (66). 
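The logic of the anodal-block test can be made explicit with a small decision function. This is a simplified sketch with hypothetical names: it treats the two electrode sites simply as "rostral" and "caudal" and treats blockade as all-or-none, whereas real data would involve graded behavioral effects.

```python
def is_blocked(conducts_rostral_to_caudal: bool, anode_site: str) -> bool:
    """Toy prediction of whether the cathodally evoked impulse is blocked.

    The impulse fails only when the anode lies downstream of the cathode,
    that is, between the cathode and the nerve terminal.
    """
    downstream_site = "caudal" if conducts_rostral_to_caudal else "rostral"
    return anode_site == downstream_site


def infer_direction(blocked_with_anode_caudal: bool,
                    blocked_with_anode_rostral: bool) -> str:
    """Infer the direction of conduction from the two electrode configurations."""
    if blocked_with_anode_caudal and not blocked_with_anode_rostral:
        return "rostral-to-caudal"
    if blocked_with_anode_rostral and not blocked_with_anode_caudal:
        return "caudal-to-rostral"
    return "indeterminate"


# Simulate both configurations for fibers that descend toward the caudal site.
result = infer_direction(
    blocked_with_anode_caudal=is_blocked(True, "caudal"),
    blocked_with_anode_rostral=is_blocked(True, "rostral"),
)
print(result)
```

Blockade in only one of the two configurations identifies the direction of conduction; blockade in both or in neither leaves the question open in this simplified scheme.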
Either way, a good deal of evidence suggests that the same or very similar subpopulations of medial forebrain bundle fibers (67) carry both the rewarding effects and also the drive-inducing effects of lateral hypothalamic stimulation caudally toward the ventral tegmental area, and that the dopamine neurons of the ventral tegmental area are a critical link in the final common path for both stimulation effects.
Drug-induced feeding and reward
The drive-reward paradox is not unique to studies of behavior induced by electrical stimulation; another example involves behavior induced by microinjections of drugs. For example, rats will lever-press or nose-poke to administer microinjections of morphine (68, 69), or the endogenous mu opioid endomorphin (70) into the ventral tegmental area; they also learn to self-administer the selective mu and delta opioids DAMGO and DPDPE into this brain region (71). The mu and delta opioids are rewarding in proportion to their abilities to activate the dopamine system; mu opioids are over 100 times more effective than delta opioids in activating the dopamine system (56) and, similarly, are over 100 times more effective as rewards (71). Thus mu and delta opioids have rewarding actions attributed to activation (or, more likely, disinhibition) of the origins of the mesocorticolimbic dopamine system. Direct injections of opioids into the ventral tegmental area also stimulate feeding in satiated rats and enhance it in hungry ones. Feeding is induced by ventral tegmental injections of either morphine (73–75) or mu or delta opioids (76, 77). As is the case with their rewarding effects, the mu opioid DAMGO is 100 or more times more effective than the delta opioid DPDPE in stimulating feeding (77). Thus once again, reward and feeding can each be stimulated by manipulating a common brain site, using, in this case, drugs that are much more selective than electrical stimulation for activating specific neural elements.
Another example involves agonists for the neurotransmitter GABA. Microinjections of GABA or the GABA-A agonist muscimol into the caudal but not the rostral portion of the ventral tegmental area induce feeding in sated animals (78). Similarly, muscimol injections in the caudal but not the rostral ventral tegmental area are rewarding (79). GABA-A antagonists are also rewarding (80), and cause nucleus accumbens dopamine elevations (81); in this case the effective injection site is the rostral and not the caudal ventral tegmental area, suggesting opposing rostral and caudal GABAergic systems. Feeding has not yet been examined with GABA-A antagonists in these regions.
Finally, systemic cannabinoids (82) and cannabinoids microinjected into the ventral tegmental area (83) are reinforcing in their own right, and systemic cannabinoids also potentiate the feeding induced by lateral hypothalamic electrical stimulation (84). Again, we find injections that are both rewarding and also motivational for feeding. Again, the mesocorticolimbic dopamine system is implicated; in this case the cannabinoids are effective (as rewards, at least) in the ventral tegmental area, where they interact with inputs to the dopamine system and result in its activation (85, 86).
The studies reviewed above implicate a descending system in the medial forebrain bundle in the yin and yang of motivation: the motivation to action by the promise of a reward before it has been earned and the reinforcement of recent response and stimulus associations by the timely receipt of reward, once obtained. This system projects caudally from the lateral hypothalamus toward the dopamine system, presumably synapsing either on it or on inputs to it; the dopamine system plays a significant (though perhaps not necessary (87, 88)) role in the expression of both this motivation (46) and this reinforcement (50).
How might the dopamine system, a system implicated in the habit-forming consequences of both food and addictive drug consumption, be involved as well in the antecedent motivation to obtain these rewards? The most obvious possibility is that different dopamine subsystems might subserve these different functions. That subsystems might serve different functions is suggested, first, by the nominal differentiation of nigrostriatal, mesolimbic, and mesocortical systems and by subsystems within them. The nigrostriatal system is traditionally associated with the initiation of movement, whereas the mesolimbic system is more traditionally associated with reward (89, 90) and motivational (91) function (but see ). The mesocortical system is also implicated in reward function (93–95). The ventromedial (shell), ventrolateral (core) and dorsal striatum, all major dopamine terminal fields, are differentially responsive to different kinds of rewards and reward predictors (96–101). That different subsystems might serve different functions is further suggested by the fact that there are two general classes of dopamine receptor (D1 and D2) and two striatal output pathways (direct and indirect) that selectively express them. Another interesting possibility, however, is that the same dopamine neurons might subserve the different states by using different neuronal signaling patterns. Perhaps the most interesting distinction is that between two activity states of dopamine neurons: a tonic pacemaker state and a phasic bursting state (102).
It is the phasic bursting state of dopamine neurons that has the temporal fidelity to signal the arrival of rewards or reward predictors (103). Dopamine neurons burst with short latency when rewards or reward-predictors are detected. Because dopamine neurons respond to rewards themselves only when they are unexpected, shifting their response to the predictors as the prediction becomes established, it has become frequent to see reward and reward-prediction treated as independent events (103). An alternative view is that the predictor of a reward, through Pavlovian conditioning, becomes a conditioned reinforcer and a conditioned component of the net rewarding event (104): indeed, that it becomes the leading edge of the reward (105, 106). It is the habit-forming effect of rewards—whether they be unconditioned or conditioned rewards (reward-predictors)—that requires short-latency, phasic, response-contingent delivery. Rewards delivered immediately after a response are much more effective than rewards delivered even one second later; reward impact decays hyperbolically as a function of delay after the response that earns it (107). Phasic activation of the dopamine system is known to be triggered by two excitatory inputs: glutamate (108) and acetylcholine (109). Each of these participates in the rewarding effects of earned cocaine: glutamatergic and cholinergic input to the dopamine system are each triggered by the expectancy of cocaine reward, and each of these inputs adds to the net rewarding effects of cocaine itself (110, 111).
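The hyperbolic decay of reward impact with delay can be given a concrete form. The sketch below assumes the commonly used single-parameter hyperbola, impact = A / (1 + kD); the text above states only that the decay is hyperbolic, so both this particular equation and the decay constant k = 1 per second are illustrative assumptions, as are the function and parameter names.

```python
def delayed_reward_impact(immediate_impact: float, delay_s: float, k: float = 1.0) -> float:
    """Hyperbolic decay of reward impact with delay after the response.

    Assumed form: impact = A / (1 + k * D), where A is the impact of an
    immediate reward, D is the delay in seconds, and k is a hypothetical
    decay constant (per second).
    """
    if delay_s < 0 or k < 0:
        raise ValueError("delay and k must be non-negative")
    return immediate_impact / (1.0 + k * delay_s)


# Impact of a unit reward delivered 0, 1, and 3 seconds after the response.
for delay in (0.0, 1.0, 3.0):
    print(delay, delayed_reward_impact(1.0, delay))
```

With k = 1 per second, a one-second delay halves the impact of the reward and a three-second delay reduces it to one quarter, which conveys how steeply the early part of a hyperbola falls.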
On the other hand, it is slow changes in the tonic pacemaker firing of dopamine neurons and the changes in extracellular concentration of dopamine that accompany them that are more likely to be associated with changes in motivational state that accompany cravings for food or drugs. Unlike reinforcement, motivational states do not depend on short-latency and response-contingent timing. Motivational states can build gradually and can be sustained for long periods, and these temporal characteristics are most likely to reflect slow changes in the rate of pacemaker firing of dopamine neurons and slow changes in extracellular dopamine levels. The motivational effects of elevating dopamine levels (112) are perhaps best illustrated in the response reinstatement paradigm of food and drug self-administration (113), where animals that have undergone extinction training can be provoked by mild footshock stress, food or drug priming, or food- or drug-related sensory cues to renew food- or drug-seeking. Each of these provocations, whether footshock stress (114), food (115) or drug (116) priming, or food- (97) or drug- (110, 111, 116) related cues, elevates extracellular dopamine levels for minutes or tens of minutes. Thus changes in pacemaker firing of dopaminergic neurons are the likely correlate of the motivation to initiate learned responses for food or addictive drugs.
While explanations of the drive-reward paradox remain unconfirmed, the studies reviewed above strongly suggest that drive and reward functions are mediated by a common system of descending medial forebrain fibers that, directly or indirectly, activates the midbrain dopamine systems. The simplest hypothesis is that dopamine serves a general arousal function that is essential for both drive and reinforcement. This is consistent with the fact that extracellular dopamine is essential for all behavior, as confirmed by the akinesia of animals with near-total dopamine depletions (117). Response-independent tonic increases in extracellular dopamine levels (associated with increased tonic firing of the dopamine system) cause increases in general locomotor activity, perhaps simply by increasing the salience of novel and conditioned stimuli that elicit Pavlovian investigatory and learned instrumental responses (118–120). In this view, increases in tonic dopamine levels elicited by food- or drug-predictive stimuli are the frequent correlate of subjective cravings or “wantings.” Response-contingent increases in momentary dopamine levels associated with phasic firing of the dopamine system stamp in stimulus and response associations, presumably by enhancing consolidation of the still-active trace that mediates the short-term memory of these associations (121, 122). While this view holds that extracellular dopamine fluctuations mediate both drive and reinforcement effects, it holds that the reinforcement effects are primary; it is only after the sight of food or a response lever has been associated with the reinforcing effects of that food or an addictive drug that the food or lever becomes an incentive motivational stimulus that can itself stimulate craving and elicit approach. The argument here is that it is yesterday’s reinforcing effects of a particular food or drug that establish today’s cravings for that food or drug.
It is not just that the overeating of high-energy foods becomes compulsive and is maintained in the face of negative consequences that suggests that overeating takes on properties of addiction. It is difficult to imagine how natural selection would have resulted in a separate mechanism for addiction when enriched sources of the drugs and the ability to smoke or inject them in high concentration are relatively recent events in our evolutionary history. The foraging for drugs and foraging for food require the same coordinated movements and thus their mechanisms share a final common path. They are each associated with subjective cravings and they are each subject to momentary satiety. Each involves forebrain circuitry that contributes importantly to both motivation and reinforcement, circuitry strongly implicated in establishing compulsive instrumental habits (12, 123–125). While there is a good deal of interest in what we can learn about obesity from studies of addiction (126), it will also be interesting to see what we can learn about addiction from studies of obesity and food intake. For example, hypothalamic orexin/hypocretin neurons have suggested roles in feeding (127) and reward (128) and it is known that brain stimulation reward (129), like food reward (130) can be modulated by the peripheral satiety hormone leptin. New optogenetic methods (131) allow much more selective activation of motivational circuitry than does electrical stimulation, and it is hoped that these methods can advance our understanding of compulsive drug-taking and compulsive overeating and resolve the drive-reward paradox.
Preparation of this manuscript was supported in the form of salary by the Intramural Research Program, National Institute on Drug Abuse, National Institutes of Health.
The author reports no biomedical financial interests or potential conflicts of interest.