Edited by Mortimer Mishkin, National Institutes of Health, Bethesda, MD, and approved September 16, 2009 (received for review April 12, 2009)
Theories of instrumental learning aim to elucidate the mechanisms that integrate success and failure to improve future decisions. One computational solution consists of updating the value of choices in proportion to reward prediction errors, which are potentially encoded in dopamine signals. Accordingly, drugs that modulate dopamine transmission were shown to impact instrumental learning performance. However, whether these drugs act on conscious or subconscious learning processes remains unclear. To address this issue, we examined the effects of dopamine-related medications in a subliminal instrumental learning paradigm. To assess generality of dopamine implication, we tested both dopamine enhancers in Parkinson’s disease (PD) and dopamine blockers in Tourette’s syndrome (TS). During the task, patients had to learn from monetary outcomes the expected value of a risky choice. The different outcomes (rewards and punishments) were announced by visual cues, which were masked such that patients could not consciously perceive them. Boosting dopamine transmission in PD patients improved reward learning but worsened punishment avoidance. Conversely, blocking dopamine transmission in TS patients favored punishment avoidance but impaired reward seeking. These results thus extend previous findings in PD to subliminal situations and to another pathological condition, TS. More generally, they suggest that pharmacological manipulation of dopamine transmission can subconsciously drive us to either get more rewards or avoid more punishments.

dopamine | instrumental learning | subliminal perception | reward | punishment

NEUROSCIENCE

such that better decisions are made in the future. A basic learning mechanism consists of updating the value of the chosen option according to a reward prediction error, which is the difference between the actual and the expected reward (1, 2). This learning rule, using prediction error as a teaching signal, has provided a good account of instrumental learning in a variety of species including both human and nonhuman primates (3, 4). Single-cell recordings in monkeys suggest that reward prediction errors are encoded by the phasic discharge of dopamine neurons (5, 6). In humans, dopamine-related drugs have been shown to bias prediction error encoding in the striatum to modulate reward-based learning (7). One of these drugs, levodopa (a metabolic precursor of dopamine), is used to alleviate motor symptoms in idiopathic Parkinson’s disease (PD), which is primarily caused by degeneration of nigral dopamine neurons. PD patients were shown to learn better from positive feedback when on levodopa and from negative feedback when off levodopa (8, 9). This double dissociation led Frank and colleagues to propose a computational model of fronto-striatal circuits where dopamine bursts (encoding positive prediction errors) reinforce approach pathways, while dopamine dips (encoding negative prediction errors) reinforce avoidance pathways (10).

Instrumental learning may involve both conscious and subconscious processes. We recently demonstrated that healthy subjects can learn associations between cues and choice outcomes, even if the cues are masked and hence not consciously perceived (11). During performance of this subliminal conditioning task, prediction errors generated with a standard reinforcement learning algorithm were reflected in striatal activity, possibly due to dopaminergic inputs. However, the assumption that subconscious learning is actually driven by dopamine release in the striatum remains to be tested. It is noteworthy that learning is dramatically reduced in the subliminal compared to the unmasked condition, where the associations can be trivially acquired in one trial. Thus, conscious processes, notably the ability to keep in mind the cues and outcomes seen previously, seem important for a good learning performance, but are not necessary for a more limited acquisition of instrumental responses.

To our knowledge, the question of whether dopamine-related drugs affect conscious or subconscious learning-related processes has not been addressed so far. Here, we examined this issue by administering our subliminal conditioning paradigm to PD patients. The hypothesis was that the above-mentioned double dissociation, between reinforcement valence (reward or punishment) and medication status (off or on levodopa), could be replicated in subliminal conditions. To strengthen the demonstration, we also tested whether a reverse double dissociation could be observed in patients with Gilles de la Tourette’s syndrome (TS), which can be opposed to PD in terms of both

syndrome alleviated by dopamine receptor agonists. Medication effects were assessed between two groups of 12 TS patients on one hand and within one group of 12 PD patients on the other. Matched healthy controls (24 young and 12 older subjects) were also administered the same experimental paradigm. Disease effects were assessed by comparing each group of patients off medication with their matched control group. Subjects’ demographic and clinical features are displayed in Tables 1 and 2, respectively.

The subliminal conditioning task used three abstract cues that were paired with different monetary outcomes (−1€, 0€, +1€).

Author contributions: D.G., A.H., and M.P. designed research; S.P. and M.L. performed research; Y.W., D.G., and A.H. contributed new reagents/analytic tools; S.P. and M.P. analyzed data; and S.P. and M.P. wrote the paper.
The authors declare no conflict of interest.
This article is a PNAS Direct Submission.
1To whom correspondence should be addressed. E-mail: mathias.pessiglione@gmail.com.
This article contains supporting information online at www.pnas.org/cgi/content/full/0904035106/DCSupplemental.
www.pnas.org/cgi/doi/10.1073/pnas.0904035106 PNAS | November 10, 2009 | vol. 106 | no. 45 | 19179–19184
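The prediction-error learning rule described in the introduction (updating the value of the chosen option in proportion to the difference between the actual and the expected reward) can be illustrated with a minimal sketch. The function names, the learning rate, and the initial value below are illustrative assumptions and are not taken from the article.

```python
def rescorla_wagner_update(value, reward, learning_rate=0.1):
    """Return the updated value after one outcome.

    The prediction error is the difference between the actual reward
    and the current expectation; learning_rate (an assumed parameter)
    scales how strongly that error changes the value.
    """
    prediction_error = reward - value
    return value + learning_rate * prediction_error


def simulate_values(outcomes, learning_rate=0.1, initial_value=0.0):
    """Track the value of the risky choice for one cue across trials.

    outcomes: sequence of rewards received after risky choices,
    e.g. +1 for a gain cue or -1 for a loss cue (hypothetical input).
    """
    values = []
    value = initial_value
    for reward in outcomes:
        value = rescorla_wagner_update(value, reward, learning_rate)
        values.append(value)
    return values
```

With repeated +1 outcomes the value rises toward +1, and with repeated −1 outcomes it falls toward −1, which is the sense in which positive and negative prediction errors can support approach and avoidance, respectively.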
Table 1. Demographic data
Demographic features   PD (n = 12)   Seniors (n = 12)   TS Off (n = 12)   TS On (n = 12)   Juniors (n = 24)
Age (years)            57.0 ± 3.1    60.7 ± 2.7         21.3 ± 2.6        19.8 ± 2.6       22.3 ± 0.9
Sex (female/male)      1/11          5/7                3/9               2/10             12/12
Education (years)      10.3 ± 1.3    16.4 ± 1.0         11.3 ± 1.4        10.0 ± 0.9       15.1 ± 0.5
The cues were briefly flashed between two mask images, after which subjects had to choose between safe and risky options (Fig. 1). The safe choice means a null outcome for sure: no gain, no loss. A risky choice may result in a gain (+1€), a loss (−1€), or a neutral outcome (0€), depending on the cue. As they would not see the cues, subjects were encouraged to follow their intuition: to make a risky choice if they had the feeling they were in a winning trial or to make a safe choice if they felt it was a losing trial. For half of the subjects, the risky response was a “Go” (key press), and for the other half it was a “Nogo” (no key press). Thus the experimental design allowed measuring dependent variables for three orthogonal dimensions: the rate of Go response (motor impulsivity), risky choice (cognitive impulsivity), and monetary payoff (reinforcement learning). Note that if subjects always made the same response, or if they performed at chance, their final payoff would be zero. Hence a positive payoff indicates that some representation of cue–outcome contingencies had been acquired through conditioning. A separate visual discrimination task was subsequently conducted to assess the subjects’ sensitivity to differences between cues, presented with the same masking procedure as during conditioning. The rationale is that if subjects are unable to discriminate between cues, then they are a fortiori unable to build conscious representations of cue–outcome associations.

Results
All dependent measures in the different groups have been summarized in Table 3. We first tested motor and cognitive impulsivity measures (Go response and risky choice). There was no significant difference between PD and TS groups (all P > 0.1, two-tailed t tests) and no significant effect of medication, either in PD or TS (all P > 0.05, two-tailed t tests). These results were not necessarily expected given the motor and cognitive signs associated with the diseases and treatments, but they suggest that performance was not driven by a difficulty in pressing keys or a propensity to take risks.

Then we examined learning performance (monetary payoff) and discrimination sensitivity (d′). Monetary payoffs were significantly above zero, indicating a conditioning effect, in both PD and TS patients (PD, 1.1 ± 0.5€, t(11) = 2.1, P < 0.05; TS, 1.8 ± 0.5€, t(23) = 3.7, P < 0.001, one-tailed t test). In contrast, performance did not improve in the visual discrimination test, where subjects remained at chance level throughout the entire series of trials [see Fig. S1]. Like the impulsivity measures, payoffs and d′ were not affected by dopamine enhancers in PD or by dopamine blockers in TS (all P > 0.1, two-tailed t test). Note, however, that d′ values were numerically above zero in all situations, suggesting that learning effects may have been driven by some occasional conscious perception. To address this issue, we calculated correlations between d′ and payoffs: Pearson’s coefficients were around zero and nonsignificant (PD Off, r = 0.22, P > 0.5; PD On, r = 0.17, P > 0.5; TS Off, r = −0.13, P > 0.1; TS On, r = −0.29, P > 0.5), suggesting that learning effects were not driven by patients with above-chance discrimination performance.

After controlling for these potential confounding effects, we next examined the hypothesized double dissociation between reinforcement valence and medication status. We distinguished between reward and punishment learning in the calculation of monetary payoffs. Relative to the neutral condition, additional correct choices were considered as an index of reward learning in the gain condition and as an index of punishment learning in the loss condition. Note that subtracting the neutral condition removes the potential effects of motor and cognitive impulsivity. The number of correct choices was expressed as euros that subjects won for reward learning or avoided losing for punishment learning (Fig. 2A).

As expected, we observed that off-medication PD patients significantly learned to avoid punishments (1.3 ± 0.5€, t(11) = 2.8, P < 0.01, one-tailed t test) but not to get rewards (−0.3 ± 0.7€, t(11) = −0.5, P > 0.1, one-tailed t test). On-medication PD patients exhibited the opposite pattern: no punishment learning (−0.3 ± 0.5€, t(11) = −0.6, P > 0.1, one-tailed t test) but significant reward learning (1.5 ± 0.5€, t(11) = 2.9, P < 0.01, one-tailed t test). The reverse double dissociation was observed in TS patients: When off medication, they learned to obtain rewards (1.9 ± 1.0€, t(11) = 2.0, P < 0.05, one-tailed t test) but not to avoid punishments (0.0 ± 0.5€, t(11) = 0.1, P > 0.5, one-tailed t test) and when on medication, they failed to obtain rewards (0.1 ± 0.4€, t(11) = 0.3, P > 0.1, one-tailed t test) but successfully avoided punishments (1.6 ± 0.5€, t(11) = 3.0, P < 0.01, one-tailed t test). Having identified the combinations of medication status and reinforcement valence where patients did learn, we checked the correlations between d′ and learning in these situations (Fig. 2B). They were again close to zero and not significant in both PD patients (Off/punishment, r = 0.01, P > 0.5; On/reward, r = 0.01, P > 0.5) and TS patients (Off/reward, r = −0.20, P > 0.5; On/punishment, r = −0.29, P > 0.5). Moreover, regression lines crossed the y axis (d′ = 0) for positive payoffs in all situations, demonstrating the presence of conditioning effects in the absence of visual discrimination.

To verify that the double dissociations were due to differences in learning rates, we plotted the cumulative money won (for reward learning) and not lost (for punishment learning) as a function of trials (Fig. 3B). Linear regression coefficients (slopes) of these learning curves were extracted and tested for
Disease duration (years) 10.7 ± 1.2 Disease duration (years) 13.7 ± 2.9 12.3 ± 2.8
UPDRSIII score Off 28.7 ± 4.5 YGTSS/50 score 15.9 ± 1.6 18.3 ± 2.1
UPDRSIII score On 6.9 ± 1.6 YGTSS/100 score 33.4 ± 3.8 42.4 ± 4.0
Treatment Levodopa* Treatment — Risperidone Pimozide
Daily dose (mg/day) 850 ± 116 Daily dose (mg/day) — 2.3 ± 0.7 3.3 ± 2.3
*Dose is expressed as dopa-equivalent, taking into account both levodopa (all patients) and dopamine agonists (seven patients).
and a higher one when on medication. And relative to healthy young subjects, TS patients had a higher reward bias when off medication and a lower one when on medication. However, the differences being smaller than when comparing on and off states, the comparison with control subjects was significant only for Off PD

Medication effects on the reward bias therefore appear much more reliable than disease effects.

Discussion
To summarize, we extended the double dissociation between reinforcement valence and dopamine medication status, which
Monetary payoff (€)          1.0 ± 0.8      1.3 ± 0.6     0.6 ± 0.6     1.9 ± 0.6     1.8 ± 0.8     2.8 ± 0.9
Visual discrimination (d′)   0.14 ± 0.25†   0.33 ± 0.15   0.43 ± 0.14   0.37 ± 0.11   0.07 ± 0.14   0.05 ± 0.12
Payoff/d′ correlation (r)    0.22           0.17          −0.29         −0.13         0.29          0.22
Go responses (%)             50.9 ± 6.4     47.5 ± 6.2    48.1 ± 2.5    51.2 ± 5.1    49.6 ± 3.8    46.7 ± 3.8
Risky choices (%)            70.1 ± 2.4     55.6 ± 6.0    67.7 ± 2.2    65.7 ± 1.9    58.8 ± 2.8    63.4 ± 2.6
Reward obtained (€)          −0.3 ± 0.7     1.5 ± 0.5     0.9 ± 0.4     1.9 ± 1.0     0.1 ± 0.4     1.5 ± 0.8
Punishment avoided (€)       1.3 ± 0.5*     −0.3 ± 0.5    −0.2 ± 0.4    0.0 ± 0.5     1.6 ± 0.5     1.3 ± 0.7
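The "Reward obtained" and "Punishment avoided" measures, and the learning slopes extracted from cumulative curves, can be illustrated with a short sketch. This is one possible reading of the description in the text, not the authors' code: it assumes an equal number of trials per cue type, a 1€ stake per trial, and hypothetical function names and inputs.

```python
def split_payoff(risky_gain, risky_neutral, risky_loss):
    """Split the payoff into reward won and punishment avoided (euros).

    Inputs are counts of risky choices for the gain, neutral, and loss
    cues (assumed to have equal trial counts). The neutral count serves
    as a baseline for the general propensity to choose the risky option.
    """
    reward_won = risky_gain - risky_neutral
    # Safe choices avoid losses; with equal trial counts,
    # safe_loss - safe_neutral equals risky_neutral - risky_loss.
    punishment_avoided = risky_neutral - risky_loss
    return reward_won, punishment_avoided


def learning_slope(increments):
    """Least-squares slope of the cumulative sum of per-trial increments.

    increments: hypothetical per-trial contributions in euros
    (needs at least two trials).
    """
    cumulative = []
    total = 0.0
    for x in increments:
        total += x
        cumulative.append(total)
    n = len(cumulative)
    trials = range(1, n + 1)
    mean_t = sum(trials) / n
    mean_c = sum(cumulative) / n
    sxy = sum((t - mean_t) * (c - mean_c) for t, c in zip(trials, cumulative))
    sxx = sum((t - mean_t) ** 2 for t in trials)
    return sxy / sxx
```

Under these assumptions the two components sum to the total payoff (risky choices on gain cues minus risky choices on loss cues), which matches the way the payoff row relates to the last two rows of the table.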
Palminteri et al. PNAS | November 10, 2009 | vol. 106 | no. 45 | 19181
was originally demonstrated in PD patients by Frank and colleagues (8), to the subliminal case and to TS patients. In short, reinforcement learning was biased toward reward seeking when boosting dopamine transmission and toward punishment avoidance when blocking dopamine transmission. The effects were independent from factors such as discrimination sensitivity and motor or cognitive impulsivity, which were orthogonal to the reinforcement valence in our design. Moreover, these factors were not significantly affected by medication, suggesting that patients did not perceive the cues, press the button, or choose the risky response any more in the on- than in the off-medication state.

Despite the use of short duration and backward masking, we cannot formally ensure that all cues remained subliminal in all trials, as there is no direct window to the conscious mind. We nonetheless provide standard criteria that are generally considered as indirect evidence for nonconscious perception (12, 13). Verbal reports were recorded to assess the subjective criterion: When shown the unmasked cues, all subjects reported not having seen them previously. Discrimination performance was measured to assess the objective criterion: Learning effects were obtained even for a null d′, which indicates that subjects were unable to correctly decide whether two consecutive cues were the same or different. We therefore conclude that the learning processes affected by medications were largely subconscious. Masking was undoubtedly helped by the fact that subjects had no prior representation to guide visual search, since, contrary to most subliminal perception studies, the cues were never shown until the debriefing at the end of the experiment. Although they did not provide the above criteria for absence of awareness, some previous studies in PD reported deficits in implicit learning (8, 9, 14, 15). In these paradigms the cues are consciously perceived, but subjects fail to report explicitly the cue–outcome contingencies at debriefing, even if they previously expressed some knowledge of these contingencies in their motor responses. Debriefing tests have, however, been criticized as confounded by memory decay (16–18), so masking cues serves as a more stringent approach to limit conscious associations between cues and outcomes. Compared to implicit learning paradigms, such as probabilistic classification or transitive inference tasks, the Go/Nogo mode of response used here makes reinforcement learning more direct, with no need for building high-level representations of cue–outcome contingencies.

Our findings are in line with a growing body of evidence that reinforcement learning can operate subconsciously (19–23). More specifically, they extend a previous functional neuroimaging study using the same subliminal conditioning paradigm (11), which showed that reward prediction errors were reflected in the ventral striatum. A parsimonious explanation may be that dopamine enhancers and blockers, because they interfere with dopamine transmission, modulate the magnitude of prediction error signals, as was previously demonstrated during conscious instrumental learning (7). This would be compatible with Frank’s model (10), if we assume that dopamine enhancers and blockers have opposite effects both on positive prediction errors following rewards and on negative prediction errors following punishments. The drugs may impact the reinforcement of fronto-striatal synapses, which allegedly underlies the formal process of using prediction error as a teaching signal to update the value of the current cue, according to Rescorla and Wagner’s rule (1). At a lower level, the underlying mechanisms remain speculative, however, as it is unclear which dopamine receptors (D1, D2, or others) and which component of dopamine release (tonic, phasic, or a combination of both) are impacted by medications. Although we argue that the reinforcement process modulated by medications was subconscious, we do not imply that conscious feelings, when seeing the masks or the outcomes, remained unaffected. It remains, for instance, possible that subjects, even if not perceiving the cue itself, had a conscious positive feeling following a reward-predicting cue or a negative one after a punishment-predicting cue. Further experiments are needed to determine whether we can develop a conscious access to the value of cues that we do not consciously perceive.

The replication of the double dissociation in a second pathological condition (TS) suggests that our manipulation tapped into general dopamine-related mechanisms and not into peculiar dysfunction restricted to PD. Our findings potentially facilitate understanding not only dopamine-related drug effects but also dopamine-related disorders. The case for dopamine neuron degeneration in PD is well established (24), so from Frank’s model (10) it could be predicted that off-medication PD patients were impaired in reward learning but not in punishment avoidance. A lack of positive reinforcement following rewards might explain action selection deficits that are frequently reported in PD (14, 15, 25). Indeed, if an action is not reinforced when rewarded, selection of that action will not be facilitated in the future. A deficit in movement selection could also account for some motor symptoms, such as akinesia and rigidity, that are the hallmarks of PD. The double dissociation evidenced in PD may also provide insight into compulsive behaviors, such as pathological gambling, induced in these patients by dopamine agonists (26, 27). The explanation would be that due to dopamine agonists, repetitive behaviors would be more reinforced by rewarding outcomes than impeded by punishing consequences.

Fig. 3. Learning rates. (Left) Idiopathic Parkinson’s disease (PD) patients. (Right) Gilles de la Tourette’s syndrome (TS) patients. (A) Accumulation rates. Histograms in each graph show linear regression coefficients of corresponding learning curves below. Solid histograms represent medicated patients (on dopamine enhancers or blockers) whereas open histograms show unmedicated patients. Error bars are plus or minus between-subjects standard errors of the mean. (B) Accumulation curves. Graphs represent for each individual the cumulative sum of euros won (reward learning) or not lost (punishment learning) as a function of trials. The curves have been averaged across sessions and subjects. Medicated patients (on dopamine enhancers or blockers) are represented by solid squares and solid regression lines and unmedicated patients by open squares and dashed lines.
absence of dementia [Mini Mental State (MMS) score >25] and depression [Montgomery and Asberg Depression Rating Scale (MADRS) score <20]. Consequently, average MMS score was 27.7 ± 0.3, average MADRS score was 4.3 ± 0.8, and Hoehn and Yahr stage was 2.46 ± 0.10 in the “off” state and 2.17 ± 0.15 in the “on” state. Among the 12 patients, 5 were on levodopa alone, and 7 were also taking dopamine receptor agonists. For the sake of simplicity, we converted all medications to levodopa equivalents (Table 3) and we used the term dopamine enhancers to designate both levodopa and receptor agonists. Every patient was assessed twice, on the morning of 2 different days: once in the off state, after overnight (>12 h) withdrawal of levodopa and a full day (24 h) withdrawal of dopamine agonists, and once in the on state, 1 h after intake of habitual medication dose (levodopa in all patients + dopamine agonists in 7 of them). One patient included in the study could not complete the visual discrimination task in the off state due to excessive motor fatigue. Three patients were unable to perform the conditioning task in the off state and were therefore not included in the study.

TS patients were consecutive candidates screened for the French Reference Center for Gilles de la Tourette’s syndrome. Patients were at least 10 years old and did not present relevant comorbid conditions (depression, obsessive-compulsive disorder, and/or attention deficit with hyperactivity disorder). Treatment usually cannot be stopped in these patients for ethical reasons: It would leave patients in discomfort for too long during washout. However, some patients diagnosed with TS remain unmedicated, because their tics do

they could learn them only by observing the outcome, which was displayed at the end of the trial. This was a circled coin image (meaning +1€), a barred coin image (meaning −1€), or a gray square (meaning 0€).

The risky response was assigned to Go for half of task completions and to Nogo for the other half, such that motor aspects were counterbalanced between reward and punishment conditions. TS patients and junior controls were assessed only once and hence performed either the Go or the Nogo version of the task. Junior controls were randomly assigned to either the Go version for one half or the Nogo version for the other half. In TS, the task version was balanced with respect to the medication status, such that each of the four combinations (Off/Nogo, Off/Go, On/Nogo, and On/Go) was administered in the same number of patients (n = 6). PD patients and senior controls were assessed twice, once on the Go version and once on the Nogo version. For senior controls the order of Go and Nogo task versions was simply alternated. In PD, the order was balanced with respect to the medication status, such that each of the four combinations (Off/Nogo–On/Go, Off/Go–On/Nogo, On/Nogo–Off/Go, and On/Go–Off/Nogo) was administered in the same number of patients (n = 3).

The perceptual discrimination task was used as a control for awareness at the end of conditioning sessions. Hence it was administered once in TS patients and junior controls and twice in PD patients and senior controls. In this task, subjects were flashed two masked cues, 3 s apart, displayed on the center of a computer screen, each following a fixation cross. As there were 60
trials, each cue was presented 40 times, which is more than in conditioning sessions (30 times). Subjects had to report whether or not they perceived any difference between the two visual stimulations. The response was given manually, by pressing one of two keys assigned to “same” and “different” choices. Importantly, subjects had no opportunity to see the cues unmasked, so they could not get any prior information about what these cues look like. Note that the three cues used in the perceptual discrimination control were different from those used in instrumental learning sessions, to avoid subjects distinguishing cues on the basis of their learned values. At the end of the experiment, subjects were debriefed about whether or not they could perceive any part of the cues. They were also shown the cues unmasked one by one and asked whether or not they had seen them before. No included subject reported having seen any cue.

Statistical Analysis. From the conditioning task we extracted the percentages of Go and risky responses, which can be taken as indirect measures of motor and cognitive impulsivity, respectively. We also extracted the number of correct choices, which is equivalent to the monetary payoff. The payoff can then be split into euros won for the reward condition and euros not lost for the punishment condition. To correct for motor and cognitive bias, we subtracted the correct choices made in the neutral condition, which captures the propensity to make a Go response and a risky choice. To display learning progression, we plotted the cumulative money won (reward learning) or not lost (punishment learning) across trials. A linear regression was fitted on these learning curves, and coefficients (betas) were considered as an index of learning rates. From the visual discrimination task we calculated a sensitivity index (d′), as the difference between normalized rates of hits (correct different responses) and false alarms (incorrect different responses).

All data (demographic, clinical, or experimental) are reported as mean ± between-subjects standard error of the mean (SEM). To assess instrumental conditioning, we used one-tailed paired t tests comparing individual performances with chance level (which corresponds to a zero payoff). Similarly, to assess visual discrimination, we compared individual d′ with chance level (which is also zero), using one-tailed paired t tests. Within each pathological condition (PD or TS), we assessed medication effects by comparing dependent variables between On and Off states. We used within-group comparisons (paired two-tailed t tests) for PD patients, who were tested in the two medication states, and between-group comparisons (unpaired two-tailed t tests) for TS patients, who were either medicated or not. To assess disease effects relative to controls we performed between-group comparisons (unpaired two-tailed t tests). Finally, to assess significance of linear correlation between learning (payoff) and discrimination (d′) measures, we calculated Pearson’s coefficients. For all statistical tests the threshold for significance was set at P < 0.05.

ACKNOWLEDGMENTS. We are grateful to Helen Bates for helping with behavioral task administration and to Virginie Czernecki and Priscilla Van Meerbeeck for providing clinical data. We also thank Arlette Welaratne and all of the staff of the Centre d’Investigation Clinique for taking care of patients. Aman Saleem, Shadia Kawa, and Beth Pavlicek checked the English. S.P. received a Ph.D. fellowship from the Neuropôle de Recherche Francilien. The study was funded by the Ecole de Neurosciences de Paris.
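The sensitivity index d′ described in the Statistical Analysis (the difference between normalized hit and false-alarm rates) can be sketched as follows. This is a minimal illustration, assuming that "normalized" means z-transformed with the inverse standard normal distribution; the function name and the log-linear correction for extreme rates are assumptions added here, not details given in the article.

```python
from statistics import NormalDist


def d_prime(hits, misses, false_alarms, correct_rejections):
    """Sensitivity index: z(hit rate) - z(false-alarm rate).

    hits: "different" responses on trials where the cues differed.
    false_alarms: "different" responses on trials where they were the same.
    A log-linear correction (add 0.5 to counts, 1 to totals) is applied
    so that rates of exactly 0 or 1 do not give infinite z scores;
    this correction is an assumption of this sketch.
    """
    hit_rate = (hits + 0.5) / (hits + misses + 1)
    fa_rate = (false_alarms + 0.5) / (false_alarms + correct_rejections + 1)
    z = NormalDist().inv_cdf  # inverse CDF of the standard normal
    return z(hit_rate) - z(fa_rate)
```

A subject who responds "different" equally often on same and different trials gets d′ = 0, the chance level against which individual values are tested.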
1. Rescorla RA, Wagner AR (1972) A theory of Pavlovian conditioning: Variations in the effectiveness of reinforcement and nonreinforcement. Classical Conditioning II: Current Research and Theory, eds Black AH, Prokasy WF (Appleton-Century-Crofts, New York), pp 64–99.
2. Sutton RS, Barto AG (1998) Reinforcement Learning. (MIT Press, Cambridge, MA).
3. Daw ND, Doya K (2006) The computational neurobiology of learning and reward. Curr Opin Neurobiol 16(2):199–204.
4. O’Doherty JP, Hampton A, Kim H (2007) Model-based fMRI and its application to reward learning and decision making. Ann N Y Acad Sci 1104:35–53.
5. Schultz W, Dayan P, Montague PR (1997) A neural substrate of prediction and reward. Science 275(5306):1593–1599.
6. Waelti P, Dickinson A, Schultz W (2001) Dopamine responses comply with basic assumptions of formal learning theory. Nature 412(6842):43–48.
7. Pessiglione M, Seymour B, Flandin G, Dolan RJ, Frith CD (2006) Dopamine-dependent prediction errors underpin reward-seeking behaviour in humans. Nature 442(7106):1042–1045.
8. Frank MJ, Seeberger LC, O’Reilly RC (2004) By carrot or by stick: Cognitive reinforcement learning in parkinsonism. Science 306(5703):1940–1943.
9. Cools R, Altamirano L, D’Esposito M (2006) Reversal learning in Parkinson’s disease depends on medication status and outcome valence. Neuropsychologia 44(10):1663–1673.
10. Frank MJ (2005) Dynamic dopamine modulation in the basal ganglia: A neurocomputational account of cognitive deficits in medicated and nonmedicated Parkinsonism. J Cogn Neurosci 17(1):51–72.
11. Pessiglione M, et al. (2008) Subliminal instrumental conditioning demonstrated in the human brain. Neuron 59(4):561–567.
12. Kouider S, Dehaene S (2007) Levels of processing during non-conscious perception: A critical review of visual masking. Philos Trans R Soc Lond B Biol Sci 362(1481):857–875.
13. Dehaene S, Changeux JP, Naccache L, Sackur J, Sergent C (2006) Conscious, preconscious, and subliminal processing: A testable taxonomy. Trends Cogn Sci 10(5):204–211.
14. Knowlton BJ, Mangels JA, Squire LR (1996) A neostriatal habit learning system in humans. Science 273(5280):1399–1402.
15. Shohamy D, et al. (2004) Cortico-striatal contributions to feedback-based learning: Converging data from neuroimaging and neuropsychology. Brain 127(Pt 4):851–859.
16. Lagnado DA, Newell BR, Kahan S, Shanks DR (2006) Insight and strategy in multiple-cue learning. J Exp Psychol Gen 135(2):162–183.
17. Lovibond PF, Shanks DR (2002) The role of awareness in Pavlovian conditioning: Empirical evidence and theoretical implications. J Exp Psychol Anim Behav Process 28(1):3–26.
18. Wilkinson L, Shanks DR (2004) Intentional control and implicit sequence learning. J Exp Psychol Learn Mem Cogn 30(2):354–369.
19. Morris JS, Ohman A, Dolan RJ (1998) Conscious and unconscious emotional learning in the human amygdala. Nature 393(6684):467–470.
20. Olsson A, Phelps EA (2004) Learned fear of “unseen” faces after Pavlovian, observational, and instructed fear. Psychol Sci 15(12):822–828.
21. Knight DC, Nguyen HT, Bandettini PA (2003) Expression of conditional fear with and without awareness. Proc Natl Acad Sci USA 100(25):15280–15283.
22. Seitz AR, Kim D, Watanabe T (2009) Rewards evoke learning of unconsciously processed visual stimuli in adult humans. Neuron 61(5):700–707.
23. Li W, Howard JD, Parrish TB, Gottfried JA (2008) Aversive learning enhances perceptual and cortical discrimination of indiscriminable odor cues. Science 319(5871):1842–1845.
24. Braak H, Del Tredici K (2008) Invited article: Nervous system pathology in sporadic Parkinson disease. Neurology 70(20):1916–1925.
25. Pessiglione M, et al. (2005) An effect of dopamine depletion on decision-making: The temporal coupling of deliberation and execution. J Cogn Neurosci 17(12):1886–1896.
26. Voon V, Potenza MN, Thomsen T (2007) Medication-related impulse control and repetitive behaviors in Parkinson’s disease. Curr Opin Neurol 20(4):484–492.
27. Lawrence AD, Evans AH, Lees AJ (2003) Compulsive use of dopamine replacement therapy in Parkinson’s disease: Reward systems gone awry? Lancet Neurol 2(10):595–604.
28. Singer HS (2005) Tourette’s syndrome: From behaviour to biology. Lancet Neurol 4(3):149–159.
29. Albin RL, Mink JW (2006) Recent advances in Tourette syndrome research. Trends Neurosci 29(3):175–182.
30. Leckman JF (2002) Tourette’s syndrome. Lancet 360(9345):1577–1586.
31. Wong DF, et al. (2008) Mechanisms of dopaminergic and serotonergic neurotransmission in Tourette syndrome: Clues from an in vivo neurochemistry study with PET. Neuropsychopharmacology 33(6):1239–1251.
32. Tarnok Z, et al. (2007) Dopaminergic candidate genes in Tourette syndrome: Association between tic severity and 3′ UTR polymorphism of the dopamine transporter gene. Am J Med Genet B Neuropsychiatr Genet 144B(7):900–905.
33. Gilbert DL, et al. (2006) Altered mesolimbocortical and thalamic dopamine in Tourette syndrome. Neurology 67(9):1695–1697.
34. Yoon DY, et al. (2007) Dopaminergic polymorphisms in Tourette syndrome: Association with the DAT gene (SLC6A3). Am J Med Genet B Neuropsychiatr Genet 144B(5):605–610.
35. Schmidt L, et al. (2008) Disconnecting force from money: Effects of basal ganglia damage on incentive motivation. Brain 131(Pt 5):1303–1310.