Action prediction error, a value-free dopaminergic teaching signal driving stable associations in the tail of striatum. II: Behavioral and causal evidence
The stimulus-action associations of auditory-discrimination tasks are thought to be learned in the tail of the striatum (TS), where dopamine would modulate synaptic weights to pair appropriate responses with specific contexts. Classical theories of reinforcement learning, however, cannot account for this, as TS dopamine levels are related to movement and not to reward. The function of this value-free dopamine signal is therefore unclear. Here, we propose that dopamine in the TS acts as a teaching signal, providing an action prediction error that establishes stable stimulus-response associations and representing the value-free component of a dual action-controller model. Striatal inactivations demonstrate that the TS is specifically required to discriminate frequencies, and that the direct and indirect pathways influence behavior in opposite directions. Chronic ablations of the TS impair learning of the frequency discrimination task, and ablation of the dopamine cells that project to the TS recapitulates these learning deficits, suggesting that dopamine acts as a teaching signal. Supporting this, manipulation of dopamine levels in the TS biases animals’ choices when the manipulation coincides in time with the action taken, and not with the outcome, consistent with this signal representing an action prediction error. Finally, we show that a dual value-free / value-based action-controller model predicts the effects of acute inhibitions at different stages of learning, with the cell types in the TS having a bigger impact on behavior as learning progresses. Altogether, our results suggest that dopamine entrains the TS with a value-free signal to store stable stimulus-response associations.
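To make the distinction between a value-free and a value-based teaching signal concrete, the following is a minimal illustrative sketch, not the model fitted in this work. It assumes a two-stimulus, two-choice task, a delta-rule update for each controller, and a softmax over their summed outputs; all function names, learning rates, and the combination rule are hypothetical choices made for illustration only.

```python
import math
import random


def ape_update(habit, action, alpha):
    """Value-free update (assumed form): move each action prediction toward
    1 for the action just taken and toward 0 otherwise. Reward is not used."""
    return [h + alpha * ((1.0 if i == action else 0.0) - h)
            for i, h in enumerate(habit)]


def rpe_update(values, action, reward, alpha):
    """Value-based update (assumed form): move the chosen action's value
    toward the obtained reward; unchosen values are left unchanged."""
    out = list(values)
    out[action] += alpha * (reward - out[action])
    return out


def choose(values, habit, beta, rng):
    """Softmax choice over the summed drive of both controllers
    (the additive combination is an assumption of this sketch)."""
    drive = [beta * (v + h) for v, h in zip(values, habit)]
    top = max(drive)  # subtract the maximum for numerical stability
    weights = [math.exp(d - top) for d in drive]
    return rng.choices(range(len(drive)), weights=weights)[0]


def simulate(n_trials=500, alpha_v=0.2, alpha_h=0.02, beta=5.0, seed=0):
    """Run a hypothetical two-stimulus discrimination task in which the
    rewarded action index equals the stimulus index."""
    rng = random.Random(seed)
    values = {s: [0.0, 0.0] for s in (0, 1)}  # value-based controller
    habit = {s: [0.0, 0.0] for s in (0, 1)}   # value-free controller
    for _ in range(n_trials):
        s = rng.randrange(2)
        a = choose(values[s], habit[s], beta, rng)
        r = 1.0 if a == s else 0.0
        values[s] = rpe_update(values[s], a, r, alpha_v)
        habit[s] = ape_update(habit[s], a, alpha_h)
    return values, habit
```

In this sketch the value-free weights depend only on which action was taken, so a perturbation applied at the time of the action would alter them, whereas one applied at the time of the outcome would not. Giving the value-free controller a slower learning rate is one simple way to let its contribution grow as training proceeds; whether this matches the dynamics of the model described above is not established by this example.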