The salt experiment cleanly dissociated incentive wanting from cached predictions gained by previous learning, without requiring serial CSs, because the previously learned CS value was negative and was dynamically reversed into positive valence at test by a natural specific appetite. It would similarly apply to univalent downshifts in incentive value from good to less good or neutral (e.g., physiological satiety or dopamine suppression). This vector's magnitude (Eqn 8) represents the extent to which the neuron's firing rates, x, y, z, are differentially modulated by the three types of stimuli (CS1, CS2, and UCS); in other words, it represents the variance of responses across the stimuli. Our additive model (Eqn 3b) can best capture these results, presuming that the motivational valence of incentive salience wanting reverses from negative to positive (similarly to how hedonic liking for intense salt reverses from disliking during the appetite state). Psychological counterparts include cognitive maps of goal outcomes, with values obtained from episodic memories of experience with those goals in states similar to current conditions, and understanding of act-outcome relationships needed to obtain those goals [17], [18], [89], [92]–[94]. Still, the point of this experiment was to show that the CS revaluation could occur as predicted by our model without experiencing the new UCS liking or wanting values, because rats were not allowed to taste the salt UCS in the new state until after the crucial test with the CSs. This is designated as the prediction or TD error-coding area. This shift corresponds to our motivational transform computation model, and to the idea that mesolimbic stimulations enhanced $\kappa$.
Crucially, for showing the dynamic nature of the incentive increase, we note that the enhancements of neural firing to CS2 produced by amphetamine and by drug sensitization were evident right away on the very first presentations of the CS2 in the new sensitization and/or amphetamine state. The computation is such that this angular value indexing a response profile exists in a continuum which 1) exhausts all possible firing patterns (i.e., relative orders in firing rates to these three types of stimuli); and 2) guarantees that nearby values represent similar firing patterns. A special case of incentive salience modulation is incentive-sensitization: this occurs when drugs in the brain sensitize mesolimbic dopamine-related systems, and a similar but temporary elevation of wanting can be produced by directly injecting amphetamine before a test [8], [13], [72]–[74]. Cached values are relatively stable, and able to produce the same optimal behavior across a wide range of homeostatic/motivational states. See Eqns (3a, 3b) of the manuscript. The temporal discount factor is $\gamma < 1$ and the sensitization manipulation produces $\kappa > 1$, thus $\gamma < \kappa$. Here $\kappa < 1$ represents devaluation (such as satiation), and $\kappa > 1$ represents enhancement (such as appetite or sensitization). The reinforcing and motivational salience-promoting effects of amphetamine are mostly due to enhanced dopaminergic activity in the mesolimbic pathway. That is, as predicted by our model, the incentive value of the CS2 was dynamically increased without need of re-learning about the CS-UCS association, and without additional pairings with the UCS in the transformed state [13]. How does our model (Eqns 3, 3a, 3b) contrast with a cached-value TD model or with a flexible tree-search model involving a cognitive state space?
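The profile vector mentioned above, with a magnitude that tracks response variance across stimuli and an angle that indexes the firing pattern, can be illustrated with a small Python sketch. The geometry used here is an assumption made for illustration only: each stimulus (CS1, CS2, UCS) gets one axis in a plane, the axes are placed 120 degrees apart, and the neuron's three firing rates are summed as vectors along those axes. The function name `profile_vector`, the axis angles, and the example rates are hypothetical and are not taken from the paper.

```python
import math

# Assumed layout (hypothetical): one axis per stimulus, 120 degrees apart.
AXIS_ANGLES = {
    "CS1": math.radians(90.0),
    "CS2": math.radians(210.0),
    "UCS": math.radians(330.0),
}


def profile_vector(cs1, cs2, ucs):
    """Return (magnitude, angle in degrees) of a neuron's profile vector.

    cs1, cs2, ucs are the firing rates x, y, z to the three stimuli.
    """
    rates = {"CS1": cs1, "CS2": cs2, "UCS": ucs}
    # Vector sum of the three rates along their assumed axes.
    px = sum(rates[name] * math.cos(a) for name, a in AXIS_ANGLES.items())
    py = sum(rates[name] * math.sin(a) for name, a in AXIS_ANGLES.items())
    magnitude = math.hypot(px, py)
    angle = math.degrees(math.atan2(py, px)) % 360.0
    return magnitude, angle


# A neuron firing equally to all stimuli has (numerically) zero magnitude.
flat_mag, _ = profile_vector(5.0, 5.0, 5.0)

# A neuron firing only to CS1 points along the CS1 axis.
cs1_mag, cs1_angle = profile_vector(1.0, 0.0, 0.0)

# A mixed profile, and the same profile with a shared baseline added.
mixed_mag, mixed_angle = profile_vector(3.0, 1.0, 2.0)
shifted_mag, shifted_angle = profile_vector(13.0, 11.0, 12.0)
```

Under this assumed layout the squared magnitude equals x² + y² + z² − xy − yz − zx, which is 1.5 times the sum of squared deviations of the three rates from their mean, so the magnitude grows with response variance and ignores any baseline shared by all three stimuli. The angle varies continuously as the relative ordering of the rates changes, so a change in which stimulus dominates appears as a rotation.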
Such dynamic enhancement of CS incentive salience is also consistent with other behavioral demonstrations of incentive motivation enhancement by pharmacological dopamine activation or by psychostimulant-induced neural sensitization [72],[73],[79],[80], even in the absence of additional learning [81]–[83]. By contrast, incentive-coding implies maximal firing to the CS2, which has the greatest motivational impact because it immediately precedes the actual reward. Such re-ordering fails to describe what happens in empirical cases of valence reversal, where the originally most liked reward may often still remain the best of a bad lot, becoming the least disliked as a physiological manipulation changes the valence of the entire group. The effects of the various mesolimbic dopaminergic activations can be visualized as the rotation of the Population Profile Vectors (Figure 4). Our prediction arises because in the serial Pavlovian conditioning paradigm, $r_1 = 0$ and $r_2 = r$, and Eqn (3a) indicates that $\tilde{V}(s_1) = \gamma r$, $\tilde{V}(s_2) = \kappa r$. Consider that cue-triggered wanting shoots up upon presentation of a CS but, importantly, also goes down again nearly as soon as the CS is taken away. When $\kappa = 1$, that is, when physiological state remains constant across training and test, our model reduces to the conventional temporal difference model. In all cases, the rats had not yet retasted the actual salt UCS when they showed new wanting of the CS. By contrast, our gain-control model of incentive salience (Eqn 3) posits that any mesolimbic activation by psychostimulant sensitization or by acute amphetamine administration will immediately modulate the neuronal coding of a signal that carries high incentive salience for a previously learned CS.
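The serial-conditioning prediction above can be checked numerically with a short Python sketch. It assumes that the incentive value of a state is the modulated immediate reward plus the discounted cached value of the next state, and that the multiplicative form multiplies the immediate reward by the gain factor. The function names, the discount value 0.9, and the gain value 2.0 are hypothetical choices for illustration.

```python
def cached_values(rewards, gamma):
    """Cached TD values V(s_t) = r_t + gamma * V(s_{t+1}).

    Computed backward from the end of the trial; the value after the
    last state is taken to be 0.
    """
    values = [0.0] * (len(rewards) + 1)
    for t in range(len(rewards) - 1, -1, -1):
        values[t] = rewards[t] + gamma * values[t + 1]
    return values


def incentive_values(rewards, gamma, kappa):
    """Assumed multiplicative form: kappa * r_t + gamma * V(s_{t+1}).

    Only the immediate reward is rescaled by kappa; the look-ahead term
    uses the cached value learned before the state change.
    """
    v = cached_values(rewards, gamma)
    return [kappa * rewards[t] + gamma * v[t + 1] for t in range(len(rewards))]


r, gamma = 1.0, 0.9          # illustrative reward size and discount factor
rewards = [0.0, r]           # r1 = 0 at CS1, r2 = r at CS2

baseline = incentive_values(rewards, gamma, kappa=1.0)    # unchanged state
sensitized = incentive_values(rewards, gamma, kappa=2.0)  # enhanced state

print("baseline  :", baseline)
print("sensitized:", sensitized)
```

With the gain factor at 1 the incentive values coincide with the cached TD values. With the gain factor above 1 the CS2 value rises immediately to the gain factor times r while the CS1 value stays at the discount factor times r, and no additional learning trial is simulated.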

In these cases, $\kappa$ is a gain-control factor that scales up (i.e., magnifies, when $\kappa > 1$) or down (i.e., shrinks, when $\kappa < 1$) the incentive salience of the reward. In the following, we will first propose a model for incentive salience that can incorporate dynamic modulation of cue-triggered wanting by even novel physiological states (e.g., hunger, thirst, salt appetite). Taken at face value, wanting, if represented linearly in VP firing, might merely change from zero to high during a particular state, without reversing valence. Here $\tilde{r}(r_t, \kappa)$ is a generic two-variable function which, in the following discussion, will be specialized to either of two forms (sub-types): a multiplicative form, $\tilde{r}(r_t, \kappa) = \kappa r_t$ (3a), and an additive form, $\tilde{r}(r_t, \kappa) = r_t + \log \kappa$ (3b). The crucial observation in the electrophysiological results was that in the new salt appetite state, the salt CS now elicited a high level of firing that was equal to or even higher than the sucrose CS in the salt appetite state [12]. The dynamic elevation in firing pattern to a salt CS (Figure 2) indicates that the change in physiological state produced a dynamic elevation of the incentive salience value of the relevant previously trained CS. Experimental design of the serial CS1/CS2/UCS procedure, and effects of sensitization and amphetamine on neuronal firing profiles in ventral pallidum (A). Note that the incentive value of a state $s_t$ is the motivationally modulated value of the immediate reward $r_t$ plus the discounted value of the expected reward in the next state $s_{t+1}$; both of these are loaded into the goal representation as $s_t$ is presented. Ordinarily, cognitive wanting and incentive salience 'wanting' act together to guide behavior toward the same goals, with incentive salience serving to add motivational oomph to cognitive representations.
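To make the difference between the two sub-types concrete, the following Python sketch compares a multiplicative and an additive modulation of a learned reward value. The specific forms used, gain times reward for the multiplicative case and reward plus the logarithm of the gain for the additive case, are assumptions chosen so that a gain of 1 leaves the reward unchanged, a gain below 1 lowers it, and a gain above 1 raises it. The function name `modulated_reward` and the numeric values are hypothetical.

```python
import math


def modulated_reward(r, kappa, form):
    """Return the state-modulated reward value for a gain factor kappa > 0.

    form="multiplicative": kappa * r        (assumed shape of Eqn 3a)
    form="additive":       r + log(kappa)   (assumed shape of Eqn 3b)
    """
    if kappa <= 0:
        raise ValueError("kappa must be positive")
    if form == "multiplicative":
        return kappa * r
    if form == "additive":
        return r + math.log(kappa)
    raise ValueError("form must be 'multiplicative' or 'additive'")


# Illustrative values: an aversive learned reward (e.g., intense salt)
# and a strong appetite state.
r_salt = -1.0
kappa_appetite = 10.0

mult_value = modulated_reward(r_salt, kappa_appetite, "multiplicative")
add_value = modulated_reward(r_salt, kappa_appetite, "additive")

print("multiplicative:", mult_value)  # stays negative for any positive kappa
print("additive      :", add_value)   # can cross zero when log(kappa) > |r|
```

In this sketch a positive multiplicative gain can only stretch or shrink a negative value, so it cannot reverse valence. The additive form shifts the value and can carry a negative learned value across zero, which is the kind of reversal described for the salt appetite case.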
In particular, the incentive salience mechanism is especially compatible with transient peaks in wanting being tied to CS presence because the Bindra-Toates rules of Pavlovian motivation specify that a synergy exists between CS presence and current mesolimbic state, which controls the intensity of motivation at each moment [1]–[5],[8],[96]. Incentive salience is a cognitive process that confers a "desire" or "want" attribute, which includes a motivational component, to a rewarding stimulus. Coupling to the CS is evident in behavioral PIT experiments, where lever-pressing peaks fade away as soon as the CS is removed (see Figure 4) even though the salt appetite, dopamine drug, or sensitization state that enhanced the cue's motivation-eliciting power persists. Each appetitive system would specifically modulate the values of its own UCS array (e.g., salt is the UCS reward for salt appetite) and simultaneously modulate most powerfully the specific CSs associated with its UCSs. For example, sensitization or an increased physiological appetitive state ($\kappa$ becomes greater than 1) might lead to a decrease in the temporal horizon [60], producing sharper temporal discounting effects, such that motivational value increases with degree of temporal proximity to the reward UCS [61]. Neither baseline lever-pressing rates nor neuronal firing rates were reliably enhanced at moments in between cues. For example, divergence can lead to irrational wanting in addiction for a target that the individual does not cognitively want, nor predictively expect to be of high value [7]–[9]. Only with relearning (i.e., post-sensitization learning about a surprising UCS) could the temporal difference prediction error signal reboost incentive salience attributions to the memory representation of the prior conditioned stimuli.
Enhancements were caused by relevant physiological changes, such as natural salt appetite, and addiction-related amphetamine intoxication and long-term sensitization that modulate brain mesolimbic systems involving dopamine. To the extent that computational models aim to capture real psychological cognition, retasting may remain an important feature of many model-based systems [17],[18],[21],[22],[28],[39],[40],[44]. Recently, Daw, Niv and Dayan have explicitly proposed an alternative tree model that can update without needing to retaste, using feedforward recalculation of goal value in a full look-ahead tree even before the goal is experienced [39],[95]. One reason for our current model to treat $\kappa$ as a multiplicative or additive parameter is that we wish to strongly distinguish incentive salience, as a motivation value which integrates learning and physiological inputs, from the stable, purely learned and cached value V per se (or for that matter even the Q value). Ordinarily, neurons in ventral pallidum that code CS for rewards fire to the onset of an auditory tone CS that previously predicted infusion of sucrose solution into the rat's mouth (right column) but not to a CS for intense salt solution (A). Data from [13]. The cached value-function of each state (CSs) would not have been adjusted until after the next encounter with primary reward (UCS); the UCS itself would be immediately modulated by the physiological shift but would still need to be presented to effect a re-evaluation of the CS. By contrast, incentive salience is modeled here with the feature of being able to globally modulate the on-line evaluation of previously learned values of a primary reward evoked by a CS. Such a model might be able to accomplish revaluations of CS value prior to UCS retasting such as those demonstrated in our experiments. Wrote the paper: JZ KCB JWA.
Those tests were conducted both in the absence and in the presence of an acute dose of amphetamine, on different days (see Figure 3 for the experimental design and timeline). The results of the amphetamine and sensitization experiment revealed that VP neurons ordinarily signalled best the prediction value of a CS, responding maximally to CS1 (Figure 3).