A feature-specific prediction error model explains dopaminergic heterogeneity

biorxiv(2024)

引用 13|浏览13
暂无评分
摘要
The hypothesis that midbrain dopamine (DA) neurons broadcast an error for the prediction of reward (reward prediction error, RPE) is among the great successes of computational neuroscience1-3. However, recent results contradict a core aspect of this theory: that the neurons uniformly convey a scalar, global signal. For instance, when animals are placed in a high-dimensional environment, DA neurons in the ventral tegmental area (VTA) display substantial heterogeneity in the features to which they respond, while also having more consistent RPE-like responses at the time of reward4. We argue that the previously predominant family of extensions to the RPE model, which replicate the classic model in multiple parallel circuits, are ill-suited to explaining these and other results concerning DA heterogeneity within the VTA. Instead, we introduce a complementary "feature-specific RPE" model positing that DA neurons within VTA report individual RPEs for different elements of a population vector code for an animal's state (moment-to-moment situation). To investigate this claim, we train a deep reinforcement learning model on a navigation and decision-making task and compare the feature-specific RPE derived from the network to population recordings from DA neurons during the same task. The model recapitulates key aspects of VTA DA neuron heterogeneity. Further, we show how our framework can be extended to explain patterns of heterogeneity in action responses reported among SNc DA neurons5. Thus, our work provides a path to reconcile new observations of DA neuron heterogeneity with classic ideas about RPE coding, while also providing a new perspective on how the brain performs reinforcement learning in high dimensional environments. ### Competing Interest Statement The authors have declared no competing interest.
更多
查看译文
关键词
prediction,heterogeneity
AI 理解论文
溯源树
样例
生成溯源树,研究论文发展脉络
Chat Paper
正在生成论文摘要