Thesis information

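Chinese title:

 Study on the patterns of change in macaque primary visual cortex during reward associative learning (猕猴初级视觉皮层在奖赏关联学习中的变化规律研究)

Name:

 Zhang Yange (张艳歌)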

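Confidentiality level:

 Public

Thesis language:

 chi

Discipline code:

 04020002

Discipline/major:

 02 Cognitive Neuroscience (040200)

Student type:

 Doctoral

Degree:

 Doctor of Science

Degree type:

 Academic degree

Degree year:

 2024

Campus:

 Beijing campus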

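School:

 Institute of Brain and Cognitive Science (脑与认知科学研究院)

Research area:

 Neural information processing and computation

Primary advisor:

 Xing Dajun (邢大军)

Primary advisor's affiliation:

 Faculty of Psychology (心理学部)

Submission date:

 2024-06-05

Defense date:

 2024-05-29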

Foreign-language title:

 STUDY ON THE CHANGES OF PRIMARY VISUAL CORTEX DURING REWARD ASSOCIATIVE LEARNING IN PRIMATES    

Chinese keywords:

 清醒猕猴 ; 初级视觉皮层 ; 瞳孔 ; 基于奖赏的关联学习 ; 多认知因素    

Foreign-language keywords:

 awake macaque ; primary visual cortex ; pupil ; reward-based associative learning ; multiple cognitive factors    

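Chinese abstract:

Reward-based visual associative learning is a very common form of learning. By repeatedly associating rewards with specific visual stimuli in the environment, it builds experience and markedly improves the ability of humans and animals to adapt to their environment, a process that also requires the visual system. The primary visual cortex (V1), as the first cortical stage of visual processing, is essential for transmitting and perceiving visual information, yet how V1 changes during reward associative learning remains unclear.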

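On the one hand, existing findings on how reward associative learning affects the response strength of macaque V1 are inconsistent. Early fMRI studies found that reward associative learning weakens V1 responses, whereas recent intrinsic optical imaging studies suggest that it strengthens them. Against this contradictory background, although some studies have found that the response strength of macaque V1 neurons is highly correlated with the size of the reward associated with a visual stimulus, it remains unclear exactly how V1 neurons change during reward associative learning and what coding scheme they use to represent stimuli associated with different reward sizes differently.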

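On the other hand, the cognitive factors that drive changes in V1 responses during reward associative learning, and their neural mechanisms, are not yet clear. The prevailing view is that several neural mechanisms could underlie the changes in V1 responses: they may result from changes in local connections within V1, or from changes in the influence of other brain regions on V1 (higher-order areas such as the frontal and parietal lobes involved in value assessment, or subcortical nuclei related to reward). Which one or more of these mechanisms is related to the changes in V1 responses during reward associative learning, and which cognitive factors are involved, still needs experimental support.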

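To answer these questions, this study used an animal model. Macaques were trained in a Pavlovian paradigm to associate visual stimuli of different orientations with different reward levels. Combining the animals' involuntary behavior (pupil) with V1 electrophysiological recordings, we carried out the following three parts of research: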

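In the first part, we studied how V1 responses change during reward associative learning. We found that after the animals completed reward associative learning, the visual stimulus associated with high reward (HRS) evoked higher V1 firing rates than the stimulus associated with low reward (LRS). This difference grew gradually over the course of learning and strengthened the representation of the HRS by the V1 neuronal population. We then characterized the dynamic responses of V1 neurons within a single experimental trial. The results suggest that V1 responses during reward associative learning may be influenced by several factors: a fast inhibitory component acting early in stimulus presentation, and a slow inhibitory component and a slow excitatory component acting in the middle and late periods of stimulus presentation. The fast inhibitory component shows no stimulus selectivity and no differences between neuronal populations; it acts more strongly in the fixation period, so V1 responses in the learning period are slightly stronger than in the fixation period. The slow inhibitory component also shows no stimulus selectivity and no population differences, and it weakens the responses of all neuronal populations to all stimuli in the learning period. The slow excitatory component is stimulus-specific and population-specific: it enhances the V1 responses evoked by the HRS in the learning period, with greater enhancement in the neuronal population whose preferred orientation matches the HRS. Under the joint influence of the slow inhibitory and slow excitatory components, in the learning period (relative to the fixation period) the mean V1 responses evoked in the middle of HRS and LRS presentation both decreased substantially (we label this decrease V1 decrease), and the mean V1 response evoked late in HRS presentation was stronger than that evoked by the LRS (we label this difference V1 difference). Across neuronal populations, in the learning period (relative to the fixation period), the size of the response decrease in the middle and late periods of LRS presentation did not differ between populations; in the middle of HRS presentation the responses of all neurons decreased, with a larger decrease in neurons whose preferred orientation did not match the HRS; and late in HRS presentation only neurons whose preferred orientation did not match the HRS decreased their responses, while neurons whose preferred orientation matched the HRS showed a slight enhancement. We speculate that the slight enhancement of V1 responses early in stimulus presentation results from changes in local connections within V1, whereas the changes in V1 responses in the middle and late periods may result from changes in other brain regions that influence V1.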

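To further explore which cognitive factors in reward associative learning are related to the changes in V1 responses in the middle and late periods of stimulus presentation, we carried out the second and third parts of the study.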

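In the second part, we attempted to separate, from the pupil response, the different cognitive factors involved in reward associative learning. By characterizing the dynamic pupil response, we found that gradual changes in the pupil response over long-term learning reflect changes in several of the animal's cognitive states. We successfully isolated multiple factors from the pupil response, including arousal level (BL), fixation drift (FDE), visual stimulus (SE), reward state (RSE), and reward expectation (REE). By further characterizing how these factors change during associative learning, we found two components that are highly correlated with the learning process: the RSE component is related only to the task rules, whereas the REE component is related both to the task rules and to the learned content.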

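In the third part, we explored the relationship between the features of V1 response change in the middle and late periods (V1 decrease and V1 difference) and the multiple cognitive factors in the pupil response. V1 decrease captures the non-stimulus-selective reduction in neuronal responses in the middle of stimulus presentation, and the size of the reduction grows over learning; V1 difference captures the difference in neuronal responses to the different visual stimuli (HRS and LRS) late in stimulus presentation, and it grows gradually over learning. We found that these two V1 features are highly correlated with two cognitive components of the pupil response identified in the second part: V1 difference correlates more strongly with the reward expectation component (REE), and V1 decrease correlates more strongly with the reward state component (RSE). However, the changes of the two V1 features during learning are not correlated with each other, which implies that in the middle and late periods of stimulus presentation, changes in V1 responses may be shaped by two different neural mechanisms.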

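In summary, this study combined electrophysiological recording, pupil recording, and computational modeling to examine in depth how the dynamic responses of macaque V1 neurons change during long-term (2-3 months) reward associative learning, and the possible neural mechanisms. It provides electrophysiological evidence bearing on previous contradictory findings about how reward associative learning affects V1 response strength, and explains, at the level of neuronal populations, why V1 neurons' representation of the high-reward stimulus (HRS) is enhanced. It addresses shortcomings of earlier work on V1 response changes during reward associative learning, adds to the understanding of dynamic pupil responses, and provides new experimental evidence that V1 response changes during reward associative learning may arise from multiple circuits (feedback from higher-order frontal and parietal areas and/or modulation by subcortical nuclei). This will help us better understand how reward associative learning affects visual information processing and basic perception.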

Foreign-language abstract:

Reward-based visual associative learning is a very common form of learning in which associations are repeatedly formed between rewards and specific visual stimuli in the environment, building experience. This process markedly improves the ability of humans and animals to adapt to their environment, and it requires the involvement of the visual system. The primary visual cortex (V1), the first cortical stage of visual processing, is crucial for the transmission and perception of visual information. However, how V1 changes during reward-based associative learning remains unclear.

On the one hand, existing results on how reward-based associative learning affects the response strength of macaque V1 are inconsistent. Early fMRI studies found that reward-based associative learning weakens V1 responses, whereas recent intrinsic optical imaging studies suggest that it enhances them. Although some studies have found a strong correlation between the response strength of macaque V1 neurons and the size of the reward associated with a visual stimulus, it remains unclear exactly how V1 neurons change during reward-based associative learning and what coding scheme produces differential representations of stimuli associated with different reward sizes.

On the other hand, the cognitive factors and neural mechanisms underlying the changes in V1 responses during reward-based associative learning remain unclear. The prevailing view is that several mechanisms could contribute to these changes. They might involve changes in local connections within V1, or changes in the influence of other brain regions on V1 (higher-order areas such as the frontal and parietal lobes involved in value assessment, or subcortical nuclei related to reward). However, which of these neural mechanisms, alone or in combination, and which cognitive factors are related to the changes in V1 responses during reward-based associative learning still needs experimental support.

To address these questions, this study used an animal model. Macaques were trained in a Pavlovian paradigm to associate visual stimuli of different orientations with different levels of reward. The study comprised three parts, combining measurements of the animals' involuntary behavior (pupillary response) with V1 electrophysiological recordings.

In the first part, we investigated how V1 responses change during reward-based associative learning. We found that after the animals completed reward-based associative learning, V1 neurons fired at higher rates in response to the visual stimulus associated with high reward (HRS) than to the stimulus associated with low reward (LRS). This difference grew gradually over the course of learning, enhancing the representation of the HRS within the V1 neuronal population. We then characterized the dynamic responses of V1 neurons within individual experimental trials. Our results suggest that V1 responses during reward-based associative learning may be influenced by multiple factors, including a fast inhibitory component acting early in stimulus presentation, and a slow inhibitory component and a slow excitatory component acting in the middle and late periods of stimulus presentation. The fast inhibitory component showed no stimulus selectivity or population differences and had a stronger effect during the fixation phase, resulting in slightly stronger V1 responses during the learning phase than during the fixation phase. The slow inhibitory component also showed no stimulus selectivity or population differences, and it attenuated the responses of all neuronal populations to all stimuli during the learning phase. The slow excitatory component was stimulus-specific and population-specific: it enhanced V1 responses to the HRS during the learning phase, with greater enhancement in the neuronal population whose preferred orientation matched the HRS. Under the combined influence of the slow inhibitory and slow excitatory components, during the learning phase (relative to the fixation phase), the average V1 responses in the middle period of both HRS and LRS presentation decreased markedly (referred to as V1 decrease), and the average V1 response in the late period of HRS presentation was stronger than that for the LRS (referred to as V1 difference). Across neuronal populations, during the learning phase (relative to the fixation phase), the response decrease in the middle and late periods of LRS presentation did not differ between populations. In the middle period of HRS presentation, the responses of all neurons decreased, with a greater decrease in neurons whose preferred orientation was incongruent with the HRS. In the late period of HRS presentation, only neurons whose preferred orientation was incongruent with the HRS showed a decrease in response, while neurons whose preferred orientation was congruent with the HRS showed a slight enhancement. We speculate that the slight enhancement in V1 responses early in stimulus presentation may be due to changes in local connections within V1, while the changes in V1 responses in the middle and late periods may be due to changes in other brain regions that influence V1.

To further explore the cognitive factors related to the changes in the V1 response in the mid and late stimulus presentation periods during reward-based associative learning, we conducted the second and third parts of the study.

In the second part, we aimed to separate, from the pupillary response, the different cognitive factors involved in reward-based associative learning. By characterizing the dynamics of the pupil response, we found that gradual changes in the pupil response over the extended learning process reflected changes in multiple cognitive states. We successfully isolated several factors from the pupillary response, including arousal level (BL), fixation drift (FDE), visual stimulus (SE), reward state (RSE), and reward expectation (REE). By further characterizing how these factors changed over the course of associative learning, we found two components that were highly correlated with the learning process. Specifically, the RSE component was related only to the task rules, whereas the REE component was related both to the task rules and to the learned content.

In the third part, we explored the relationship between the features of V1 response change in the middle and late periods of stimulus presentation (V1 decrease and V1 difference) and the cognitive factors identified in the pupil response. V1 decrease reflected a reduction in neuronal responses in the middle period of stimulus presentation that was not stimulus-selective and that grew over the course of learning. V1 difference reflected the difference in responses to the different visual stimuli (HRS and LRS) in the late period of stimulus presentation, which increased gradually over the course of learning. We found that these two V1 features were highly correlated with two cognitive components of the pupil response identified in the second part of the study. Specifically, V1 difference correlated more strongly with the reward expectation component (REE), while V1 decrease correlated more strongly with the reward state component (RSE). However, the changes in these two V1 features during learning were not correlated with each other, suggesting that the changes in V1 responses in the middle and late periods of stimulus presentation might be shaped by two different neural mechanisms.

In conclusion, this study combined electrophysiological recordings, pupil recordings, and computational modeling to examine the dynamic response changes of macaque V1 neurons, and their possible neural mechanisms, during long-term (2-3 months) reward-based associative learning. It provides electrophysiological evidence bearing on the contradictory experimental results regarding how reward-based associative learning influences V1 response strength and, from a neuronal-population perspective, explains why the representation of the high-reward-associated stimulus (HRS) by V1 neurons is enhanced. The study addresses shortcomings of previous research on changes in V1 responses during reward-based associative learning, adds to the understanding of dynamic pupil responses, and offers new experimental evidence that changes in V1 responses during reward-based associative learning may arise from multiple circuits (feedback from higher-order areas in the frontal and parietal lobes and/or modulation from subcortical nuclei). This will help us better understand the impact of reward-based associative learning on visual information processing and basic perception.

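Total number of references:

 127

Holding location:

 Library thesis reading area (Main Library, South Wing, 3rd floor, Area BC)

Call number:

 博040200-02/24004

Release date:

 2025-06-05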
