Is conceptual knowledge derived from sensory/motor experience (embodiment) or from language? Empirical neuroscience suggests that it is grounded in sensory embodiment, with conceptual processing organized along modality-specific channels of the human brain such as vision, hearing, and movement. On the other hand, the recent success of large language models such as OpenAI's has made clear the powerful ability of language to abstract and compress the structure of the world. In the typical human brain, the two are difficult to disentangle: language is mainly used to describe what we see, hear, and otherwise perceive, the sensory modalities are strongly correlated with one another, and the content and structure of the information obtained from different modalities are related as well. Because of this general cross-modal correlation, we do not know whether the knowledge spaces constructed by these modalities are in fact identical. If they differ, which information channel do humans use to construct knowledge?
Shape, a particularly basic property for identifying objects, is one of the most important pieces of information we use to perceive and understand the world. We can see shapes through vision, feel shapes through touch, and acquire shape knowledge through language. Although the vocabulary used to describe shapes is relatively simple, a great deal of complex shape information may be embedded in the relations among those words (many hidden structures). In the typical human brain, and even in the brains of congenitally blind people (who lack visual experience), the visual cortex and its ventral and dorsal pathways can all decode object shape information. Although there is mounting evidence for a shared cognitive and neural representation space between visual and tactile shape, previous research has tended to rely on dissimilarity structures between objects and has not examined the detailed properties of shape representation in the absence of vision. So, without vision, how does the human brain represent shapes? Are shape representations formed through touch the same as those obtained through vision? What happens to shape representations when touch is also unavailable? To what extent can language play a role in shape representation?
To answer these questions, we selected three domains of familiar objects with varying levels of tactile exposure, namely tools, large nonmanipulable objects, and animals, and conducted three explicit shape-knowledge production experiments (verbal feature generation, clay modelling with Play-Doh, and drawing) with congenitally blind and sighted participants. In a further experiment, we selected two categories of familiar non-object concrete concepts with little or almost no tactile exposure (fig-weather/fig-scenes and nonfig-weather/nonfig-scenes) and asked participants to produce 2D drawings of these concepts. Using multiple methods, including feature coding, large language model text analysis, subjective similarity evaluation, computer vision, and multidimensional scaling (MDS), we conducted four studies to reveal the effects of vision, touch, and language on the shape representation of objects and non-object concrete concepts, from the perspectives of the shape attributes of verbal features, the overall quality of shape knowledge production, and inter-subject consistency.
Study 1 focused on the impact of visual deprivation on the shape space of verbal features. We categorized the generated features by manual coding, quantified the shape attributes of the verbal features using an AI language model, and then analyzed the distribution of feature types, the semantic distance of features from shape words, and inter-subject consistency. We found that the blind group generated more multimodal sensory features than the sighted group, and that the features generated by blind participants were less strongly associated with shape words in language than those generated by sighted participants. For tools, however, the blind group showed higher inter-subject consistency than the sighted group, whereas for large objects the pattern was reversed; for animals, there was no group difference in inter-subject consistency. These findings indicate that the space of verbal shape features may not be fully immune to the absence of vision.
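The paper does not spell out the exact computation; a minimal sketch of the distance-from-shape-words analysis and a leave-one-out inter-subject consistency measure, assuming off-the-shelf GloVe embeddings loaded via gensim and hypothetical feature data (the study's actual language model and shape-word list may differ), could look like this:

```python
import numpy as np
import gensim.downloader as api

# Pretrained GloVe vectors stand in for the language model used in the
# study (an assumption; the original model is not reproduced here).
wv = api.load("glove-wiki-gigaword-300")

# Illustrative reference set of shape words, not the study's actual list.
SHAPE_WORDS = ["round", "square", "flat", "long", "curved", "pointed"]

def shape_distance(features):
    """Mean cosine distance between a subject's generated features and
    the shape words; smaller values = more shape-associated features."""
    sims = [wv.similarity(f, s)
            for f in features if f in wv
            for s in SHAPE_WORDS]
    return 1.0 - float(np.mean(sims))

def loo_consistency(freq):
    """Leave-one-out inter-subject consistency over per-subject feature
    frequency vectors (rows = subjects, columns = coded feature types):
    correlate each subject with the mean of all other subjects."""
    rs = []
    for i in range(freq.shape[0]):
        rest = np.delete(freq, i, axis=0).mean(axis=0)
        rs.append(np.corrcoef(freq[i], rest)[0, 1])
    return float(np.mean(rs))
```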
Study 2 explored the shape representation of objects with different levels of tactile exposure through a direct shape manifestation experiment (3D model making). We used subjective evaluation to measure the goodness of the models and a computer graphics approach to measure (dis)agreement within and across subject groups, and we further visualized the shape representation space for each object domain using MDS. The results showed that, at the group-average level, the absence of visual experience led to poorer 3D models for animals but did not affect performance for tools or large objects. However, the blind participants showed greater variation in their 3D models of tools than the sighted, and the tool models made by the blind group were shorter. The lack of visual experience thus did lead to specific changes in the shape representation of manipulable objects.
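As an illustration of the MDS step, the sketch below embeds a hypothetical precomputed model-to-model dissimilarity matrix into 2D with scikit-learn; the mesh-comparison metric that would actually produce the dissimilarities is not reproduced here, so random values stand in for it:

```python
import numpy as np
from sklearn.manifold import MDS

# Hypothetical pairwise dissimilarities between subjects' 3D models of
# the same object (in the study these would come from a computer
# graphics mesh-comparison measure, not random numbers).
rng = np.random.default_rng(0)
d = rng.random((20, 20))
dissim = (d + d.T) / 2          # symmetrize
np.fill_diagonal(dissim, 0.0)   # zero self-dissimilarity

# Project the models into 2D so within- and between-group spread
# in the shape space can be visualized.
mds = MDS(n_components=2, dissimilarity="precomputed", random_state=0)
coords = mds.fit_transform(dissim)
print(coords.shape)  # (20, 2)
```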
Study 3 examined how the absence of visual experience affects the production of 2D object shapes through another shape manifestation experiment (2D drawing). We recruited an independent group of raters to name and score the drawings so as to quantify their goodness, and used a multi-arrangement task to test whether blindness affects inter-subject alignment in object representation. The results showed that the lack of visual experience had the strongest impact on drawing quality for animals. For tools, with their rich tactile/manipulation experience, the blind group drew with quality similar to the sighted group, although vision still appeared to help align tool shape representations across individuals. The level of individual differences was comparable between the blind and sighted groups, and there were systematic differences between the groups, but it is difficult to determine whether these differences stem from visual experience or from drawing experience.
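For the multi-arrangement analysis, inter-subject alignment can be summarized as the mean pairwise correlation between subjects' representational dissimilarity matrices (RDMs); the sketch below makes that assumption, with simulated RDMs standing in for the behavioural distance data:

```python
import numpy as np
from itertools import combinations
from scipy.stats import spearmanr
from scipy.spatial.distance import squareform

# Hypothetical data: one RDM per subject, e.g. item-to-item distances
# recovered from the multi-arrangement task.
rng = np.random.default_rng(1)
n_items, n_subjects = 15, 10
rdms = []
for _ in range(n_subjects):
    d = rng.random((n_items, n_items))
    d = (d + d.T) / 2            # symmetrize
    np.fill_diagonal(d, 0.0)     # zero diagonal
    rdms.append(d)

def alignment(rdms):
    """Mean pairwise Spearman correlation between subjects' RDMs
    (upper-triangle entries only); higher = better alignment."""
    vecs = [squareform(r, checks=False) for r in rdms]
    rs = [spearmanr(a, b).correlation for a, b in combinations(vecs, 2)]
    return float(np.mean(rs))

# Comparing alignment(blind_rdms) with alignment(sighted_rdms) would
# give the group-level contrast described above.
print(alignment(rdms))
```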
Study 4 used non-object concrete concepts of varying visual importance as experimental materials, to examine whether the findings about object shapes extend to conceptual knowledge beyond traditional object shapes. We asked participants to complete a drawing task and analyzed drawing quality and inter-subject consistency. The results showed that the lack of visual experience affected the quality of shape production for non-object concrete concepts, and that despite differing tactile experience there was no significant difference in drawing quality between the two types of concepts. This suggests that, for the shape representation of non-object concrete concepts, whether a concept has an explicit shape that can be described in language may be the more important factor.
In summary, we found that even the representation of shape, a classic supramodal property, reflects an intricate orchestration of vision, touch, and language. The human brain draws on all of these sources of information together to construct an internal model of shape. These findings not only speak to the modality specificity of shape representation, but also highlight the limitations of current theories of general conceptual knowledge representation, and they have broader implications for the development of machine intelligence and for the neurorehabilitation of people with visual or hearing impairments.