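Chinese title: | 基于深度神经网络的动作表征空间探究 (An Exploration of the Action Representation Space Based on Deep Neural Networks) |
Name: | |
Confidentiality level: | Public |
Thesis language: | chi |
Subject code: | 071101 |
Discipline: | |
Student type: | Bachelor's |
Degree: | Bachelor of Science |
Degree year: | 2024 |
Campus: | |
School: | |
First supervisor name: | |
First supervisor affiliation: | |
Submission date: | 2024-05-30 |
Defense date: | 2024-05-09 |
Foreign-language title: | THE EXPLORATION OF ACTION SPACE BASED ON DEEP NEURAL NETWORK |
Chinese keywords: | |
Foreign-language keywords: | Deep neural network ; Action space ; Principal component analysis ; X3D network ; HAD dataset |
Chinese abstract: |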
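Actions are ubiquitous in daily life, and recognizing them matters greatly to us. How, then, do people recognize actions? To investigate action recognition, we must first establish how humans represent actions. Research shows that people organize actions in a multidimensional space, the action space. Several studies have explored the human action space, but they leave room for improvement. The main problem is that the action stimuli they used were limited in variety and in naturalness, so the action spaces they obtained are incomplete. To explore the action space more comprehensively, human participants would need to watch a large number of highly naturalistic action videos while their brain activity is recorded comprehensively. Such an experiment is prohibitively costly and time-consuming, and deep neural networks offer a way around this problem. Deep neural networks resemble the neural activity of the human brain and can process large amounts of video stimulus data quickly. This thesis therefore designs two studies that use a deep neural network model for action recognition, together with HAD, a large and varied set of natural action videos, to investigate the structure of the action representation space. Study 1 compares the representational dissimilarity matrices of the deep neural network model and of humans for the action videos in HAD, and shows that the model's and humans' action representations are significantly, though not highly, similar. Study 2 extracts the network's representations of the action stimuli and applies principal component analysis to obtain six principal components, which serve as dimension axes of the network's representational space. The principal components are then correlated with human-annotated semantic dimensions to interpret these axes. The first and second dimensions of the action space can be explained by the human annotations: the first is related to emotional arousal and the second to the transitivity of actions. The other four dimensions cannot be explained by the annotations. In summary, this study demonstrates the similarity between the deep neural network model and humans in action representation and finds that two of the dimensions correspond to human behavior. In the future, this finding can guide the search for corresponding functional organization in the human brain and offer new insight into the human action space and its neural organization. |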
Foreign-language abstract: |
Actions are ubiquitous in people's daily lives, and recognizing them is of great significance to us. How do people recognize actions? To clarify the process of action recognition, we must first clarify how humans represent actions. Research has shown that people organize actions in a multidimensional space known as the action space. Several studies have explored the human action space, but they leave room for improvement. The main problem is that the action stimuli used in these studies were not rich enough in type and not sufficiently naturalistic, which makes the action space obtained from each study incomplete. Therefore, to explore the action space more comprehensively, human subjects would need to watch a large number of highly naturalistic action videos while their brain activity is recorded comprehensively. However, this experimental method is extremely costly and time-consuming, and deep neural networks offer a way around this problem. Deep neural networks show similarities to neural activity in the human brain and can process large amounts of video stimulus data quickly. Therefore, this paper designs two studies that use a deep neural network model for action recognition and rely on HAD, a large and varied set of natural action videos, to explore the structure of the action representation space. The first study compares the representational dissimilarity matrices of the deep neural network model and of humans for the HAD action videos, showing that the model and humans have low but significant similarity in representing actions. The second study extracts the deep neural network's representations of the action stimuli and obtains six principal components from them using principal component analysis, yielding several dimensions of the network's representation space. Then, the principal components are correlated with human-labeled semantic dimensions to explain the meaning of these dimensions.
It is found that the first and second dimensions of the action space can be explained by the human labels. The first dimension is related to emotional arousal, while the second dimension is related to the transitivity of actions. The other four dimensions cannot be explained by the labels. In summary, this study demonstrates the similarity between the deep neural network model and humans in representing actions, and finds that two of the dimensions correspond to human behavior. In the future, we can use this finding to search for corresponding functional organization in the human brain and provide new insights into the human action space and its neural organization. |
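Total references: | 45 |
Author biography: | Wang Chenlong (王陈龙), undergraduate student of the 2020 entering cohort, Faculty of Psychology, Beijing Normal University. |
Total figures: | 7 |
Total tables: | 0 |
Call number: | 本071101/24084 |
Public release date: | 2025-05-30 |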
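
The two analyses summarized in the abstract can be illustrated with a minimal sketch. This is not the thesis's code: the data below are synthetic stand-ins rather than X3D activations or HAD annotations, and the choices of correlation distance for the dissimilarity matrices, Spearman correlation for comparing them, and all function and variable names are assumptions made for illustration only.

```python
import numpy as np

def rdm(features):
    """Dissimilarity matrix: 1 - Pearson r between stimulus rows (assumed metric)."""
    return 1.0 - np.corrcoef(features)

def upper_triangle(matrix):
    """Off-diagonal upper-triangle entries, one per stimulus pair."""
    i, j = np.triu_indices(matrix.shape[0], k=1)
    return matrix[i, j]

def rank(values):
    """Ordinal ranks; ties are not averaged, which is fine for continuous data."""
    order = np.argsort(values)
    ranks = np.empty(len(values), dtype=float)
    ranks[order] = np.arange(len(values))
    return ranks

def spearman(a, b):
    """Spearman correlation computed as the Pearson correlation of ranks."""
    return np.corrcoef(rank(a), rank(b))[0, 1]

def pca_scores(features, n_components):
    """PCA via SVD of the column-centered data; returns scores and variance ratios."""
    centered = features - features.mean(axis=0)
    u, s, _ = np.linalg.svd(centered, full_matrices=False)
    scores = u[:, :n_components] * s[:n_components]
    explained = (s ** 2) / np.sum(s ** 2)
    return scores, explained[:n_components]

# Synthetic stand-ins (hypothetical sizes; NOT the HAD videos or X3D features).
rng = np.random.default_rng(0)
n_videos, n_units = 60, 200
latent = rng.normal(size=(n_videos, 2))  # two hidden dimensions shared by both sources
model_features = (latent @ rng.normal(size=(2, n_units))) * 3.0 \
    + rng.normal(size=(n_videos, n_units))
human_features = latent @ rng.normal(size=(2, 30)) \
    + rng.normal(size=(n_videos, 30))

# Study 1 style comparison: correlate the two dissimilarity matrices.
rsa_rho = spearman(upper_triangle(rdm(model_features)),
                   upper_triangle(rdm(human_features)))

# Study 2 style analysis: six principal components, each correlated with a rating.
scores, explained = pca_scores(model_features, n_components=6)
ratings = latent[:, 0] + 0.5 * rng.normal(size=n_videos)  # hypothetical rated dimension
pc_rating_r = [np.corrcoef(scores[:, k], ratings)[0, 1] for k in range(6)]

print(f"model-human RDM correlation: {rsa_rho:.3f}")
print("PC vs. rating correlations:", np.round(pc_rating_r, 3))
```

With real data, `model_features` would hold one row of network activations per video and `ratings` one human annotation per video; a component whose correlation with a rating is reliably large would be a candidate for interpretation along that semantic dimension.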