Thesis information

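Chinese title:	基于Transformer深度学习的运动层次化重定向方法
Author:	Liu Qi (刘琪)
Confidentiality level:	Public
Thesis language:	chi
Discipline code:	080901
Discipline:	Computer Science and Technology
Student type:	Bachelor's
Degree:	Bachelor of Engineering
Degree year:	2024
Campus:	Beijing campus
School:	School of Artificial Intelligence
First advisor:	Wang Xingce (王醒策)
First advisor's affiliation:	School of Artificial Intelligence
Submission date:	2024-06-13
Defense date:	2024-05-21
English title:	Transformer Based Motion Hierarchical Re-targeting Method
Chinese keywords:	运动重定向 (motion retargeting) ; 深度学习 (deep learning) ; 注意力机制 (attention mechanism)
English keywords:	Motion retargeting ; Deep Learning ; Transformer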
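Abstract (translated from Chinese):	Online sign-language teaching platforms built around virtual humans are of considerable value. When building the character animation for a virtual teacher, the virtual teacher's movements must match those of the real teacher, and motion retargeting is the foundation for achieving this. However, traditional motion retargeting techniques cannot maintain overall consistency between the full-body skeleton and the finger skeleton, and they usually require the source and target skeletons to have the same number of joints and the same topology. This project uses the Transformer, a deep learning technique, to implement a motion retargeting method whose key idea is to treat body parts as the basic retargeting units. The main contributions of this thesis are: 1) A body-part-based retargeting strategy is used to solve the hierarchical motion retargeting problem. The body is split into overlapping motion structural units for analysis, and experiments demonstrate the method's effectiveness on skeletons with both identical and different topologies. 2) A deep neural network automatically constructs a shared latent space between skeletons of different topologies, resolving the topological inconsistency problem in whole-body motion retargeting. Motion data from skeletons with different topologies is mapped into the shared latent space, and source motion features are encoded through an attention mechanism. Experiments demonstrate the method's accuracy. 3) A Transformer self-attention network is used for dynamic motion modeling; combined with the body-part strategy, the model retargets full-body motion (including the hand skeleton) more precisely. A pose-aware attention network dynamically predicts the joint weights within each body part, and feature aggregation constructs a shared latent space for each body part, giving better flexibility and adaptability. Experiments show that the model is stable, accurate, and extensible in motion retargeting.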
English abstract:	Online virtual sign language teaching platforms are of substantial significance. When constructing the animation of a virtual teacher, it is essential that the virtual teacher's actions match those of the real teacher, and motion retargeting is the foundation for achieving this. However, conventional motion retargeting techniques often fail to maintain overall consistency between the full-body and finger skeletons, and they typically require the source and target skeletons to have identical joint counts and topological structures. This project uses the Transformer, a deep learning technique, to implement a motion retargeting method whose key idea is to treat body parts as the fundamental retargeting units. The main contributions of this work are: 1) Addressing hierarchical motion retargeting with a body-part-based retargeting strategy. The body is divided into overlapping motion structural units for analysis, and experiments show that the approach is effective on skeletons with both identical and different topological structures. 2) Automatically constructing a shared latent space between skeletons with different topological structures using deep neural networks, which resolves the topological inconsistency problem in overall motion retargeting. Motion data from skeletons of different topologies is transformed into the shared latent space, and source motion features are encoded through an attention mechanism. Experimental results confirm the efficacy of this method. 3) Implementing dynamic motion modeling with Transformer self-attention networks tailored to the motion part encoding structure. Combined with the body-part strategy, the model enables more precise full-body (including hand skeleton) motion retargeting. A posture-aware attention network dynamically predicts the joint weights within each body part, and a shared latent space is constructed for each body part via feature aggregation, which improves flexibility and adaptability. Experimental results show that the proposed model is stable, accurate, and scalable in motion retargeting.
Total references:	42
Total figures:	18
Total tables:	6
Call number:	本080901/24024
Open-access date:	2025-06-13
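
The abstract describes predicting per-joint weights within each body part and aggregating the joint features into a shared latent space, so that skeletons with different joint counts can be compared. This record does not give the thesis's actual architecture, so the following is only a minimal sketch of that general idea, assuming scaled dot-product attention pooling. The names `pool_body_part` and `QUERY`, the feature dimension, and all numeric values are hypothetical and chosen for illustration only.

```python
import math


def softmax(scores):
    """Numerically stable softmax over a list of floats."""
    m = max(scores)
    exps = [math.exp(s - m) for s in scores]
    total = sum(exps)
    return [e / total for e in exps]


def pool_body_part(joint_features, query):
    """Attention-pool any number of joint feature vectors into one vector.

    The weights depend on the joint features themselves (and so on the
    current pose), and the output size depends only on the feature
    dimension, not on how many joints the body part contains.
    """
    dim = len(query)
    # One scaled dot-product score per joint.
    scores = [
        sum(q * f for q, f in zip(query, feat)) / math.sqrt(dim)
        for feat in joint_features
    ]
    weights = softmax(scores)
    # Weighted sum of joint features gives the body-part latent vector.
    latent = [
        sum(w * feat[d] for w, feat in zip(weights, joint_features))
        for d in range(dim)
    ]
    return latent, weights


# Hypothetical query vector; in a trained model this would be learned.
QUERY = [0.5, -0.2, 0.1, 0.3]

# Hypothetical "arm" body part on two skeletons with different joint counts.
arm_a = [[0.1, 0.2, 0.3, 0.4], [0.0, 0.5, 0.1, 0.2], [0.3, 0.1, 0.0, 0.6]]
arm_b = [[0.2, 0.1, 0.3, 0.3], [0.1, 0.4, 0.2, 0.2], [0.0, 0.3, 0.1, 0.5],
         [0.4, 0.0, 0.2, 0.1], [0.3, 0.2, 0.1, 0.4]]

latent_a, weights_a = pool_body_part(arm_a, QUERY)
latent_b, weights_b = pool_body_part(arm_b, QUERY)

# Both latents have the same length despite 3 vs. 5 joints.
print(len(latent_a), len(latent_b))
```

Because the pooled vector has a fixed size, the same body part on a 3-joint and a 5-joint skeleton maps to vectors of equal length, which is what makes a shared per-part latent space possible in this simplified setting.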
