Thesis Information

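Chinese title:	Human Pose Estimation Based on Weak Supervision (基于弱监督的人体姿态估计)
Name:	鲍习照
Confidentiality level:	Public
Thesis language:	Chinese
Discipline code:	081203
Discipline:	Computer Application Technology
Student type:	Master's
Degree:	Master of Engineering
Degree type:	Academic degree
Degree year:	2022
Campus:	Beijing campus
School:	School of Artificial Intelligence
Research direction:	Human pose estimation
First supervisor name:	胡晓雁
First supervisor affiliation:	School of Artificial Intelligence, Beijing Normal University
Submission date:	2022-06-09
Defense date:	2022-06-09
Foreign-language title:	HUMAN POSE ESTIMATION BASED
Chinese keywords:	human pose estimation; clothing estimation; weak supervision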
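Chinese abstract (translated):	Human pose estimation is a very active research direction in computer vision, with broad prospects for development and application. In recent years, with the rapid development of deep learning, human pose estimation algorithms have improved markedly in both accuracy and time cost. At the same time, estimating the pose of clothed humans has become a difficult and highly challenging task. First, factors such as clothing occlusion make the estimated body shape and pose inaccurate. Second, the wide variety of garments and their complex deformations make clothing hard to estimate. Finally, large-scale datasets for this task are scarce: because clothing data is difficult and costly to capture, existing clothing datasets tend to be small and to cover few garment types, which is a further obstacle for deep learning methods that need large amounts of training data. This thesis proposes a multi-stage, weakly supervised human pose estimation method that makes full use of sparsely annotated data to estimate body shape, pose, and clothing deformation. The first stage regresses the parameters of the SMPL human body model from multi-view 2D human keypoints. On the one hand, using multi-view information as weak supervision avoids the depth ambiguity of a single view and yields a fairly accurate pose; on the other hand, compared with methods such as direct regression on 3D keypoints, the multi-view approach uses only a small amount of supervision that is relatively easy to obtain. In the second stage, building on the fairly accurate body shape and pose estimates, the thesis, after comparison and selection, represents garments with a PCA-based clothing model and regresses that model's parameters using 2D clothing keypoints as weak supervision, giving a coarse clothing estimate. In the third stage, an embedded graph is predefined for each garment category to describe its deformation, and clothing mask information is used to further adjust the deformation on top of the coarse estimate from the second stage. To facilitate training, the thesis also constructs a multi-view synthetic dataset. Experiments on this synthetic dataset and on the BCNet dataset show that the method reaches the same level of accuracy as existing methods that use strong supervision. Because it uses only weak supervision that is relatively easy to obtain, it has a considerable advantage in terms of training data. Experiments on the DeepFashion2 dataset show that, on datasets with scarce supervision, the method can be fine-tuned using whatever weak supervision is available, and it outperforms strongly supervised methods that cannot be trained or adjusted there for lack of rich annotations, which gives the work research significance and practical value.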
Foreign-language abstract:	In computer vision, human pose estimation is a very popular research direction with broad prospects for development and application. In recent years, with the rapid development of deep learning, human pose estimation algorithms have improved significantly in both accuracy and time cost. At the same time, pose estimation for clothed humans has become a difficult and challenging task. First, the estimation of body shape and pose is itself inaccurate because of clothing occlusion and other factors. Second, the wide variety and complex deformation of clothing make garments very difficult to estimate. Finally, large-scale datasets for this task are scarce. Because clothing data is difficult and costly to collect, existing clothing datasets are often small and contain few types of clothing, which is a further difficulty for deep learning methods that require large amounts of training data. This thesis proposes a multi-stage human pose estimation method based on weak supervision, which makes full use of data with little annotation to estimate human body shape, pose, and clothing deformation. In the first stage, the parameters of the SMPL human model are regressed from multi-view two-dimensional keypoints of the human body. On the one hand, using multi-view information as weak supervision avoids the depth ambiguity of a single view and yields a more accurate human pose. On the other hand, compared with methods such as direct regression on three-dimensional keypoints, the multi-view method uses only a small amount of supervision that is easier to obtain.
In the second stage, building on the more accurate estimates of body shape and pose, and after comparison and selection, the thesis represents clothing with a PCA-based clothing model and uses two-dimensional clothing keypoints as weak supervision to regress the model's parameters, giving a rough estimate of the clothing. In the third stage, an embedded graph is predefined for each kind of clothing to describe its deformation. Starting from the rough clothing estimate of the second stage, clothing mask information is used to further adjust the deformation. To facilitate training, the thesis also constructs a multi-view synthetic dataset. Experiments on the synthetic dataset and the BCNet dataset show that the method matches the accuracy of existing methods that use strong supervision. Because it uses only weak supervision that is relatively easy to obtain, it has a clear advantage in training data. Experiments on the DeepFashion2 dataset show that the proposed method can be fine-tuned on datasets with little supervision by making full use of the weak supervision that is available. Compared with strongly supervised methods, which cannot be trained or adjusted there for lack of rich annotations, it achieves better results and has research significance and application value.
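Illustrative note (not part of the catalog record): the abstract's claim that multi-view 2D keypoints remove single-view depth ambiguity can be sketched as a reprojection loss. The sketch below is an assumption-laden illustration and not the thesis's implementation. It assumes a pinhole camera model, takes 3D joints directly instead of deriving them from SMPL parameters, and uses hypothetical names (`project`, `multiview_keypoint_loss`).

```python
import numpy as np

def project(points_3d, K, R, t):
    """Pinhole projection (an assumed camera model) of (J, 3) world points to (J, 2) pixels."""
    cam = points_3d @ R.T + t           # world -> camera coordinates
    uvw = cam @ K.T                     # camera -> homogeneous pixel coordinates
    return uvw[:, :2] / uvw[:, 2:3]     # perspective divide

def multiview_keypoint_loss(joints_3d, keypoints_2d, cameras):
    """Mean squared 2D reprojection error, averaged over all views (hypothetical loss)."""
    per_view = []
    for (K, R, t), kp in zip(cameras, keypoints_2d):
        diff = project(joints_3d, K, R, t) - kp
        per_view.append(np.mean(np.sum(diff ** 2, axis=1)))
    return float(np.mean(per_view))

# Two assumed cameras: a frontal view at the origin and a side view rotated 90 degrees about y.
K = np.array([[500.0, 0.0, 320.0], [0.0, 500.0, 240.0], [0.0, 0.0, 1.0]])
front = (K, np.eye(3), np.zeros(3))
side = (K, np.array([[0.0, 0.0, 1.0], [0.0, 1.0, 0.0], [-1.0, 0.0, 0.0]]),
        np.array([-3.0, 0.0, 3.0]))

# Made-up 3D joints and their 2D observations in both views.
joints = np.array([[0.0, 0.0, 3.0], [0.2, -0.5, 3.1], [-0.3, 0.4, 2.9]])
observed = [project(joints, *front), project(joints, *side)]

# Scaling the joints along the frontal camera's viewing rays is invisible to that camera alone,
# but the side view penalizes it.
wrong_depth = 1.5 * joints
single_view = multiview_keypoint_loss(wrong_depth, observed[:1], [front])
two_views = multiview_keypoint_loss(wrong_depth, observed, [front, side])
print(single_view, two_views)
```

With only the frontal view, the wrongly scaled joints reproject onto the same pixels and the loss stays at essentially zero; adding the side view makes the loss large, which is the sense in which multi-view 2D keypoints act as weak but sufficient supervision for 3D pose.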
Total references:	60
Collection number:	硕081203/22004
Open-access date:	2023-06-09
