Thesis information


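Chinese title:	基于子图上下文联系的注视点转移模型
Name:	Li Xiang (李想)
Confidentiality level:	Public
Thesis language:	Chinese
Discipline code:	081001
Discipline:	Communication and Information Systems
Student type:	Master's
Degree:	Master of Engineering
Degree type:	Academic degree
Degree year:	2019
Campus:	Beijing campus
College:	College of Information Science and Technology
Research direction:	Image and speech signal processing
Primary advisor:	Zhang Jiacai (张家才)
Primary advisor's affiliation:	College of Information Science and Technology, Beijing Normal University
Submission date:	2019-06-05
Defense date:	2019-05-31
Foreign-language title:	Modeling of Gaze Shifting Path using Context Linkage in Subgraph
Keywords (translated from Chinese):	visual attention; scanpath; subgraph saliency; path similarity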
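Abstract (translated from Chinese):	The visual attention system automatically selects the most important parts of a large volume of redundant information for further processing by the brain, which lets the human brain process large amounts of visual data in a very short time. Overt visual attention manifests as fixations and scanpaths. Fixations indicate which parts of a scene people attend to, while scanpaths indicate how people observe the parts of the scene that interest them. Related studies show that shifts of visual attention are guided jointly by bottom-up and top-down mechanisms. Existing work has modeled these two mechanisms to build visual attention shift models, which are widely applied in areas such as assisted driving, intersection traffic monitoring, and automated logistics. Research on visual attention shift models nevertheless still faces several difficulties. The first is individual differences in scanpaths: different individuals produce different scanpaths in the same scene, and how to uncover the regularities shared across subjects remains an open question. The second is the limitation of current modeling: most existing models rely on the bottom-up mechanism, yet low-level image features alone cannot predict the salient regions of an image well, so modeling the top-down mechanism and integrating it with existing models remains a research focus. In addition, existing models ignore the order information inherent in scanpaths, although studies show that this information plays a non-negligible role in guiding visual attention, and how to model it remains to be explored. The third is objective evaluation: most existing scanpath metrics consider only one or two of the three factors position, duration, and order. The only algorithm that considers all three is ScanMatch, but because it requires partitioning the image into regions, it measures incorrectly in some scenes, so there is still no good solution for evaluating visual attention shift models comprehensively and objectively. To address these problems, the thesis makes three contributions. 1) A study of the differences and commonalities among the scanpaths of different individuals. This part examines individual differences and commonalities in scanpath data and, on that basis, proposes the temporal saliency map of an image as a representation of scanpath data. Comparative experiments show that this data structure reflects regularities shared across individuals more effectively than a static saliency map, which makes it possible for a model to learn those regularities from data. 2) A study of the attention shift model. This part divides the visual attention shift model into the modeling of subgraph saliency and the modeling of transition rules between subgraphs. The former supplies the saliency of each subgraph itself, while the latter accounts for the contextual links between subgraphs. High-level and low-level subgraph features serve as the inputs to the two learned models, so that the resulting model integrates the bottom-up and top-down mechanisms. Several groups of experiments demonstrate the effectiveness of the proposed model. 3) A study of similarity measures between scanpaths. To evaluate attention shift models more objectively, this part analyzes the shortcomings of existing scanpath similarity measures and proposes a new measure that improves on the edit distance algorithm. Experiments on simulated and real data verify the effectiveness of the new measure. In summary, this research explores the regularities of visual attention shifts and proposes a new representation of scanpath data that effectively extracts the regularities common to scanpaths. Splitting the visual attention shift model into a subgraph saliency model and an inter-subgraph transition model lets the model account for contextual links between subgraphs while integrating the bottom-up and top-down mechanisms. Finally, the new scanpath measure based on an improved edit distance algorithm measures the similarity between scanpaths more accurately and provides a new tool for evaluating models objectively and for studying scanpath regularities.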
Foreign-language abstract:	The visual attention system automatically selects the most important parts of a large amount of redundant information for further processing by the cerebral cortex, allowing the human brain to process large amounts of visual data in a very short time. Overt visual attention can be expressed as gaze points and scanpaths. Gaze points indicate which parts of a scene people attend to, and the scanpath shows how people observe the parts of the scene that interest them. Related studies have shown that visual attention transfer is accomplished under the joint guidance of bottom-up and top-down mechanisms. Existing research has built visual attention transfer models by modeling these two mechanisms, and such models have been widely used in fields such as assisted driving, intersection flow monitoring, and automated logistics. However, research on visual attention transfer models still has many difficulties to overcome. First, scanpaths differ between individuals: the actual scanpaths of different individuals viewing the same scene are not the same, and how to extract the regularities common to subjects' scanpaths is still an open problem. Second, visual attention transfer modeling has limitations: existing models are mostly based on the bottom-up mechanism, and low-level image features alone fail to predict the salient regions of an image well, so how to model the top-down mechanism and integrate it with existing models is still a research focus. At the same time, existing models ignore the order information inherent in the scanpath, although studies have shown that it plays an important role in guiding visual attention, and how to model it remains to be explored. Finally, evaluating the performance of a visual attention transfer model is a great challenge: existing scanpath metrics mostly consider only one or two of the factors position, duration, and order. The only algorithm that considers all three at the same time is ScanMatch, but because it needs to divide the image into regions, it measures incorrectly in some cases, so a more comprehensive and objective evaluation of visual attention transfer models still lacks a good solution. Based on the above issues, the main work of the thesis includes: a) A study of the differences and commonalities between the scanpaths of different individuals. This part of the research first focuses on the individual differences and commonalities in scanpath data and, on this basis, proposes the time-series saliency map of an image as the representation of scanpath data. Comparative experiments show that this data structure reflects the commonality between individuals more effectively than the static saliency map, which makes it possible for a model to learn regularities from the data. b) A study of the visual attention transfer model. This part of the study divides the visual attention transfer model into two parts: subgraph saliency modeling and modeling of the transfer rules between subgraphs. The former provides the saliency information of the subgraph itself, while the latter considers the contextual relationship between subgraphs. The high-level and low-level features of the subgraphs are used as input to the two learning models, so that the visual attention transfer model of this thesis integrates the bottom-up and top-down mechanisms. Several groups of experiments demonstrate the validity of the proposed model. c) Research on similarity measurement methods between scanpaths. To evaluate visual attention transfer models more objectively, this part of the research analyzes the shortcomings of existing scanpath similarity measures and proposes a new measure that improves on the edit distance algorithm. Experimental results on simulated data and real data validate the effectiveness of the new metric. In this study, the regularities of visual attention transfer were explored, and a new representation of scanpath data was proposed that effectively extracts the regularities common to scanpaths. By splitting the visual attention transfer model into a subgraph saliency model and a subgraph transfer model, the model considers the context linkage between subgraphs while integrating the bottom-up and top-down mechanisms. Finally, this thesis proposes a new scanpath metric based on an improved edit distance algorithm, which measures the similarity between scanpaths more accurately and provides a new tool for objective model evaluation and for research on scanpaths.
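The abstract says the proposed scanpath measure improves on the edit distance algorithm, but this record does not describe the improvement itself. As a rough illustration of the unmodified baseline only, the sketch below encodes each fixation as a grid-cell label and compares two label sequences with plain Levenshtein distance. The function names, the 4 x 4 grid, and the normalization by the longer sequence are all assumptions made for this example, not details taken from the thesis.

```python
from typing import Hashable, List, Sequence, Tuple


def to_grid_labels(fixations: Sequence[Tuple[float, float]],
                   width: float, height: float,
                   cols: int = 4, rows: int = 4) -> List[int]:
    """Map (x, y) fixations to grid-cell indices (hypothetical encoding)."""
    labels = []
    for x, y in fixations:
        # Clamp so that points on the right/bottom edge fall in the last cell.
        col = min(int(x / width * cols), cols - 1)
        row = min(int(y / height * rows), rows - 1)
        labels.append(row * cols + col)
    return labels


def edit_distance(a: Sequence[Hashable], b: Sequence[Hashable]) -> int:
    """Plain Levenshtein distance: insertions, deletions, substitutions."""
    prev = list(range(len(b) + 1))  # distances from the empty prefix of a
    for i, x in enumerate(a, start=1):
        curr = [i]
        for j, y in enumerate(b, start=1):
            cost = 0 if x == y else 1
            curr.append(min(prev[j] + 1,          # delete x
                            curr[j - 1] + 1,      # insert y
                            prev[j - 1] + cost))  # substitute or match
        prev = curr
    return prev[-1]


def scanpath_similarity(a: Sequence[Hashable], b: Sequence[Hashable]) -> float:
    """Similarity in [0, 1]: 1 minus distance over the longer length (assumed)."""
    longest = max(len(a), len(b))
    if longest == 0:
        return 1.0
    return 1.0 - edit_distance(a, b) / longest


if __name__ == "__main__":
    path_a = to_grid_labels([(10, 10), (60, 10), (90, 90)], 100, 100)
    path_b = to_grid_labels([(12, 8), (90, 90)], 100, 100)
    print(path_a, path_b, scanpath_similarity(path_a, path_b))
```

This baseline uses only fixation order and coarse position. It ignores fixation duration and depends on how the image is partitioned, which are the kinds of limitations the abstract attributes to existing measures.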
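Total references:	75
Author biography:	Li Xiang (李想), male, graduate student at the College of Information Science and Technology, Beijing Normal University. Published paper: Xiang Li, Jiacai Zhang*. A Novel Method for Scanpath Comparison based on Levenshtein Distance[C]. International Conference on Signal Processing and Information Communications (ICSPIC). 2019. DOI: 10.18178/ijsps
Library holding number:	硕081001/19007
Release date:	2020-07-09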
