Thesis information


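Chinese title:	海量矢量图斑相似性快速度量方法研究
Author:	Ding Kesong (丁克松)
Confidentiality level:	Public
Thesis language:	Chinese
Subject code:	081603
Discipline:	Cartography and Geographic Information Engineering
Student type:	Master's
Degree:	Master of Engineering
Degree type:	Academic degree
Degree year:	2019
Campus:	Beijing campus
School:	Faculty of Geographical Science
Research direction:	Smart city, smart land management
First advisor:	Yue Jianwei (岳建伟)
First advisor's affiliation:	Faculty of Geographical Science, Beijing Normal University
Submission date:	2019-06-12
Defense date:	2019-05-22
English title:	Research On Fast Similarity Measurement Method For Mass Vector Graphics
Keywords (translated from Chinese):	patch similarity; CDB histogram; hybrid index; F-histogram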
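Chinese abstract (English translation):	Humans understand the world by comparing the similarity between new things and prior cognitive experience. In map data, whether the data has changed is judged by comparing the similarity of the shapes and attribute data of map patches. With advances in surveying and mapping technology, the resolution and accuracy of geospatial data keep rising, and data coverage and volume keep growing. Nationwide geospatial data of all types and uses has reached massive scale, and the need for efficient and reliable updating of spatial data is increasingly urgent. Data updating and data fusion usually require judging the similarity between patches. In addition, land-administration tasks such as overlaying land-change survey information and analyzing illegal land use also rely on patch similarity to determine whether a parcel has changed and whether an approval is lawful. Existing patch similarity measures suffer from heavy manual intervention, low accuracy and low efficiency, and cannot meet the needs of the relevant administrative departments for similarity comparison of massive vector patches. Studying how to compute the similarity between massive vector patch datasets quickly, accurately and effectively is therefore of great significance.
This thesis aims at fast and accurate similarity measurement for massive vector patches. It studies a method for quickly obtaining candidate matching sets based on a hybrid index, and a fast similarity measurement method for vector patches based on the centroid ray distance histogram (Histogram Of Centroid Distance To Boundary, hereafter the CDB histogram), providing technical support for fast and accurate similarity measurement of massive vector patch data. The main research content and results are as follows:
(1) A method for quickly obtaining candidate matching sets based on a hybrid index model is proposed. Based on the characteristics of geographic patch data and the strengths of different spatial indexes, a three-level hybrid index model was designed, consisting of an administrative-region index, a coarse grid index and an R-tree index. The model was used to retrieve candidate matching sets quickly and was compared experimentally with other indexing methods. On data at the scale of 100,000 records, intersection queries with the hybrid index model were about 82 s (roughly 3 times) faster than with no index, about 36 s (roughly 1.9 times) faster than with a grid index, and about 30 s (roughly 1.7 times) faster than with a GiST (R-tree) index, which verifies the validity and efficiency of the hybrid index.
(2) A fast similarity measurement method for vector patches based on the CDB histogram is proposed. To improve the efficiency and accuracy of patch similarity measurement, the method builds on the F-histogram-based measure of geometric shape similarity. It can quickly and accurately determine whether two patches are similar and whether they are related by an approximate translation, rotation or scaling. The accuracy and computational efficiency of the CDB histogram method, the F-histogram method and the spatial data geometric shape similarity model (SDSM) were compared experimentally. On the experimental data, the average accuracy of the CDB histogram method was 97%, which is 12% higher than the F-histogram method and 5% higher than SDSM. On the more than 70,000 matched patches in the experimental data, the CDB histogram method took about 13 ms on average to compute the similarity of a single patch, about 1.6% of the time taken by the F-histogram method and about 91.8% of that taken by SDSM, so it is more efficient than both. The experiments show that the method can measure the similarity of vector patches quickly and accurately.
(3) A fast similarity measurement system for massive vector patches was developed. The system was designed and built with the C# programming language and ArcGIS Engine GIS component development technology, combined with parallel computing. The results of different similarity measures on current land-use and basic farmland patch data were compared and analyzed, confirming that the approach measures the similarity of massive vector patches quickly and accurately.
Starting from the need for fast and accurate similarity measurement of massive vector patches, this thesis proposes a method for quickly obtaining candidate matching sets based on a hybrid index model and a fast and accurate similarity measurement method for vector patches based on the CDB histogram. The work has a degree of novelty and is of significance for fast and accurate similarity measurement of massive vector patches.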
English abstract:	Humans recognize the world by comparing the similarity between new things and prior cognitive experience. In map data, whether the data has changed is judged by comparing the similarity of the shapes and attribute data of map patches. With the development of surveying and mapping technology, the resolution and accuracy of geospatial data keep rising, data coverage and volume keep growing, and geospatial data of all types and uses has reached massive scale. The need for efficient and reliable updating of spatial data is becoming more and more urgent. The same area is often covered by multiple vector datasets of different uses, types and scales. To avoid repeated data collection and mapping of the same area, it is often necessary to integrate the existing vector datasets to obtain data products that meet demand, which improves the reuse rate of existing data and reduces the cost and time of data acquisition. Data updating and data fusion usually require judging the similarity between patches. In addition, land-administration tasks such as overlaying land-change survey information and analyzing illegal land use also need patch similarity to judge whether parcels have changed and whether an approval is lawful. However, existing patch similarity measures suffer from heavy manual intervention, low accuracy and low efficiency, and cannot meet the needs of the relevant administrative departments for large-scale similarity comparison of vector patches. It is therefore important to calculate the degree of similarity between massive vector patch datasets quickly and accurately.
Aiming at fast and accurate similarity measurement of massive vector patches, this thesis studies a method for quickly obtaining candidate matching sets based on a hybrid index, and a fast similarity measurement method for vector patches based on the CDB histogram (Histogram Of Centroid Distance To Boundary). The main results are as follows: (1) A method for quickly obtaining candidate matching sets based on a hybrid index model. According to the characteristics of geographic patch data and the advantages of different spatial indexes, a three-level hybrid index model was designed, based on an administrative-region index, a coarse grid index and an R-tree index. The index model was used to retrieve candidate matching sets quickly and was compared with other indexing methods. On data at the scale of 100,000 records, the experimental results show that the hybrid index model is about 82 s (roughly 3 times) faster than no index, about 36 s (roughly 1.9 times) faster than a grid index, and about 30 s (roughly 1.7 times) faster than a GiST (R-tree) index, which verifies the validity and efficiency of the hybrid index. (2) A fast similarity measurement method for vector patches based on the CDB histogram. Building on the F-histogram-based measure of geometric shape similarity, the thesis proposes a patch similarity measure based on the CDB histogram, which can quickly and accurately judge whether two vector shapes are similar and whether they are related by an approximate translation, rotation or scaling. The accuracy and computational efficiency of the CDB histogram method, the F-histogram method and SDSM were compared. On the experimental data, the average accuracy of the CDB histogram method is 97%, which is 12% higher than the F-histogram method and 5% higher than SDSM. On more than 70,000 matched patches in the experimental data, the CDB histogram method took about 13 ms on average to compute the similarity of a single matched patch, about 1.6% of the time taken by the F-histogram method and 91.8% of that taken by SDSM.
(3) A fast similarity measurement system for massive vector patches. The system was designed and developed with the C# programming language and the mature ArcGIS Engine GIS component development technology, combined with parallel computing. It provides vector data management, data index management, fast patch similarity calculation, result query and analysis, and charting and output. To measure the similarity of massive vector patches quickly and accurately, the thesis proposes a method for quickly obtaining candidate matching sets based on a hybrid index model and a fast similarity measurement method for vector patches based on the CDB histogram. This contribution is of significance for fast and accurate similarity measurement of massive vector patches.
Total references:	0
Call number:	硕081603/19005
Release date:	2020-07-09
