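Chinese title: | 基于本体和关联数据的引文分析方法研究 |
Name: | |
Confidentiality level: | Public |
Thesis language: | Chinese |
Discipline code: | 120501 |
Discipline/major: | |
Student type: | Master's |
Degree: | Master of Management |
Degree type: | |
Degree year: | 2018 |
Campus: | |
School: | |
Research direction: | Semantic Web, informetrics |
First supervisor name: | |
First supervisor affiliation: | |
Submission date: | 2018-06-10 |
Defense date: | 2018-05-25 |
Foreign-language title: | Research on the Citation Analysis Method Based on Ontology and Linked Data |
Chinese keywords: | |
Chinese abstract: |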
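With the rapid development of science and technology, the volume of scientific literature is growing quickly, and it is increasingly important to grasp the development trends and evolution of a subject field quickly and comprehensively. Soon after citation analysis was proposed, it became an established method for analyzing subject fields. After decades of development, citation analysis has made progress in both theory and practice and is now widely applied to the evaluation of scientific knowledge, the discovery of patterns of scientific development, and the detection of research fronts, and it is of great significance for scientific and technological innovation and research decision-making. However, traditional citation analysis methods and tools rely too heavily on citation databases, which leads to the following problems in the analysis process: all citations are treated as equally important; all statistical indicators are based on citation counts as marked by the authors of the papers themselves; and the analysis can reveal only whether one document cites another, not the deeper semantic relationships behind the citation. Against this background, researchers have analyzed authors' citation motivations and citation behavior from different angles, and content-based citation analysis was subsequently proposed, which starts from the citation content of the citing and cited documents and attempts to analyze citation functions and motivations at a deeper level.
To address these problems in traditional citation analysis, this thesis proposes a citation analysis method based on ontology and linked data. The method uses ontology and linked data technologies to describe citation data in a standardized way and uses SPARQL queries to extract citation data along specific dimensions, enabling citation analysis from a new perspective and thereby overcoming the shortcomings of traditional citation analysis methods.
This thesis comprises three parts: theory, method, and experiment. The theoretical part introduces citation analysis and full-text citation analysis, the open citation initiative and the open citation corpus, ontology and linked data, and citation ontologies and citation linked data, laying the theoretical foundation for the proposed method. The method part proposes the citation analysis method based on ontology and linked data, whose technical route consists of three steps: constructing the citation ontology, publishing the citation linked data, and semantically querying the citation data. From the perspectives of bibliographic citation data and full-text citation information, respectively, the thesis constructs the Bibliographic Citation Ontology (BCO) and the Full-text Citation Ontology (FCO), publishes citation linked data, and builds SPARQL queries to extract citation data and present the query results visually. In the experimental part, the thesis selects academic literature in the field of citation analysis from CSSCI and Web of Science as the base datasets and, on this basis, carries out citation analysis experiments from the bibliographic-data perspective and the full-text-information perspective, achieving the intended analysis goals.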
|
Foreign-language abstract: |
With the rapid development of science and technology, the volume of scientific literature has grown rapidly, and it is increasingly important to grasp the development and evolution of a research field quickly and comprehensively. Since citation analysis was first proposed, it has become a widely used method for analyzing subject areas. After decades of development, citation analysis has made progress both theoretically and practically and plays an important role in scientific knowledge evaluation, the discovery of patterns of scientific development, and research front detection. However, traditional citation analysis methods and tools rely heavily on citation databases. As a result, the following problems arise in the analysis process: (1) All citations are treated as equally important. (2) All statistical indicators are based on citation counts as marked by the authors of the papers themselves. (3) The analysis can show only whether one paper cites another and cannot reveal deeper semantic relationships between citations. In this context, researchers have analyzed citation motivation and citation behavior from different angles. Content-based citation analysis has also been proposed, which starts from the citation content to analyze the deeper relationship between the citing paper and the cited paper.
To address these problems, this thesis proposes a citation analysis method based on ontology and linked data. The method uses ontology and linked data technologies to describe citation data in a standardized way and uses SPARQL queries to extract citation data along a specific dimension, thereby enabling citation analysis from a new perspective.
Specifically, the study consists of the following three parts. (1) Theoretical research. This part introduces the development of citation analysis and full-text citation analysis, the open citation initiative and the open citation corpus, ontology and linked data, and citation ontologies and citation linked data. These four areas of background knowledge lay the theoretical foundation for the citation analysis method based on ontology and linked data. (2) Method research. This part proposes the citation analysis method based on ontology and linked data. The method involves three key technical steps: constructing the citation ontology, publishing the citation linked data, and semantically querying the citation data. The thesis constructs a Bibliographic Citation Ontology (BCO) and a Full-text Citation Ontology (FCO) from the perspectives of bibliographic citation data and full-text citation information, respectively. It then publishes the citation linked data and constructs SPARQL queries to extract the citation data and present the results visually. (3) Experimental research. The experimental part selects academic literature in the field of citation analysis from CSSCI and Web of Science as the base datasets. On this basis, two experiments were conducted separately: citation analysis from the perspective of bibliographic data and citation analysis from the perspective of full-text information. The experiments achieved the intended analysis goals.
|
Total number of references: | 64 |
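Author biography: | Publications: 1. Shi Zeshun, Xiao Ming. Practice of semantic relationship discovery in library and information science linked data based on RelFinder [J]. Library and Information Service (图书情报工作), 2017, 61(17): 139-148. (First author; CSSCI core journal) 2. Shi Zeshun, Xiao Ming. Research on constructing a SKOS thesaurus for library and information science based on PoolParty [J]. Research on Library Science (图书馆学研究), 2017, (23): 20-30. (First author; CSSCI core journal) 3. Shi Zeshun, Xiao Ming. Research on SKOS construction and visualization for library and information science based on online thesauri [J]. Journal of the China Society for Scientific and Technical Information (情报学报). (First author; CSSCI core journal) 4. Shi Zeshun, Xiao Ming. Research on citation path identification and citation network visualization based on citation linked data [J]. Information Science (情报科学). (First author; CSSCI core journal) 5. Shi Zeshun, Sun Boyang. Research on CORAL, an open-source electronic resource management system [J]. Library and Information Service (图书情报工作), 2017(4): 130-137. (First author; CSSCI core journal) 6. Shi, Z. S. & Xiao, M. New Citation Analysis Perspective Based on Ontology and Linked Data [C]// Proceedings of the 16th International Conference on Scientometrics and Informetrics. ISSI, 2017: 1598–1599. (First author; international conference) Project participation: 1. National Social Science Fund of China project "Research on the Theory, Methods, and Applications of Citation Analysis Based on Semantic Recognition" (Grant No. 16BTQ073): responsible for designing and carrying out the project's core experiments, with several project results published. 2. National Social Science Fund of China project "Empirical Research on Knowledge Maps of Library and Information Science Based on Multi-Method Integration" (Grant No. 11BTQ019): responsible for writing and organizing the project report. 3. National Social Science Fund of China major project "Compilation of the Complete Works of Wang Zhongmin" (Grant No. 17ZDA296): responsible for collecting materials and conducting surveys in the early stage of the project. |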
Call number: | 硕120501/18002 |
Open-access date: | 2019-07-09 |