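Thesis information

Chinese title:

 基于知识图谱的学习资源分析研究

Name:

 秦东艺

Confidentiality level:

 Public

Thesis language:

 Chinese

Discipline code:

 081202

Discipline:

 Computer Software and Theory

Student type:

 Master's

Degree:

 Master of Engineering

Degree type:

 Academic degree

Degree year:

 2022

Campus:

 Beijing campus

School:

 School of Artificial Intelligence

Research direction:

 Knowledge engineering

First supervisor name:

 王志春

First supervisor affiliation:

 School of Artificial Intelligence, Beijing Normal University

Submission date:

 2022-06-07

Defense date:

 2022-06-02

Foreign-language title:

 Analysis of Learning Resources Based on Knowledge Graph

Chinese keywords:

 知识图谱 ; 学习资源 ; 个性化学习 ; 知识增强 ; 语言预训练模型

Foreign-language keywords:

 Knowledge Graph ; Learning Resources ; Personalized Learning ; Knowledge Enhancement ; Pre-trained Language Model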

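Chinese abstract (English translation):

The Internet hosts a wealth of learning resources that are of great importance to online learning. To realize the full value of these resources, their knowledge points and difficulty must be analyzed and annotated so that learners can obtain the resources they need. Currently, online learning platforms analyze learning resources mainly through manual annotation, which is inefficient and costly when the number of resources is massive. This thesis therefore studies techniques for the automatic analysis of learning resources and proposes knowledge-graph-based models for knowledge point linking and difficulty prediction. The research content and contributions are summarized as follows:

1) For the three domains of programming, mathematics, and biology, a knowledge point linking model for learning resources is proposed, based on a knowledge graph and a pre-trained language model. Existing work on knowledge point linking falls into three groups: rule-based keyword matching, which requires human experts to spend considerable effort designing rules; machine-learning-based text classification, which ignores the semantic information of the knowledge points themselves; and semantic matching between learning resources and knowledge points, which does not bring in the domain knowledge contained in the learning resources and therefore has low accuracy. To address these problems, the proposed model reaches Hits@1 scores of 0.982, 0.968, and 0.928 on real-world datasets in programming, mathematics, and biology, respectively, the best results when compared with methods that do not use a domain knowledge graph.

2) For the programming domain, a difficulty prediction model for learning resources is proposed, based on a knowledge graph and an attention mechanism. Existing approaches to difficulty prediction either have human experts label difficulty by hand or assign difficulty with machine-learning-based text classification that uses only the semantic information of the learning resource itself. These approaches are inefficient and their accuracy is low, because the text of a learning resource contains not only semantic information but also rich domain knowledge. To address these problems, the proposed model reaches accuracies of 0.799, 0.745, and 0.796 on three real-world datasets in the programming domain, the best results when compared with methods that use neither a domain knowledge graph nor an attention mechanism.

The distinguishing feature of this research is that it introduces domain knowledge graphs and combines them with pre-trained language models and deep learning to analyze learning resources. This approach produces rich semantic representations of learning resources and acts as a form of knowledge enhancement, providing an efficient and feasible method for the automatic analysis of learning resources.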

Foreign-language abstract:

The Internet hosts abundant learning resources, which are of great value for online learning. To make full use of these resources, their knowledge points and difficulty need to be analyzed and labeled so that learners can find the resources they need. At present, online learning platforms rely mainly on manual annotation to analyze learning resources, which is inefficient and costly when the volume of resources is large. To this end, this thesis studies techniques for the automatic analysis of learning resources and proposes knowledge-graph-based models for knowledge point linking and difficulty prediction. The research content and contributions of this thesis are summarized as follows:

1) For the three domains of programming, mathematics, and biology, a knowledge point linking model for learning resources based on a knowledge graph and a pre-trained language model is proposed. Existing work on knowledge point linking includes rule-based keyword matching, which requires human experts to spend considerable effort designing rules; machine-learning-based text classification, which ignores the semantic information of the knowledge points themselves; and semantic matching between learning resources and knowledge points, which does not incorporate the domain knowledge contained in the learning resources and therefore has low accuracy. To address these problems, the proposed model achieves Hits@1 scores of 0.982, 0.968, and 0.928 on real-world datasets in programming, mathematics, and biology, respectively. Compared with methods that do not incorporate a domain knowledge graph, our method achieves the best results.

2) For the programming domain, a difficulty prediction model for learning resources based on a knowledge graph and an attention mechanism is proposed. Existing methods for difficulty prediction either rely on human experts to label difficulty manually or use machine-learning-based text classification over the semantic information of the learning resources themselves to assign a difficulty level. These methods are inefficient and their accuracy is low, because the text of a learning resource contains not only semantic information but also rich domain knowledge. To address these problems, the proposed model achieves accuracies of 0.799, 0.745, and 0.796 on three real-world datasets in the programming domain. Compared with methods that use neither a domain knowledge graph nor an attention mechanism, our model achieves the best results.

The distinguishing feature of this research is that it introduces domain knowledge graphs and combines them with pre-trained language models and deep learning to analyze learning resources. This approach yields rich semantic representations of learning resources and provides knowledge enhancement, offering an efficient and feasible way to analyze learning resources automatically.
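The Hits@1 and accuracy figures reported in the abstract can be illustrated with a small sketch. The abstract does not include the thesis's evaluation code, so the function names `hits_at_k` and `accuracy`, the sample knowledge points, and the difficulty labels below are all hypothetical. The sketch only shows the standard definitions: Hits@k is the fraction of resources whose correct knowledge point appears among the top k ranked candidates, and accuracy is the fraction of resources whose predicted difficulty label matches the annotated one.

```python
from typing import Hashable, Sequence


def hits_at_k(
    ranked_candidates: Sequence[Sequence[Hashable]],
    gold_points: Sequence[Hashable],
    k: int = 1,
) -> float:
    """Fraction of resources whose gold knowledge point is in the top-k candidates."""
    if len(ranked_candidates) != len(gold_points):
        raise ValueError("ranked_candidates and gold_points must have the same length")
    if not gold_points:
        raise ValueError("at least one example is required")
    hits = sum(
        1
        for candidates, target in zip(ranked_candidates, gold_points)
        if target in candidates[:k]
    )
    return hits / len(gold_points)


def accuracy(predicted: Sequence[Hashable], gold: Sequence[Hashable]) -> float:
    """Fraction of resources whose predicted label equals the gold label."""
    if len(predicted) != len(gold):
        raise ValueError("predicted and gold must have the same length")
    if not gold:
        raise ValueError("at least one example is required")
    return sum(p == g for p, g in zip(predicted, gold)) / len(gold)


# Hypothetical toy data: each inner list is a model's ranking of candidate
# knowledge points for one learning resource, best candidate first.
ranked = [
    ["recursion", "loops", "arrays"],
    ["arrays", "pointers", "loops"],
    ["loops", "recursion", "pointers"],
]
gold = ["recursion", "pointers", "loops"]

# Hypothetical difficulty labels for four learning resources.
predicted_levels = ["easy", "hard", "medium", "easy"]
gold_levels = ["easy", "medium", "medium", "easy"]

print(hits_at_k(ranked, gold, k=1))
print(accuracy(predicted_levels, gold_levels))
```

In this toy data the correct knowledge point is ranked first for two of the three resources, so Hits@1 is 2/3, and three of the four difficulty labels match, so accuracy is 0.75.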


Total number of references:

 89

Collection number:

 硕081202/22011

Open-access date:

 2023-06-07
