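Chinese title: | 高中阶段常用文言实词自动命题研究 |
Name: | |
Confidentiality level: | Public |
Thesis language: | Chinese |
Discipline code: | 050102 |
Discipline: | |
Student type: | Master's |
Degree: | Master of Arts |
Degree type: | |
Degree year: | 2021 |
Campus: | |
School: | |
Research direction: | Chinese information processing |
First supervisor's name: | |
First supervisor's affiliation: | |
Submission date: | 2021-06-15 |
Defense date: | 2021-06-01 |
Foreign-language title: | RESEARCH ON AUTOMATIC QUESTION GENERATION OF CLASSICAL CHINESE NOTIONAL WORDS FREQUENTLY USED IN HIGH SCHOOL |
Chinese keywords: | |
Foreign-language keywords: | Vocabulary test; automatic question generation; Classical Chinese; frequently-used notional words; natural language processing |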
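Chinese abstract: |
Classical Chinese carries the important mission of passing on national culture, yet many students cannot read it with understanding. Mastering frequently used Classical Chinese notional words is key to effectively improving Classical Chinese reading ability. High-quality test questions help assess and improve students' mastery of these words, but traditional manual question writing is time-consuming, labor-intensive, and highly subjective. Against the background of increasingly digital and intelligent education, this thesis studies automatic question generation for Classical Chinese notional words frequently used in high school.
First, this thesis discusses the overall strategy for automatic question generation for these words. At the level of question generation goals, drawing on previous work, it summarizes the textual features of easy Classical Chinese at the high school stage, determines the number and types of frequently used notional words, discusses the knowledge characteristics of notional words, and converts the knowledge elements into formalized test points. At the level of question form, it designs question types that cover multiple levels of linguistic information and specifies the principles for selecting stems and options. At the level of question generation strategy, it builds a high-level framework consisting of a notional word knowledge base generation module, a question resource generation module, an automatic question generation module, a question difficulty calculation module, and a question quality evaluation module.
Second, following this strategy, the thesis constructs a series of basic resources for generating notional word questions. It compiles attributes of the frequently used notional words, including pronunciation, part of speech, original meaning, graphically similar words, word frequency, word level, and senses, and builds a knowledge base of notional words frequently used in high school. It draws on textbooks, college entrance examination questions, ancient classics, and other resources to obtain a question corpus, annotates word senses, and builds a question source dataset containing 55,400 annotated items and more than 1.25 million characters, which can be used for automatic question generation and for research on Classical Chinese word sense disambiguation. In addition, it constructs a dataset of flexible word-class usage, lists of polysyllabic words whose ancient and modern meanings differ or coincide, a list of interchangeable characters, a list of words with variant readings, and other resources.
Next, the thesis implements the algorithms for automatic question generation and the related text processing, and carries out a series of tests and evaluations. For text processing, the Classical Chinese word sense disambiguation algorithm reaches an accuracy of about 80% and can serve as an effective aid to manual glossing. For automatic question generation, more than 50,000 test questions on notional words frequently used in high school are generated, and their difficulty is quantified. For question quality evaluation, the difficulty computed by the quantification method is strongly correlated with the difficulty judged by professionals, and 93.33% of the stem sentence selections, 91.11% of the correct options, and 81.11% of the distractors are judged reasonable, which shows that automatic question generation achieves good results.
Finally, the thesis offers a preliminary discussion of how the automatic question generation technology and related resources can be put to use. The technology can greatly improve the efficiency of writing notional word questions and compensate for the shortcomings of manual question writing, and it can also provide technical and resource support for the teaching and study of Classical Chinese. On the one hand, computerized adaptive tests can be developed on the basis of automatic question generation, lowering the cost of vocabulary testing and improving the efficiency of language testing. On the other hand, the word sense disambiguation technology and the related lexical resources can support research on Classical Chinese readability measurement, diachronic semantic change in Classical Chinese, and dictionary compilation.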
|
Foreign-language abstract: |
Classical Chinese plays an important role in passing on traditional culture, but many high school students struggle with it. Mastering frequently used notional words is the key to improving Classical Chinese reading ability. High-quality test questions help assess and consolidate students' mastery of Classical Chinese notional words. However, conventional manual question writing is time-consuming, labor-intensive, and subjective. As education becomes increasingly digital and intelligent, this thesis studies automatic question generation (AQG) for Classical Chinese notional words frequently used in high school.
First, this thesis discusses the overall strategy of AQG, covering the question generation goal, form, and strategy. With regard to the goal, it summarizes the text features of easy Classical Chinese, determines the quantity and types of Classical Chinese notional words frequently used in high school, discusses the vocabulary knowledge associated with these words, and transforms that knowledge into test points. With regard to the form, it designs question types covering several levels of linguistic information and determines the principles for selecting stems and options. With regard to the strategy, it establishes a question generation framework composed of a vocabulary knowledge base generation module, a question resource generation module, an automatic question generation module, a difficulty calculation module, and a question evaluation module.
Second, starting from this strategy, the thesis constructs a set of basic resources for AQG of Classical Chinese notional words. It analyzes and extracts the attributes of frequently used Classical Chinese notional words, such as pronunciation, part of speech, lexical forms, similar words, frequency, word level, and word senses, and builds a vocabulary knowledge base. It integrates high school Chinese textbooks, college entrance examination questions, and ancient classics to obtain sentences used to generate questions, carries out word sense annotation, and builds an AQG dataset with 55,400 labeled items and more than 1.25 million characters. This dataset can be used for AQG and for Classical Chinese word sense disambiguation. In addition, the thesis collects flexible uses of parts of speech, polysyllabic words with the same meaning in ancient and modern times, polysyllabic words with different meanings in ancient and modern times, interchangeable characters, words with variant pronunciations, and other resources.
Then, the thesis implements the AQG of Classical Chinese notional words and the text processing algorithms, and conducts a set of evaluations. First, it tests the text processing algorithm: the accuracy of Classical Chinese word sense disambiguation is about 80%, which makes it a useful aid to manual explanation of word senses. Second, it designs generation algorithms for the different question types and generates more than 50,000 vocabulary test questions. It also analyzes the relationship between question difficulty and factors such as words, sentences, and question types. Third, it conducts a manual evaluation of question quality. The difficulty calculated by the difficulty quantification method has a strong correlation with the difficulty judged by experts; 93.33% of the stem sentences, 91.11% of the correct options, and 81.11% of the distractors are considered reasonable. The AQG technology thus achieves good results.
Finally, the thesis discusses the applications of AQG technology and the relevant resources. The technology can greatly enhance the efficiency of question generation and compensate for the shortcomings of manual question generation. In addition, the technology and resources can be applied to language teaching and research. On the one hand, AQG can support the construction of computerized adaptive testing (CAT), reducing testing costs and improving testing efficiency; on the other hand, the relevant resources and algorithms can serve as a basis for research on text readability measurement, diachronic semantic evolution of Classical Chinese, and dictionary compilation.
|
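Total number of references: | 134 |
Author biography: | Wang Huiping (王慧萍), master's graduate of the Institute of Chinese Information Processing, Beijing Normal University; has published several high-quality academic papers and attended a number of academic conferences in China and abroad. |
Call number: | 硕050102/21017 |
Open-access date: | 2022-06-15 |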