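Chinese title: | Research on a Named Entity Recognition Method Based on the XLNet Model and GRU-CRF (基于XLNet模型及GRU-CRF的命名实体识别方法研究) |
Name: | |
Confidentiality level: | Public |
Thesis language: | Chinese |
Discipline code: | 080901 |
Discipline/major: | |
Student type: | Bachelor's |
Degree: | Bachelor of Science |
Degree year: | 2021 |
University: | Beijing Normal University |
Campus: | |
School: | |
First supervisor name: | |
First supervisor affiliation: | |
Submission date: | 2021-06-19 |
Defense date: | 2021-05-11 |
Foreign-language title: | Research on Named Entity Recognition method based on XLNet model and GRU-CRF |
Chinese keywords: | |
Foreign-language keywords: | Named Entity Recognition ; BERT ; XLNet ; Language model ; GRU structure ; CRF |
Chinese abstract: |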
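Named entity recognition (NER) identifies entities with specific meaning in text, chiefly person, place, and organization names, and labels their categories. It is a fundamental task in natural language processing, and completing it facilitates downstream natural language processing work. Whereas currently popular NER methods use the language model BERT (Bidirectional Encoder Representations from Transformers), this thesis introduces a newer language model, XLNet (Generalized Autoregressive Pretraining for Language Understanding), for pre-training to obtain word vectors. These vectors are fed into a gated recurrent unit (GRU) structure to extract features, the output is then computed by a conditional random field (CRF), and the results are analyzed and evaluated. Experimental results show that, on the NER task, the XLNet-GRU-CRF method adopted here improves accuracy, recall, and F1 score on the dataset relative to typical BERT-based methods. Under certain text conditions, this method can therefore serve as a better choice for NER tasks. |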
Foreign-language abstract: |
The purpose of Named Entity Recognition (NER) is to recognize entities with specific meaning in text, such as person names, place names, and organization names, and to label their categories. It is a basic task of natural language processing, and completing it facilitates subsequent natural language processing work. In the field of NER, in contrast to methods built on the currently popular language model Bidirectional Encoder Representations from Transformers (BERT), this thesis introduces a newer language model, Generalized Autoregressive Pretraining for Language Understanding (XLNet), for pre-training to obtain word vectors. The results are then input into a gated recurrent unit (GRU) structure to extract features. Finally, the output is computed by a conditional random field (CRF) and is analyzed and evaluated. The experimental results show that, for the NER task, the accuracy, recall, and F1 score obtained by the XLNet-GRU-CRF method on the dataset improve on those of typical BERT-based methods. Therefore, this method can be a better choice for handling NER tasks under certain text conditions. |
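Total references: | 18 |
Author biography: | Deng Chunzhong (邓淳中), undergraduate student, School of Artificial Intelligence, Beijing Normal University |
Number of figures: | 12 |
Number of tables: | 5 |
Call number: | 本080901/21032 |
Open-access date: | 2022-06-19 |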
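
The abstract describes a final step in which a CRF computes the output from features extracted by the GRU. One common way such a step is decoded is the Viterbi algorithm, sketched below in plain Python. This is an illustration only, not the thesis's implementation: the record does not give the tag set, scores, or decoding code, so the function name `viterbi_decode`, the three-tag BIO scheme, and all numeric scores are hypothetical.

```python
from typing import List, Sequence


def viterbi_decode(emissions: Sequence[Sequence[float]],
                   transitions: Sequence[Sequence[float]]) -> List[int]:
    """Return the highest-scoring sequence of tag indices.

    emissions[t][j]   : score of tag j at position t (e.g. from a GRU layer).
    transitions[i][j] : score of moving from tag i to tag j.
    """
    if not emissions:
        return []
    num_tags = len(emissions[0])
    # best[j] = best score of any path that ends in tag j at the current position
    best = list(emissions[0])
    backpointers: List[List[int]] = []
    for t in range(1, len(emissions)):
        new_best = []
        pointers = []
        for j in range(num_tags):
            # Choose the previous tag that maximises the path score into tag j.
            prev = max(range(num_tags), key=lambda i: best[i] + transitions[i][j])
            new_best.append(best[prev] + transitions[prev][j] + emissions[t][j])
            pointers.append(prev)
        best = new_best
        backpointers.append(pointers)
    # Trace back from the best final tag to recover the full path.
    last = max(range(num_tags), key=lambda j: best[j])
    path = [last]
    for pointers in reversed(backpointers):
        path.append(pointers[path[-1]])
    path.reverse()
    return path


# Hypothetical tag set and scores, chosen only for illustration.
TAGS = ["O", "B-PER", "I-PER"]
FORBIDDEN = -1e4  # large penalty: I-PER may not directly follow O
TRANSITIONS = [
    [0.0, 0.0, FORBIDDEN],  # from O
    [0.0, 0.0, 0.0],        # from B-PER
    [0.0, 0.0, 0.0],        # from I-PER
]
EMISSIONS = [
    [1.0, 0.8, 0.0],  # token 1: O scores slightly higher than B-PER
    [0.0, 0.0, 1.0],  # token 2: I-PER scores highest
]

decoded = [TAGS[i] for i in viterbi_decode(EMISSIONS, TRANSITIONS)]
print(decoded)  # ['B-PER', 'I-PER']
```

Picking the top-scoring tag at each token independently would yield `O` followed by `I-PER`, which is not a valid BIO sequence. The transition scores let the decoder prefer `B-PER`, `I-PER` instead, which is the usual motivation for placing a CRF after a recurrent feature extractor.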