Thesis information

Chinese title:

 小学生评他文本分析框架构建及自动分类研究    

Name:

 Wang Xue (王雪)

Confidentiality level:

 Public

Thesis language:

 Chinese

Discipline code:

 04010001    

Discipline and specialty:

 01 Educational Measurement, Evaluation and Statistics (040100)

Student type:

 Master's

Degree:

 Master of Education

Degree type:

 Academic degree

Degree year:

 2022    

Campus:

 Beijing campus

School:

 Collaborative Innovation Center of Assessment toward Basic Education Quality

First supervisor:

 Zhang Sheng (张生)

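First supervisor's affiliation:

 Collaborative Innovation Center of Assessment toward Basic Education Quality, Beijing Normal University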

Submission date:

 2022-05-31    

Defense date:

 2022-06-17    

English title:

 Research on the Construction of Primary School Students' Assessment Text Analysis Framework and Automatic Classification    

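Chinese keywords (translated):

 education through evaluation ; peer-assessment text ; text analysis framework ; integration of learning and assessment ; text classification ; deep learning ; ALBERT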

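Chinese abstract (translated):

With the release of the General Plan for Deepening Educational Assessment Reform in the New Era and the "Double Reduction" policy, building a scientific educational evaluation ecology has become key to reducing students' excessive academic burden, pursuing the goal of cultivating core literacy, and implementing evaluation reform. The new evaluation concept of "integration of learning and assessment" is an initial exploration of evaluation reform in the age of artificial intelligence. Relying on an educational environment that merges the digital and physical worlds, the concept encourages every student to take part in evaluation activities, which matters greatly for fostering higher-order thinking and social development and is one of the core levers for realizing core literacy and students' key competencies. As key process data of peer assessment, the texts students write when assessing others carry important information about their cognitive ability, social development, awareness of standards, and reflective ability, and are important data for cultivating and evaluating students' ability to assess others. However, because earlier evaluation concepts emphasized precise diagnosis, most current research still starts from the accuracy of these texts, and the analytic dimensions it constructs lack educational orientation, systematicity, and comprehensiveness. Meanwhile, current analysis of these texts relies mainly on manual coding, which depends heavily on coders' expertise and consumes considerable human and material resources, thereby restricting the large-scale, routine use of process assessment. Together, these two problems mean that assessing others has not received the attention it deserves in everyday teaching.

Aiming at education through evaluation and at putting evaluation reform into practice, and drawing on the "integration of learning and assessment" concept and the perspective of analysing and improving the ability to assess others, this study builds a multi-dimensional, multi-level framework for analysing primary school students' assessment texts through theoretical construction, two rounds of Delphi expert consultation, and a double-coding practical test on 5,000 real assessment texts from primary school students in experimental schools. In line with the characteristics of the framework's indicators, an ALBERT single-label classification model and an ALBERT-Seq2Seq-Attention multi-label classification model were built, and their effectiveness was verified through comparison experiments with several classical and baseline models. In addition, the study examines how dimension type and text length affect classification performance and compares modelling each dimension separately with modelling all dimensions jointly, in order to clarify the models' applicability. The main conclusions are as follows:

1) The analysis framework for primary school students' assessment texts is multi-dimensional and multi-level, comprising 3 first-level dimensions and 5 second-level dimensions with 14 types in total; independent double coding of real assessment text data shows that the framework is operable, scientific, and systematic;

2) Compared with other classical models, the ALBERT model classifies the framework's single-label dimensions better, with accuracy ranging from 78.88% to 97.73%. On the cognitive feedback and socio-emotional dimensions, its prediction consistency is higher than the consistency of manual double coding; the ALBERT model performs well when label samples are imbalanced, and on the socio-emotional dimension it is affected by text length;

3) Compared with other baseline models, the ALBERT-Seq2Seq-Attention model classifies the framework's multi-label dimensions better, with accuracy ranging from 91.68% to 94.44%. For the sentence rhetoric and hierarchy types, its prediction consistency is higher than the consistency of manual double coding; the ALBERT-Seq2Seq-Attention model predicts best on the [18-30] text length interval, and the label/sample ratio has a negative effect on prediction performance;

4) Compared with modelling each dimension separately, the ALBERT-Seq2Seq-Attention model trained jointly on all dimensions performs better on the socio-emotional and personal development dimensions.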

English abstract:

With the introduction of the General Plan for Deepening Educational Assessment Reform in the New Era and the "Double Reduction" policy, establishing a scientific educational evaluation ecology has become key to reducing students' excessive academic burden, fulfilling the goal of developing core literacy, and implementing evaluation reform. The new evaluation concept of "integration of learning and assessment" is an initial exploration of evaluation reform in the era of artificial intelligence. Building on an educational environment that integrates the digital and physical worlds, the concept encourages every student to participate in assessment activities, which is of great importance for promoting higher-order thinking and social development and is a crucial approach to developing core literacy and key competencies. The texts students write when assessing others are key process data of peer assessment: they contain important information about students' cognitive capacity, social development, criterion awareness, and reflective ability, and are important data for cultivating and evaluating students' ability to assess others. However, because the traditional view of educational assessment emphasizes precise diagnosis, most current research focuses mainly on the accuracy of assessment texts, and the analytical dimensions it constructs lack educational orientation, systematicity, and comprehensiveness. At the same time, current analysis of assessment texts relies mainly on manual coding, which depends heavily on the professionalism of coders and consumes considerable human and material resources, thus limiting the large-scale, regular application of process assessment. Together, these two problems have left the act of assessing others without the attention it deserves in daily education and teaching.

Based on the concept of "integration of learning and assessment" and the perspective of analysing and improving students' ability to assess others, this study constructs a multi-dimensional, multi-level framework for analysing the assessment texts of primary school students through theoretical construction, two rounds of Delphi expert consultation, and a double-coding practical test on 5,000 real assessment texts from primary school students in experimental schools. In line with the characteristics of the framework's indicators, an ALBERT single-label classification model and an ALBERT-Seq2Seq-Attention multi-label classification model were constructed, and their effectiveness was verified through comparison experiments with several classical and baseline models. In addition, this study explores the influence of dimension type and text length on classification performance and compares modelling each dimension separately with modelling all dimensions together, in order to clarify the applicability of the models. The main findings are as follows.

(1) The analysis framework for primary school students' assessment texts is multi-dimensional and multi-level, containing three primary dimensions and five secondary dimensions, with 14 types in total. An independent double-coding test on real assessment text data shows that the framework is operable, scientific, and systematic.

(2) Compared with other classical models, the ALBERT model classifies the single-label dimensions of the analysis framework better, with accuracy ranging from 78.88% to 97.73%. Its prediction consistency on the cognitive feedback and socio-emotional dimensions is higher than manual double-coding consistency; the ALBERT model performs well when label samples are imbalanced and is influenced by text length on the socio-emotional dimension.

(3) Compared with other baseline models, the ALBERT-Seq2Seq-Attention model classifies the multi-label dimensions of the analysis framework better, with accuracy ranging from 91.68% to 94.44%. Its prediction consistency for the sentence rhetoric and hierarchy types is higher than manual double-coding consistency; the ALBERT-Seq2Seq-Attention model predicts best on the [18-30] text length interval, and the label/sample ratio has a negative effect on prediction performance.

(4) Compared with modelling each dimension separately, the ALBERT-Seq2Seq-Attention model trained on all dimensions together performs better on the socio-emotional and personal development dimensions.

Total number of references:

 119    

Library call number:

 硕040100-01/22003    

Public release date:

 2023-06-17    
