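Chinese title: | Termination Rules for Multidimensional Multicategory Computerized Classification Testing: From the Perspectives of Psychometrics and Machine Learning |
Name: | |
Confidentiality level: | Public |
Thesis language: | Chinese |
Discipline code: | 04020005 |
Discipline: | |
Student type: | Master's |
Degree: | Master of Education |
Degree type: | |
Degree year: | 2022 |
Campus: | |
School: | |
Research area: | Psychometrics |
First supervisor's name: | |
First supervisor's affiliation: | |
Submission date: | 2022-06-18 |
Defense date: | 2022-06-18 |
Foreign-language title: | New Termination Rules for Multicategory Multidimensional Computerized Classification Testing: From the Perspectives of Psychometrics and Machine Learning |
Chinese keywords: | |
Foreign-language keywords: | computerized classification testing; termination rules; generalized likelihood ratio; decision tree |
Chinese abstract: |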
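Test administration is gradually shifting from traditional paper-and-pencil tests to computerized online tests, especially computerized adaptive testing (CAT). Because CAT is administered by computer and selects items adaptively, it offers many advantages. When the purpose of a test is to classify examinees, computerized classification testing (CCT) can be developed on the basis of CAT.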
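As an important component of CCT, the termination rule is not only the main feature that distinguishes CCT from CAT but also directly affects the accuracy and efficiency of CCT. To date, researchers have worked mainly from a psychometric perspective, using likelihood ratio methods to construct a series of termination rules for unidimensional two-category, unidimensional multicategory, and multidimensional two-category testing contexts. The more complex multidimensional multicategory context is also common in practice, yet no study has addressed multidimensional multicategory CCT. In addition, CCT is essentially a classification system, and machine learning offers many mature classifiers for classification problems. This paper therefore focuses on multidimensional multicategory CCT and studies it from the perspectives of psychometrics and machine learning.
Study 1 constructs, within the psychometric framework, two likelihood ratio-based termination rules for multidimensional multicategory CCT: the multidimensional multicategory generalized likelihood ratio rule (M-mGLR) and the multidimensional reformed multicategory generalized likelihood ratio rule (Mr-mGLR). The two new rules are evaluated in a two-dimensional three-category context, taking into account the correlation between examinees' ability dimensions, the type of classification boundary curve, and the item bank structure. The results show that (1) under most experimental conditions, the M-mGLR rule is more accurate than the Mr-mGLR rule, whereas the Mr-mGLR rule is more efficient than the M-mGLR rule; and (2) classification accuracy is higher under the condition of high correlation between ability dimensions combined with compensatory boundary curves. Item bank structure and boundary curve type interact in their effect on classification accuracy.
Study 2 takes a machine learning perspective and constructs six multidimensional multicategory CCTs based on decision tree algorithms (CART, ID3, C4.5, and their pruned forms), comparing them comprehensively under the same test conditions as in Study 1. The results show that (1) in the dichotomously scored context considered here, the C4.5 algorithm fails to show its advantages and has relatively low classification accuracy; (2) pruning helps shorten the test length further; (3) considering the accuracy, depth, and generalization ability of the trees, as well as the potential for future extension, the pruned ID3 algorithm is recommended for constructing the final tree-based multidimensional multicategory CCT; and (4) the proportion of examinees across categories may be the main factor affecting the performance of the algorithms in this study.
In summary, for multidimensional multicategory CCT, this paper proposes two likelihood ratio-based termination rules and six decision tree-based CCTs under the frameworks of psychometrics and machine learning, respectively, compares the methods in depth, and explores the factors that affect their performance. The conclusions can help measurement researchers and practitioners choose an appropriate multidimensional multicategory CCT for different contexts. Future work could address the setting of cut points, modeling that incorporates process data, and the introduction of other machine learning methods, to further improve the application of CCT in practice and support the digitalization and intelligent transformation of education. |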
Foreign-language abstract: |
Currently, testing is evolving from traditional paper-and-pencil testing to computer-based testing (especially computerized adaptive testing, CAT). Due to its computer-based and adaptive item selection characteristics, CAT has several advantages. Based on CAT, computerized classification testing (CCT) can be developed when the test’s purpose is to classify examinees.
As an essential component of CCT, the termination rule is not only the main feature that distinguishes CCT from CAT but also directly affects the accuracy and efficiency of CCT. So far, researchers have constructed a series of termination rules in unidimensional two-category, unidimensional multicategory, and multidimensional two-category test contexts, mainly from the psychometric perspective. However, the more complex multidimensional multicategory context, which is also common in practice, has not yet been studied. It should also be noted that CCT is essentially a classification system, and many well-established classifiers in machine learning can be used to solve classification problems. Therefore, this paper focuses on multidimensional multicategory CCT, conducting two studies from the perspectives of psychometrics and machine learning, respectively.
Study 1 constructs two new generalized likelihood ratio-based multidimensional multicategory termination rules (i.e., the multidimensional multicategory generalized likelihood ratio rule, M-mGLR, and the multidimensional reformed multicategory generalized likelihood ratio rule, Mr-mGLR) under the psychometric framework. The performance of these two rules was evaluated in a two-dimensional three-category context, considering three influencing factors (i.e., the level of correlation between coordinate dimensions, the type of classification boundary, and the item bank design). The results showed that (1) the accuracy of the M-mGLR rule was higher than that of the Mr-mGLR rule under most conditions, while the efficiency of the Mr-mGLR rule was higher than that of the M-mGLR rule, and (2) the classification accuracy was higher under a high level of correlation between coordinate dimensions combined with compensatory demarcation curves. The item bank design and the type of demarcation curve interacted in their effect on classification accuracy.
Study 2 constructs six multidimensional multicategory CCTs based on decision tree algorithms (i.e., CART, ID3, C4.5, and their pruned forms) from a machine learning perspective and comprehensively compares these new methods under the same test conditions as in Study 1. The results showed that (1) the C4.5 algorithm failed to demonstrate its advantages in the dichotomously scored context of this paper and produced relatively low accuracy; (2) the use of pruning techniques helped shorten the test length further; (3) taking into account the accuracy, maximum depth, generalization ability, and the possibility of future expansion, the pruned ID3 algorithm was recommended to construct the final tree-based multidimensional multicategory CCT; and (4) the ratio of examinees across categories may be the main factor affecting the methods' performance.
In summary, for multidimensional multicategory CCT, this paper proposes two likelihood ratio-based termination rules and six decision tree-based CCTs under the frameworks of psychometrics and machine learning, respectively. This paper also provides an in-depth comparison of the different methods while exploring the factors affecting their performance. The results can provide a basis for psychometricians and practitioners to select an appropriate multidimensional multicategory CCT in different contexts. Future research could study the setting of classification boundaries, modeling that incorporates process data, and the introduction of other machine learning methods, to further improve the application of CCT in practice and contribute to the digitalization and intelligent transformation of education. |
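Total number of references: | 65 |
About the author: | Ren He (任赫), male, Han ethnicity, from Pingdingshan, Henan. His main research area is psychometrics, and he has published several papers in CSSCI and SSCI journals. |
Call number: | 硕040200-05/22004 |
Release date: | 2023-06-18 |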
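
The abstract describes likelihood ratio-based termination rules for a two-dimensional three-category CCT. As a rough illustration of the general idea only, the sketch below implements a generic generalized likelihood ratio stopping check; it is not the thesis's M-mGLR or Mr-mGLR rule. Everything specific here is an assumption made for the example: the compensatory two-dimensional 2PL response model, the cut scores on theta1 + theta2, the grid search, the threshold log(19), and all function names.

```python
import math
from itertools import product


def m2pl_prob(theta, a, d):
    """P(correct) under an assumed compensatory two-dimensional 2PL model."""
    z = a[0] * theta[0] + a[1] * theta[1] + d
    return 1.0 / (1.0 + math.exp(-z))


def log_likelihood(theta, items, responses):
    """Log-likelihood of dichotomous responses at ability point theta."""
    ll = 0.0
    for (a, d), x in zip(items, responses):
        p = m2pl_prob(theta, a, d)
        ll += math.log(p) if x == 1 else math.log(1.0 - p)
    return ll


def category_of(theta, cuts):
    """Assumed compensatory boundaries: compare theta1 + theta2 with sorted cuts."""
    s = theta[0] + theta[1]
    return sum(s >= c for c in cuts)


def glr_decision(items, responses, cuts=(-1.0, 1.0),
                 log_threshold=math.log(19.0), step=0.25, bound=3.0):
    """Return (stop, category) from a generic GLR check.

    For each category, the likelihood is maximized over a grid of ability
    points inside that category's region. The test stops when the best
    category beats the runner-up by at least log_threshold on the log scale.
    """
    n = int(round(2 * bound / step)) + 1
    grid = [-bound + i * step for i in range(n)]
    best = [-math.inf] * (len(cuts) + 1)
    for theta in product(grid, grid):
        k = category_of(theta, cuts)
        best[k] = max(best[k], log_likelihood(theta, items, responses))
    ranked = sorted(range(len(best)), key=lambda k: best[k], reverse=True)
    top, runner_up = ranked[0], ranked[1]
    stop = best[top] - best[runner_up] >= log_threshold
    return stop, top


if __name__ == "__main__":
    # Hypothetical item: discriminations (1, 1) and intercept 0.
    item = ((1.0, 1.0), 0.0)
    print(glr_decision([item] * 3, [1, 1, 1]))    # few responses
    print(glr_decision([item] * 10, [1] * 10))    # many correct responses
```

In an adaptive test, a check like `glr_decision` would be called after each administered item, with a maximum test length as a fallback when the threshold is never reached. A finer grid or a numerical optimizer could replace the grid search at higher computational cost.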