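Chinese title: | 基于增长模型的缺失数据处理方法的比较与应用 |
Name: | |
Discipline code: | 040201 |
Discipline/major: | |
Student type: | Master's |
Degree: | Master of Science |
Degree year: | 2015 |
Campus: | |
School: | |
Research area: | Psychological statistics and measurement |
Primary advisor name: | |
Primary advisor affiliation: | |
Submission date: | 2015-06-05 |
Defense date: | 2015-05-18 |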
Foreign-language title: | COMPARISON AND APPLICATION OF METHODS FOR HANDLING MISSING DATA WHEN FITTING A LATENT GROWTH MODEL |
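Chinese abstract: |
Missing data are very common in longitudinal studies and are more complex and harder to handle than in cross-sectional studies. For longitudinal datasets containing different patterns of missing data, and based on the latent growth model, this study examines the relative merits of missing data handling methods that rest on different assumptions, namely the Diggle-Kenward selection model, the pattern mixture model, and the ML method. A Monte Carlo simulation study compares the methods in terms of the precision of the growth parameter estimates and of their standard error estimates, while also considering the effects of sample size, dropout (permanent missingness) proportion, intermittent (temporary missingness) proportion, and the distribution shape of the target variable. The results show that: (1) The dropout proportion is the most important factor affecting the estimation precision of each method. Under the assumptions of the Diggle-Kenward selection model, the Diggle-Kenward selection model, which matches those assumptions, generally yields the best estimation precision; however, when the dropout proportion is small (no more than 10%), the pattern mixture model and the ML method differ little from it, and their estimates fall within an acceptable range. When the dropout proportion is large, the differences among the methods grow markedly, and carefully choosing a model that matches the assumptions becomes more important. When the missing data satisfy the ML method's assumption about the missingness mechanism, the ML method shows some tendency toward improved precision under the simulation conditions that match its own assumption, but the results of the Diggle-Kenward selection model change little; even where its precision tends to worsen under conditions that violate its assumptions, the magnitude of change remains fairly stable. The results of the pattern mixture model are rather unstable, with fairly serious non-convergence problems. (2) The skewness of the target variable's distribution affects the growth parameter variances more markedly than the growth parameter means. For the growth parameter means, there is some interaction between the dropout proportion and the degree of skewness: when the dropout proportion is high, estimation precision is more sensitive to skewness, and this is especially pronounced for the Diggle-Kenward selection model. For the growth parameter variances, there is a clear interaction between sample size and the degree of skewness: when the sample size is small, estimation precision is more sensitive to skewness. (3) Increasing the sample size improves estimation precision, while the intermittent missingness proportion has a rather limited effect on estimation precision.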
|
Foreign-language abstract: |
In longitudinal studies, missing data are ubiquitous. A missing not at random (MNAR) mechanism may bias parameter estimates and even distort the study results. This study compared three techniques for handling different types of missing data (i.e., the maximum likelihood (ML) approach, the Diggle-Kenward selection model, and the pattern mixture model) through a Monte Carlo simulation study based on a five-wave longitudinal dataset. Parameter estimates and standard errors from each method were contrasted under the assumptions of two models (the Diggle-Kenward selection model and the ML approach). Four influential factors were considered: four levels of dropout missingness, three levels of intermittent missingness, four levels of sample size, and four levels of distribution shape (i.e., skewness and kurtosis). The results indicated that: (1) The level of dropout missingness was the major factor affecting parameter estimation precision. Under the Diggle-Kenward selection model assumptions, the Diggle-Kenward selection model generally performed best. However, with a low dropout level (≤10%), parameter estimates from the ML approach and the pattern mixture model differed only slightly from those of the Diggle-Kenward selection model; with higher percentages of dropouts, the Diggle-Kenward selection model clearly outperformed the other two approaches. Under the ML approach assumptions, the ML method performed better than it did under the Diggle-Kenward selection model assumptions, while the Diggle-Kenward selection model changed little and remained stable. The pattern mixture model results were unstable and had severe non-convergence problems. (2) When fitting a growth curve model, the variances of the latent variables (σ_i² and σ_s²) were influenced by the distribution shape (i.e., the degree of skewness and kurtosis) much more than the means of the latent variables (μ_i and μ_s). For the means of the latent variables, the percentage of dropouts and the degree of skewness and kurtosis had significant interactions. With high dropout percentages, estimation precision was more sensitive to the degree of skewness and kurtosis, especially for the Diggle-Kenward selection model. For the variances of the latent variables, the sample size and the degree of skewness and kurtosis had significant interactions. With small sample sizes, the distribution shape had a greater effect on estimation precision. (3) Overall, larger sample sizes improved estimation precision, while the percentage of intermittent missingness had limited effects.
|
Total references: | 66 |
Collection number: | 硕040201/1508 |
Open-access date: | 2015-06-05 |
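
The data-generating step of the simulation design summarized in the abstract could look roughly like the following Python sketch. It is an illustration under stated assumptions, not the thesis's actual code: the function names, the linear growth form, and every numeric value (growth parameter means and standard deviations, dropout coefficients, intermittent rate) are hypothetical, and the target variable is normal here, whereas the thesis also varies skewness and kurtosis. Dropout follows a Diggle-Kenward-style logistic model in which the dropout probability at a wave depends on the previous and the current outcome. Setting `beta_curr` to 0 makes dropout depend only on observed values, which corresponds to the missing at random assumption behind the ML approach, provided `p_intermittent` is also 0 (otherwise the previous outcome may itself be missing).

```python
import math
import random


def simulate_growth_data(n, waves=5, intercept=-3.0, beta_prev=0.0,
                         beta_curr=0.5, p_intermittent=0.05, seed=0):
    """Simulate linear growth data with dropout and intermittent missingness.

    All numeric values are illustrative assumptions, not taken from the thesis.
    Missing observations are encoded as None.
    """
    rng = random.Random(seed)
    data = []
    for _ in range(n):
        eta_i = rng.gauss(0.0, 1.0)   # latent intercept (assumed mean 0, SD 1)
        eta_s = rng.gauss(0.5, 0.5)   # latent slope (assumed mean 0.5, SD 0.5)
        # Complete data: y_t = intercept + slope * t + residual, t = 0..waves-1
        y = [eta_i + eta_s * t + rng.gauss(0.0, 0.5) for t in range(waves)]
        row = list(y)
        for t in range(1, waves):     # the first wave is always observed
            # Diggle-Kenward-style dropout: the logit depends on the previous
            # outcome and on the current, possibly unobserved, outcome.
            logit = intercept + beta_prev * y[t - 1] + beta_curr * y[t]
            if rng.random() < 1.0 / (1.0 + math.exp(-logit)):
                for k in range(t, waves):
                    row[k] = None     # dropout: missing from wave t onward
                break
            if rng.random() < p_intermittent:
                row[t] = None         # intermittent: later waves can be observed
        data.append(row)
    return data


def missing_rate_by_wave(data):
    """Return the proportion of missing values at each wave."""
    n = len(data)
    return [sum(row[t] is None for row in data) / n
            for t in range(len(data[0]))]
```

Calling `missing_rate_by_wave(simulate_growth_data(500))` gives the per-wave proportion of missing values, which can be used to tune `intercept` toward a target dropout level before fitting any of the three models to the generated data.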