Thesis information

Chinese title:

 两类模型下的变点推断及假设检验问题    

Name:

 Li Meng (李梦)

Confidentiality level:

 Public

Thesis language:

 chi    

Discipline code:

 0714Z2    

Discipline:

 Applied Statistics

Student type:

 Doctoral

Degree:

 Doctor of Science

Degree type:

 Academic degree

Degree year:

 2024    

Campus:

 Beijing campus

School:

 School of Statistics

Research area:

 Change-point theory; biostatistics

Primary advisor:

 Tong Xingwei (童行伟)

Primary advisor's affiliation:

 School of Statistics

Submission date:

 2024-06-15    

Defense date:

 2024-05-09    

Foreign-language title:

 Change Point Inference And Hypothesis Test Issues Under Two Types Of Models    

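Chinese keywords (translated):

 Change point ; Econometrics ; Cox proportional hazards model ; Interval-censored data ; Exclusive hypothesis test ; Robustness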

Foreign-language keywords:

 Change Point ; Econometrics ; Cox Proportional Hazards Model ; Interval Censored Data ; Exclusive Hypothesis Test ; Robustness

Chinese abstract (translated):

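In econometrics, biostatistics, and many other statistics-related fields, research on parameter instability has received widespread attention, and change-point models in particular are increasingly being derived and applied. Generally speaking, a change-point model is one in which the relevant parameters change according to the value of a partition variable. This thesis focuses on statistical inference for change points under two specific models.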

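In existing research, estimation and inference for change points in linear regression models usually treat the change point as fixed, ignoring the uncertainty of the change point itself, while studies that do account for this uncertainty ignore differences in the change point across individuals. The first part of this thesis therefore considers a random change-point linear regression model based on a single partition variable, in which the change points are assumed to be independent and identically distributed unknown random variables. It establishes the consistency and asymptotic normality of the relevant estimators and proposes two estimation procedures based on the Expectation-Maximization (EM) algorithm. It then applies a supremum (SUP) test, based on the maximum of the score statistics, and a change-point type (CPT) test to detect, respectively, the existence and the randomness of the change point. Simulation studies are used to evaluate the performance of the proposed estimation and testing methods. Finally, the proposed methods are applied to an economic data set to draw the corresponding conclusions.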

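In survival analysis, the Cox proportional hazards model is widely used, under the assumption of independent observations, to estimate the association between potential risk factors and the distribution of survival time or the incidence of disease. Introducing a change point into the Cox proportional hazards model can help in fitting medical data, and it has important clinical significance for predicting disease risk, correctly evaluating the effect of medical treatments, and further improving treatment plans. However, change-point analysis for the Cox proportional hazards model is currently limited to right-censored data. The second part of this thesis addresses another important type of censored data, case I interval-censored data (also known as current status data), and constructs a change-point Cox proportional hazards model based on a single partition variable. It gives a three-step estimation algorithm based on maximum profile likelihood estimation and derives the consistency and asymptotic properties of the nonparametric maximum likelihood estimator. It then constructs a SUP test statistic to test for the existence of a change point, and finally evaluates the proposed estimation and testing methods through numerical simulation.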

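In addition, this thesis studies robust statistical methods for a special class of hypothesis testing problems, the exclusive hypothesis test (EHT). This class of problems is abstracted from testing for genetic pleiotropy. Existing tests have two shortcomings: most are applicable only when the error term is normally distributed and are not sufficiently robust, and some have a conservative empirical type I error. To address this, the third part of the thesis proposes a negative-two-times loss-difference test (LDT) statistic together with two testing procedures, the (p+1)-stage robust test and the combined robust test, and verifies their validity. The subsequent numerical simulations also show that both methods perform well in terms of robustness, and that the combined robust test in particular overcomes the conservative empirical type I error. Finally, the proposed methods are applied to a Chinese typhoon data set, and data contamination is used to compare the two new methods with two representative existing methods.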

Foreign-language abstract:

In many statistics-related fields such as econometrics and biostatistics, research on parameter instability has received widespread attention, especially the derivation and application of change point models. Generally speaking, a change point model is one in which the relevant parameters change according to the value of a certain partition variable. This thesis focuses on statistical inference for change points under two specific models.

In existing research, the estimation and inference of change points in linear regression models are often based on the fixity of the change point, ignoring the uncertainty of the change point itself, whereas studies that do consider this uncertainty ignore differences in the change point between individuals. Therefore, the first part of this thesis considers a random change point linear regression model based on a single partition variable, assuming that the change point is an independent and identically distributed unknown random variable. It establishes the consistency and asymptotic normality of the relevant estimators and proposes two estimation procedures based on the expectation-maximization (EM) algorithm. It then applies the supremum (SUP) test of the score statistic and the change-point type (CPT) test to detect the existence and the randomness of the change point, respectively. Further, simulation studies evaluate the performance of the proposed estimation and testing methods. Finally, the proposed methods are applied to an economic dataset to obtain the corresponding conclusions.

In survival analysis, the Cox proportional hazards model is widely used, under the assumption of independent observations, to estimate the association between potential risk factors and the distribution of survival time or disease incidence. Introducing change points into the Cox proportional hazards model can help fit medical data, and has important clinical significance for predicting people's disease risks, correctly evaluating the therapeutic effects of medical treatments, and further improving treatment plans. However, the current analysis of change points in Cox proportional hazards models is limited to right-censored data. The second part of this thesis addresses another important type of censored data, case I interval-censored data, also known as current status data. For this type of data, it constructs a change-point Cox proportional hazards model based on a single partition variable, gives a three-step estimation algorithm based on maximum profile likelihood estimation, and derives the consistency and asymptotic properties of the non-parametric maximum likelihood estimator. A SUP test statistic is then constructed to test for the existence of a change point. Finally, numerical simulation is used to evaluate the performance of the proposed estimation and testing methods.

In addition, we study robust statistical methods for a special type of hypothesis testing problem, the exclusive hypothesis test (EHT). This type of problem is abstracted from the detection of genetic pleiotropy. Existing methods have two drawbacks. One is that most methods are applicable only when the error term follows a normal distribution and are not robust enough. The other is that some methods have conservative empirical type I errors. To address this issue, the third part of this thesis proposes a negative-two-times loss-difference test (LDT) statistic and two test methods, the (p+1)-stage robust test and the combined robust test, and verifies their validity. The subsequent numerical simulation results also demonstrate that these two methods perform well in terms of robustness, and that the combined robust test in particular can overcome conservative empirical type I errors. Finally, we apply the proposed methods to a Chinese typhoon dataset and, by contaminating the data, compare the two new methods with two representative existing methods.

Total number of references:

 131    

Holding location:

 Library thesis reading area (Main Library, South Wing, 3rd floor, Zone BC)

Call number:

 博0714Z2/24001    

Open-access date:

 2025-06-16    
