Thesis information

Chinese title:

 台风研究中的函数型数据分析及假设检验 (Functional data analysis and hypothesis testing in typhoon research)

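Name:

 汪毅

Confidentiality level:

 Public

Thesis language:

 Chinese

Discipline code:

 0714Z2

Discipline:

 Applied Statistics

Student type:

 Doctoral

Degree:

 Doctor of Science

Degree type:

 Academic degree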

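Degree year:

 2021

Campus:

 Beijing campus

School:

 School of Statistics / Institute of National Accounts

First supervisor:

 童行伟

First supervisor's affiliation:

 School of Statistics, Beijing Normal University

Submission date:

 2021-06-24

Defense date:

 2021-05-14

Foreign-language title: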

 Functional data analysis with typhoon and hypothesis test    

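Chinese keywords:

 typhoon data ; high-dimensional regression model ; one-step regularized statistic ; functional data analysis ; hypothesis testing ; weighted p-value

Foreign-language keywords: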

 typhoon data ; high-dimensional regression model ; one-step regularized estimator ; functional data analysis ; hypothesis test ; weighted p-value    

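Chinese abstract:

In typhoon research, rapid advances in computer hardware and steady improvements in data collection and storage have made the available data increasingly varied and voluminous, which has drawn growing attention to statistical data analysis.

In such research, the sample size is sometimes not much larger than the number of variables, and classical asymptotic theory then fails to yield reasonable estimates. Estimation methods for high-dimensional regression models have proliferated in recent years, but hypothesis testing in these models remains challenging. Existing tests for high-dimensional regression models tend to depend heavily on the model assumptions, so the test statistic has to be constructed specifically for each model.
In typhoon research, however, the typhoon system is highly complex, which makes such model-specific constructions very difficult. The first part of this thesis therefore proposes a unified approach to statistical inference on low-dimensional functionals of high-dimensional regression models, in which hypothesis tests are based on a one-step regularized estimator (OSRE). The asymptotic normality of the OSRE is proved, and the high-dimensional linear model and the high-dimensional nonparametric additive model are used to illustrate the soundness and generality of the method. Numerical simulations show its finite-sample performance, and the method is applied to a study of the association between a specific gene and other genes.

Typhoon research often involves a special kind of high-dimensional data that usually takes the form of images or curves, known as functional data. Traditional functional data models have difficulty capturing the location information of a tropical cyclone. This thesis therefore proposes letting the coefficients vary with the location of the typhoon center rather than with time. This makes the proposed model more realistic and allows the method to be used for exploratory studies of more complex systems. The asymptotic properties of the method are derived theoretically and verified by numerical simulation. Finally, the method is applied to the effect of tropical cyclones on wind speed at individual observation stations, with good predictive performance.
Hypothesis testing also includes a special class of problems known as exclusive hypothesis testing. In such a problem the null hypothesis can be partitioned into several sub-null hypotheses, all of which are mutually exclusive. The idea originated in testing for genetic pleiotropy, and this thesis introduces it to typhoon research. Because the traditional way of computing p-values does not suit this kind of problem, this thesis proposes a weighted p-value method, with two schemes for determining the weights: a likelihood method and a BIC method. Theoretical proofs and numerical simulations demonstrate the testing performance of the weighted p-value method, which is then applied to testing the effect of the latitude and longitude of a typhoon's landfall point on several important typhoon indicators.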

Foreign-language abstract:

In the study of typhoon data, the variety and quantity of data have grown with advances in computer technology, which have improved data collection and storage capabilities in recent decades. As a result, statistical analysis has received extensive attention.
Sometimes the number of variables is close to the sample size, in which case traditional asymptotic theory cannot be applied to obtain a consistent estimator. Many statistical methods have recently been developed for model prediction and variable selection. However, hypothesis testing for high-dimensional regression models has always been a challenging task. Existing test statistics are mostly constructed case by case. Because the statistical models used in typhoon research are complex, such specialized constructions become even more difficult. The first part of this study therefore proposes a unified approach for testing low-dimensional functionals in high-dimensional regression models by using a one-step regularized estimator (OSRE). We prove the asymptotic normality of the OSRE. We also apply our method to the high-dimensional linear regression model and the nonparametric additive model to show its generality and soundness. The simulation study shows that OSREs perform well in finite samples. Finally, a microarray data example is presented to demonstrate the performance of the proposed method.

In typhoon research we may encounter a special class of high-dimensional data, namely functional data. This kind of data is mainly recorded as images or curves. However, traditional functional data analysis cannot capture the location information of a tropical cyclone. Therefore, we propose a new model in which the coefficients vary with the location of the tropical cyclone instead of with time. This makes our method better suited to the actual situation. The method can also be applied to some complex systems to explore their inner mechanisms. The asymptotic properties are presented and verified by simulation studies. Lastly, we apply our method to wind speeds observed at different stations as a tropical cyclone arrives and obtain good prediction results.
There exists a special class of hypothesis tests called the exclusive hypothesis test (EHT). A hypothesis test is an EHT if the null hypothesis can be divided into a set of mutually exclusive sub-hypotheses. This idea originated in the study of genetic pleiotropy, and we extend it to typhoon data. Because the traditional p-value is not suitable for this kind of problem, we propose a weighted p-value that can be used in the same way as a traditional p-value in hypothesis testing. To determine the corresponding weights, we develop two methods, one likelihood-based and the other BIC-based. Furthermore, we show that the BIC-based method can control the asymptotic type I error. We conduct an extensive simulation study of the two proposed methods, which suggests that they work well in practice. Our proposed methodology is then applied to a set of data concerning tropical storms.

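Total number of references:

 101

Holding location:

 Library thesis reading area (main library, south wing, 3rd floor, zones B and C)

Call number:

 博0714Z2/21003

Release date: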

 2022-06-24    
