Thesis information

Chinese title:

 大规模多重检验的重叠系数比和拓展局部FDR方法及其应用    

Name:

 余韵涵    

Confidentiality level:

 Public

Thesis language:

 Chinese

Discipline code:

 071201    

Discipline:

 Statistics

Student type:

 Bachelor's

Degree:

 Bachelor of Science

Degree year:

 2022    

University:

 Beijing Normal University

Campus:

 Beijing campus

School:

 School of Statistics ; Institute of National Accounts

First supervisor's name:

 李高荣    

First supervisor's affiliation:

 School of Statistics, Beijing Normal University

Submission date:

 2022-06-18    

Defense date:

 2022-05-06    

Foreign-language title:

 Overlap coefficient ratios and extended local FDR methods for large-scale multiple tests and their applications    

Chinese keywords:

 Large-scale multiple testing ; Compound decision theory ; FDR ; Local FDR ; Overlap coefficient ratio

Foreign-language keywords:

 Large-scale multiple tests ; Compound decision theory ; FDR ; Overlap coefficient ratio ; Extended local FDR    

Chinese abstract:

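In recent years, the rapid development of machine learning, deep learning, neural networks, and related techniques has given researchers more tools for exploring problems in bioinformatics and computational chemistry. This has given rise to a class of problems in these fields: how to perform discriminant analysis and classification on the scores predicted by deep learning and similar models, or on statistical indicators derived from spectral data.

To address such problems, this thesis proposes two large-scale multiple testing methods, based respectively on the overlap coefficient ratio (OVLR) and the extended local FDR (Lfdr-E). On the theoretical side, both methods incorporate the variance information of the study units and use testing procedures from compound decision theory, establishing a unified framework for inference on such problems from a statistical perspective. On the applied side, both the simulation studies and the real-data analysis show that, while controlling the FDR level, the two methods effectively reduce information loss and improve the power of multiple testing. The proposed methods are not limited to the studies discussed in the thesis and can be applied quite broadly.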

Foreign-language abstract:

In recent years, the boom in techniques such as machine learning, deep learning, and neural networks has given the academic community more tools for exploring problems in bioinformatics and computational chemistry. This has led to a particular class of problems in these fields: how to perform discriminant analysis and classification of statistical indicators and predicted scores obtained from the relevant research.

To address such problems, this thesis puts forward two large-scale multiple testing methods, based respectively on the overlap coefficient ratio (OVLR) and the extended local FDR (Lfdr-E). In terms of theoretical innovation, the two proposed methods incorporate information on the variance of the study units and use the testing procedure of compound decision theory to establish a unified statistical inference framework for such problems. In terms of practical effectiveness, both the simulation study and the real-data analysis show that the two methods reduce the loss of information and improve the power of multiple testing while controlling the FDR level. The methods proposed in this thesis are applicable not only to the studies it discusses but also quite generally.
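For orientation only, the kind of decision rule that local FDR methods build on can be sketched as follows. This is the generic local FDR step-up rule under a two-group normal mixture, not the OVLR or Lfdr-E procedure of the thesis, whose details this record does not give. The function names, the known-parameter mixture (`pi0`, `mu1`, `sigma1`), and the example z-scores are all assumptions made for illustration.

```python
import math


def normal_pdf(x, mu=0.0, sigma=1.0):
    """Density of N(mu, sigma^2) at x."""
    z = (x - mu) / sigma
    return math.exp(-0.5 * z * z) / (sigma * math.sqrt(2.0 * math.pi))


def local_fdr(z, pi0, mu1, sigma1):
    """Posterior null probability under an assumed two-group model:
    null N(0, 1) with weight pi0, alternative N(mu1, sigma1^2) otherwise."""
    null_part = pi0 * normal_pdf(z)
    alt_part = (1.0 - pi0) * normal_pdf(z, mu1, sigma1)
    return null_part / (null_part + alt_part)


def lfdr_step_up(lfdr_values, alpha):
    """Reject the k hypotheses with the smallest local FDR values, where k is
    the largest rank whose running mean of sorted values is at most alpha."""
    order = sorted(range(len(lfdr_values)), key=lambda i: lfdr_values[i])
    running_sum = 0.0
    k = 0
    for rank, i in enumerate(order, start=1):
        running_sum += lfdr_values[i]
        if running_sum / rank <= alpha:
            k = rank
    return sorted(order[:k])  # indices of rejected hypotheses


# Hypothetical z-scores; mixture parameters are assumed known here.
z_scores = [0.1, -0.3, 4.0, 3.5, 0.5, 5.0]
lfdr_values = [local_fdr(z, pi0=0.8, mu1=3.0, sigma1=1.0) for z in z_scores]
rejected = lfdr_step_up(lfdr_values, alpha=0.05)
```

In practice the mixture parameters would be estimated from the data rather than fixed, and a method that uses per-unit variance information, as the abstract describes, would replace the single shared densities assumed here.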

Total number of references:

 34    

Total number of figures:

 8    

Total number of tables:

 6    

Call number:

 本071201/22002    

Open-access date:

 2023-06-18    
