View thesis information

Chinese title:

 Statistical Analysis and Application of a Single-Index Model with Network Data as Predictor Variables

Name:

 张家悦 (Zhang Jiayue)

Confidentiality level:

 Public

Thesis language:

 chi    

Discipline code:

 071201    

Discipline:

 Statistics

Student type:

 Bachelor's

Degree:

 Bachelor of Science

Degree year:

 2023    

Campus:

 Beijing campus

School:

 School of Statistics

First advisor name:

 李高荣 (Li Gaorong)

First advisor affiliation:

 School of Statistics

Submission date:

 2023-06-09    

Defense date:

 2023-05-05    

Foreign-language title:

 STATISTICAL ANALYSIS OF A SINGLE-INDEX MODEL USING NETWORK DATA AS PREDICTIVE VARIABLES

Chinese keywords:

 Single-index model ; Node ; Network data ; Sparsity ; Synergistic effect

Foreign-language keywords:

 Single-index model ; Node ; Network data ; Sparsity ; Synergistic effect    

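Chinese abstract:

Network data are an emerging form of unstructured data that has attracted great attention in the internet, finance, biology, and other sectors. As a form of high-dimensional data, network data can be modeled with a single-index model and used in regression analysis to uncover the logic behind the data, which is of considerable theoretical and practical value. Single-index models are a common tool for modeling high-dimensional data and are widely used in biomedicine, quantitative finance, and related research settings; they add flexibility while preserving model interpretability and avoid the "curse of dimensionality".
This thesis proposes a statistical analysis of a single-index model with network data as predictor variables. The model links the synergistic effects between network nodes to the response variable through an unknown link function and uses both the individual information in the synergistic effects and the overall network structure of the data to capture latent nonlinear relationships. By introducing sparsity constraints and a group Lasso estimator with overlapping groups, it selects the nodes, and the corresponding synergistic effects, that significantly affect the response variable. Simulation studies and real-data analysis validate the proposed method and algorithm and demonstrate the effectiveness of the proposed model.
Building on a review of existing research, this thesis approaches the problem from the perspective of synergistic effects between network nodes and applies a single-index model to regression analysis. It enriches the related theory, proposes a complete algorithm for uncovering the deeper logic of network data and unlocking its potential, and supports the practical application of high-dimensional data.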

Foreign-language abstract:

Network data is an emerging kind of unstructured data that has received great attention in the internet, finance, biology, and other industries. As a form of high-dimensional data, network data can be modeled with a single-index model and applied in regression analysis to explore the logic behind the data, which has important theoretical significance and application value. Single-index models are a common tool for modeling high-dimensional data and are widely used in biomedical and quantitative finance research to add flexibility while retaining model interpretability and avoiding the "curse of dimensionality".
In this paper, we propose a statistical analysis of a single-index model with network data as predictor variables. The model links the synergistic effects between network data nodes to the response variable through an unknown link function, and captures the latent nonlinear relationships in the data by using both the individual information of the synergistic effects and the overall network structure. By introducing sparsity constraints and a group Lasso estimation method with overlapping groups, we select the nodes and corresponding synergistic effects that significantly affect the response variable. The proposed method and algorithm are validated by simulation studies and real data analysis, which demonstrate the effectiveness of the proposed model.
Based on a summary of existing research results, this paper uses the single-index model for regression analysis from the perspective of synergistic effects among network data nodes. It further enriches the relevant theoretical research, proposes a complete set of algorithms to explore the deep logic of network data and further release its potential, and provides support for the application of high-dimensional data in practice.

Total number of references:

 36    

Call number:

 本071201/23026    

Public release date:

 2024-06-08    
