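Chinese title: | 双模态网络的协变量降维与社区发现 (Covariate Dimension Reduction and Community Detection for Bimodal Networks) |
Name: | |
Confidentiality level: | Public |
Thesis language: | chi |
Discipline code: | 071400 |
Discipline/major: | |
Student type: | Master's |
Degree: | Master of Science |
Degree type: | |
Degree year: | 2023 |
Campus: | |
School: | |
Research area: | Statistical theory and applications |
First supervisor name: | |
First supervisor affiliation: | |
Submission date: | 2023-06-19 |
Defense date: | 2023-05-12 |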
Foreign-language title: | DIMENSIONAL REDUCTION OF COVARIATES AND COMMUNITY DETECTION FOR BIMODAL NETWORK |
Chinese keywords: | |
Foreign-language keywords: | Bimodal network ; Dimensional reduction ; Community detection ; Stochastic block model ; Recommendation system |
Chinese abstract: |
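With the rapid development of modern science and technology, ever more data accumulate in social production and everyday life. Network-structured data are an important way of storing data and play an important role in the Internet and other technology fields, and their form has grown steadily more complex as networks have developed. Network data describe the connections between nodes and are studied in many scientific fields, for example genetic networks in biology, social networks in sociology, and recommendation systems on the Internet. Extracting information from network-structured data accurately and efficiently is a major challenge. This thesis studies a type of network that appears relatively rarely in current statistical research on network data, namely the bimodal network. The network studied here has three properties: first, structurally, the bimodal network has two different types of nodes, and edges occur only between nodes of different types; second, the network carries covariate information for each node; third, each mode has its own internal community structure. Starting from the dimension reduction problem for bimodal networks, this thesis finds a projection matrix for the node data of each mode, maps the nodes of the two modes into the same low-dimensional latent space, and, by defining a distance function, constructs a model that can be solved by SVD. The data are reduced in dimension by jointly considering the network connections between the two modes and the covariate information carried by the nodes themselves, which gives an intuitive view of the network-structured data. In addition, this thesis extends the traditional stochastic block model for unimodal networks to the bimodal case; by introducing a matrix of connection probabilities between communities, it further analyzes, in theory, the relationship between the proposed method and traditional dimension reduction methods, and obtains the connections and differences between their results under different data structures. Finally, a series of simulation experiments examines several parameters that affect the method; the results show that the method achieves good classification performance when the bimodal network has community structure. The thesis also analyzes a user-item click-conversion dataset from an online platform and, after dimension reduction, identifies the community assignments of users and of items. |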
Foreign-language abstract: |
With the rapid advancement of modern science and technology, a growing amount of data is accumulated every day in social production and daily life. Network-structured data play a significant role in the Internet and other technical domains, and they grow more complex as network forms develop. Network data, which describe the connections between nodes, have been studied in a variety of scientific fields, including genetic networks in biology, social networks in sociology, and recommendation systems on the Internet. Extracting information from network-structured data reliably and efficiently is a significant challenge. In this study, we select a type of network, the bimodal network, that is less commonly seen in existing statistical investigations of network data. The network under study has three features: first, the bimodal network has two distinct types of nodes, and edges exist only between nodes of different types; second, the network also carries covariate information for the nodes; and third, there is a community structure inside each mode. The dimension reduction of bimodal networks is the starting point of this study. A projection matrix is found for each mode's node data, the nodes of the two modes are mapped into a single low-dimensional latent space, and, by defining a distance function, a model that can be solved by SVD is built. The standard stochastic block model for unimodal networks is also extended to the bimodal case, and a connection probability matrix between communities is introduced in order to analyze the theoretical properties of the method. 
The relationship between this method and traditional dimension reduction methods is then examined further in theory. We also identify the differences and similarities between the results of this network-supervised dimension reduction method and those of traditional dimension reduction methods under several different data structures. The study concludes with a series of simulation experiments that examine several parameters influencing the method. The findings show that the method achieves good classification performance when community structure is present in the bimodal network. In addition, this study analyzes a user-product click-conversion dataset from a web platform and, after dimension reduction, identifies the community assignments of users and of products. |
Total number of references: | 28 |
Call number: | 硕071400/23010 |
Release date: | 2024-06-18 |