- 无标题文档
查看论文信息

中文题名:

 复杂网络中社团结构最优分类数研究    

姓名:

 李志芳    

学科代码:

 081103    

学科专业:

 系统工程    

学生类型:

 硕士    

学位:

 工学硕士    

学位年度:

 2011    

校区:

 北京校区培养    

学院:

 管理学院    

研究方向:

 复杂网络    

第一导师姓名:

 樊瑛    

第一导师单位:

 北京师范大学管理学院    

提交日期:

 2011-06-14    

答辩日期:

 2011-05-20    

外文题名:

 Detecting the Optimal Number of Communities in Complex Networks    

中文摘要:
社团结构是复杂网络的一个重要特征,同时也是复杂网络研究的热点问题。社团结构是指网络中的顶点可以分成若干组,组内顶点之间连接比较稠密,而组间顶点之间连接比较稀疏。寻找网络社团结构并对其进行分析是了解现实生活中各种网络组织结构的一种重要方法,已经在社会学、计算机科学以及生物学等领域有了广泛的应用。关于复杂网络社团结构的研究主要关注两方面:一是寻找有效划分社团结构的方法;二是研究社团结构的显著性等特征。目前在寻找网络社团结构划分算法方面已经取得了丰硕的成果。许多探索性的算法都能够比较有效地划分出复杂网络的社团结构。然而,通过这些算法却很难直接获得社团结构划分的最优分类数。探索复杂网络社团结构的最优分类数应当是社团分类算法所追求的一个目标,它也可以反过来指导和评价社团分类算法,同时,它还可以帮助我们分析网络社团结构的特征。本文在对已有社团结构研究成果进行简要概述的基础上,重点研究复杂网络社团结构划分的最优分类数问题,提出了一种相对简单有效的探求社团结构最优分类数的方法。这种方法主要基于信息熵以及已有的社团划分方法。对于一个给定的网络,我们首先选取一种已有的算法对其进行社团划分,并得到一系列划分结果;其次,利用信息熵公式来衡量算法的精确度;最后,提出一个指标 ,通过计算 并且分析计算结果的特征,我们能够得到该网络社团结构的最优分类数。在经典人造网络和实际网络的应用结果表明,该方法能够有效地找到社团结构的最优分类数,尤其是在网络社团数量远远小于网络中顶点数量的情况下。在探究社团最优分类数的过程中我们还发现了一个有趣的现象:对于大多数复杂网络社团结构算法来说,大社团往往比小社团更容易被识别。本方法的一个突出优点是不需要事先知道网络的真实社团结构;另一优点是该方法应用广泛且简便,几乎可以应用于所有随机性算法基础之上,且不会增加算法的计算复杂度。本论文还对社团结构显著性问题进行了讨论。简单来说,网络社团结构的显著性就是指网络中的社团结构是否明显。已有的对社团结构显著性的研究大多基于模块化函数之上。认为模块化函数值大的社团划分具有更显著的社团结构。然而,很多研究已经证明模块化函数存在一些不足。本文在探索社团最优分类数方法的基础上对社团结构显著性进行了一些探索性的研究。
外文摘要:
Community structure is an important property of complex networks and much attention has been focused on this issue. Communities or modules mean high concentrations of edges within special groups of vertices, and low concentrations between these groups. Detecting and analyzing the community structure of complex networks is an important method to understand the internal structure of real networks. This method has been widely used in sociology, computer science and biology etc. The general aim of community detection is to find meaningful divisions into groups by investigating the structural properties of the whole graph. There are two major aspects about this problem. One concerns with proposing effective algorithms for detecting the communities, and the other concerns with the significance or robustness of the obtained divisions. For the first aspect, many efficient heuristic methods have been proposed over the years to detect the communities in networks, in particular those based on spectral methods, divisive algorithms, modularity-based methods, dynamic algorithms, and many others. However, most existing algorithms are not able to get the optimal number directly. Communities are just the final product of the algorithm. Detecting the optimal number of communities is important for every community detection algorithms. It can help us to detecting and analyzing the community structure.In this paper, some former results are reviewed at first. But our research focuses on detecting the optimal number of communities. Based on the normalized mutual information index, which has been used as a measure for similarity of communities, an effective and simple method was proposed. For a given network, we first use an existing algorithm to induce a sequence of divisions into communities. Second, we measure the precision of the algorithm based on “information entropy”. At last, we propose an index to find the optimal number of the communities. Numerical and empirical results show that the index is effective in both artificial and real world networks, especially when the number of communities is much smaller than the number of vertices. We have found a very interesting phenomenon that large community structures are more likely to be identified by the algorithms. A point we should mention is that our method needn’t to know the “real” community structures in advance. Moreover, the method can be applied to almost all the algorithms with random characteristics to help them find the optimal community number. It is relatively simple and will not increase the complexity of the algorithms, which just repeat the algorithm for several times.Significance of community structure is another problem we focused in this paper. Most existing researches on this problem are based on modularity function, which consider that high modularity function means high significance. However, several researches have shown that the function has limitations. In this paper, based on the method for detecting the optimal community number, we have done some research on the significance of community structure.
参考文献总数:

 61    

作者简介:

 本人李志芳,就读于北京师范大学管理学院系统科学系系统工程专业。硕士研究生阶段曾参与过国家自然科学基金面上项目等科研课题,并在Physica A发表英文学术论文一篇(SCI收录)。    

馆藏号:

 硕081103/1101    

开放日期:

 2011-06-14    

无标题文档

   建议浏览器: 谷歌 360请用极速模式,双核浏览器请用极速模式