An active semi-supervised structure exploring algorithm for large networks

CHAI Bianfang1, CAO Xinyu1, WEI Chunli1, and WANG Jianling2

1) School of Information Engineering, Hebei GEO University, Shijiazhuang 050031, Hebei Province, P.R. China; 2) Office of Educational Administration, Hebei University of Chinese Medicine, Shijiazhuang 050200, Hebei Province, P.R. China

computer application; large networks; semi-supervised clustering; active learning; online variational expectation maximization (onlineVEM) algorithm; pairwise constraints

DOI: 10.3724/SP.J.1249.2020.03243


The online variational expectation maximization (onlineVEM) algorithm can quickly discover the clustering patterns of large networks, but its results become unstable and less accurate when the network structure is complex. To identify clustering patterns faster and more accurately, an active semi-supervised online variational expectation maximization (ASonlineVEM) algorithm is proposed. The algorithm first automatically selects representative nodes, determines the number of clusters, and initializes the model from the representative nodes. It then iterates over three tasks: running the online algorithm onlineVEM, actively selecting nodes, and updating the model, until it reaches a preset accuracy threshold or converges. Experiments on artificial and real networks with different structures show that ASonlineVEM is more accurate and more efficient than comparable algorithms. By exploiting prior information from the actively selected nodes, ASonlineVEM improves the stability and accuracy of clustering pattern discovery in networks and improves the running efficiency of the online algorithm.
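The three-task iteration described in the abstract can be illustrated with a minimal control-flow sketch. It is not the ASonlineVEM implementation: the abstract does not say how nodes are selected or how onlineVEM is run, so everything below is an assumption made for illustration. The hypothetical `toy_fit` is a simple neighbor-vote stand-in for one onlineVEM run, nodes are chosen by least-confident posterior, the hypothetical `oracle` supplies ground-truth labels, the stopping rule is a change in mean confidence below `tol`, and the six-node graph `ADJ` is invented toy data.

```python
from typing import Callable, Dict, List, Sequence, Tuple

Posterior = Dict[int, Dict[int, float]]

# Toy graph (assumption): triangles {0,1,2} and {3,4,5} joined by edge 2-3.
ADJ: Dict[int, List[int]] = {
    0: [1, 2], 1: [0, 2], 2: [0, 1, 3],
    3: [2, 4, 5], 4: [3, 5], 5: [3, 4],
}
TRUTH: Dict[int, int] = {0: 0, 1: 0, 2: 0, 3: 1, 4: 1, 5: 1}


def toy_fit(known: Dict[int, int], n_clusters: int = 2) -> Posterior:
    """Stand-in for one clustering run (NOT onlineVEM): smoothed neighbor vote."""
    posterior: Posterior = {}
    for v, nbrs in ADJ.items():
        if v in known:
            # Labeled nodes keep their label with probability 1.
            posterior[v] = {k: float(k == known[v]) for k in range(n_clusters)}
            continue
        counts = [1.0] * n_clusters  # Laplace smoothing
        for u in nbrs:
            if u in known:
                counts[known[u]] += 1.0
        total = sum(counts)
        posterior[v] = {k: counts[k] / total for k in range(n_clusters)}
    return posterior


def active_semi_supervised_loop(
    nodes: Sequence[int],
    seed_labels: Dict[int, int],
    fit: Callable[[Dict[int, int]], Posterior],
    oracle: Callable[[int], int],
    batch_size: int = 1,
    max_rounds: int = 10,
    tol: float = 1e-3,
) -> Dict[int, int]:
    """Alternate model fitting, active node selection, and label update."""
    known = dict(seed_labels)
    assignment: Dict[int, int] = {}
    prev_conf = None
    for _ in range(max_rounds):
        # Task 1: run the clustering step with the labels known so far.
        posterior = fit(known)
        assignment = {v: max(p, key=p.get) for v, p in posterior.items()}
        conf = sum(max(p.values()) for p in posterior.values()) / len(posterior)
        if prev_conf is not None and abs(conf - prev_conf) < tol:
            break  # assumed convergence test: mean confidence stopped changing
        prev_conf = conf
        # Task 2: actively select the least confident unlabeled nodes.
        candidates = [v for v in nodes if v not in known]
        if not candidates:
            break
        candidates.sort(key=lambda v: max(posterior[v].values()))
        # Task 3: query the oracle and update the labeled set.
        for v in candidates[:batch_size]:
            known[v] = oracle(v)
    assignment.update(known)  # queried labels override model guesses
    return assignment


def run_demo() -> Tuple[Dict[int, int], List[int]]:
    """Run the loop on the toy graph and record which nodes were queried."""
    queried: List[int] = []

    def oracle(v: int) -> int:
        queried.append(v)
        return TRUTH[v]

    result = active_semi_supervised_loop(
        list(ADJ), {0: 0, 5: 1}, toy_fit, oracle, batch_size=1, max_rounds=3
    )
    return result, queried
```

Passing `fit` and `oracle` as callables keeps the loop independent of the clustering model, so the neighbor-vote stand-in could in principle be replaced by an actual online inference step without changing the control flow.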
