[1] WU Xiaolin, CAO Fuyuan. An unsupervised outlier detection algorithm for categorical matrix-object data[J]. Journal of Shenzhen University Science and Engineering, 2019, 36(1): 33-42. [doi:10.3724/SP.J.1249.2019.01033]

An unsupervised outlier detection algorithm for categorical matrix-object data

Journal of Shenzhen University Science and Engineering [ISSN: 1000-2618 / CN: 44-1401/N]

Volume:
36
Issue:
No. 1, 2019
Pages:
33-42
Section:
Electronics and Information Science
Publication date:
2019-01-20

Article Info

Title:
An unsupervised outlier detection algorithm for categorical matrix-object data
Author(s):
WU Xiaolin 1,2 and CAO Fuyuan 1,2
1) School of Computer and Information Technology, Shanxi University, Taiyuan 030006, Shanxi Province, P.R. China; 2) Key Laboratory of Computational Intelligence and Chinese Information Processing of Ministry of Education, Shanxi University, Taiyuan 030006, Shanxi Province, P.R. China
Keywords:
artificial intelligence; outlier detection; categorical matrix-object data; coupling degree; cohesion degree; data mining
CLC number:
TP 311
DOI:
10.3724/SP.J.1249.2019.01033
Abstract:
Outlier detection is an important branch of data mining that aims to discover objects in a data set whose behavior differs significantly from that of most other objects. For categorical matrix-object data, this paper defines the cohesion degree of a matrix-object itself and its coupling degree with other matrix-objects, uses both to define the outlier factor of a matrix-object, and proposes an outlier detection algorithm for categorical matrix-object data. Experimental results on the real data sets Market basket, Microsoft web, and MovieLens show that, compared with the common-neighbor-based (CNB), local outlier factor (LOF), and information entropy-based (IE-based) algorithms, the proposed algorithm effectively detects outliers in categorical matrix-object data.
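The abstract names cohesion, coupling, and an outlier factor but does not give their formulas. The following is only an illustrative sketch of how such a score could be assembled, not the authors' definitions. It assumes that a matrix-object is a list of categorical rows (for example, one customer's several transactions), that similarity between rows is simple matching, and that the score combines the two degrees by a plain average. The names `row_similarity`, `cohesion`, `coupling`, and `outlier_scores` are hypothetical.

```python
from itertools import combinations


def row_similarity(r1, r2):
    """Simple matching: fraction of attributes on which two rows agree."""
    return sum(a == b for a, b in zip(r1, r2)) / len(r1)


def cohesion(obj):
    """ASSUMED definition: mean pairwise similarity of rows inside one object."""
    pairs = list(combinations(obj, 2))
    if not pairs:  # a single-row object is treated as fully cohesive
        return 1.0
    return sum(row_similarity(a, b) for a, b in pairs) / len(pairs)


def coupling(obj, other):
    """ASSUMED definition: mean similarity over all cross pairs of rows."""
    total = sum(row_similarity(a, b) for a in obj for b in other)
    return total / (len(obj) * len(other))


def outlier_scores(objects):
    """ASSUMED score: 1 - average of cohesion and mean coupling to the others.

    Higher values mark objects that are internally inconsistent and unlike
    the rest of the data set. Requires at least two objects.
    """
    scores = []
    for i, obj in enumerate(objects):
        others = [o for j, o in enumerate(objects) if j != i]
        mean_coupling = sum(coupling(obj, o) for o in others) / len(others)
        scores.append(1.0 - 0.5 * (cohesion(obj) + mean_coupling))
    return scores


# Three toy matrix-objects, each with two rows over two categorical attributes.
objects = [
    [("a", "x"), ("a", "x")],
    [("a", "x"), ("a", "y")],
    [("b", "z"), ("c", "w")],
]
scores = outlier_scores(objects)
ranked = sorted(range(len(objects)), key=lambda i: scores[i], reverse=True)
print(ranked[0], scores)
```

In this toy data the third object shares no attribute values with the others or between its own rows, so it receives the highest score. Any real use would replace the assumed definitions with those given in the paper.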


Last Update: 2019-01-30