ZHAO Pengya, FU Xiangling, et al. Collective classification method based on label propagation for fraud detection[J]. Journal of Shenzhen University Science and Engineering, 2020, 37(5): 482-489. [doi:10.3724/SP.J.1249.2020.05482]





Collective classification method based on label propagation for fraud detection
ZHAO Pengya (1, 2), FU Xiangling (1, 2), WU Weiqiang (2, 3), LI Da (2, 3), and GAO Songfeng (2, 3)
1) School of Computer Science (National Pilot Software Engineering School), Key Laboratory of Trustworthy Distributed Computing and Service (BUPT), Ministry of Education, Beijing University of Posts and Telecommunications, Beijing 100876, P.R. China
2) BUPT and Huarong Joint Lab of Smart Finance, Beijing 100876, P.R. China
3) Huarong Rongtong (Beijing) Technology Co., Ltd., Beijing 100033, P.R. China
Keywords: computer software; fraud detection; collective classification; online lending; label propagation; machine learning
In online lending, the key problem in fraud detection is deciding whether a user is a fraudster or a normal user from that user's collected historical transaction data. Representative existing methods treat each user as an independent node and ignore the relationships among users. Fraud, however, is increasingly a group behavior: in social networks, links between fraud nodes and non-fraud nodes are sparse, whereas fraud nodes are closely connected to one another. We therefore propose a collective classification method for fraud detection based on label propagation. We construct a user association network from the phone call records between users of an online lending company, and use the label propagation algorithm to spread the label information of fraud nodes and so determine whether an unlabeled node is a fraudulent user. In addition, we improve the initialization of the transition probability matrix in the label propagation algorithm by raising the weights to a power, which avoids the performance degradation that the unbalanced distribution of fraud data would otherwise cause. Finally, we validate the method on a real loan data set with a very low proportion of labeled samples and an unbalanced training sample distribution. With the proposed method, the accuracy rate of fraudulent user detection reaches 17%, and both the F1 value and the accuracy rate are better than those of the classic wvRN algorithm.
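The propagation scheme summarized in the abstract can be illustrated with a minimal sketch. It is not the authors' implementation: the abstract does not give the exact formulas, so the code assumes a row-normalized transition matrix whose entries are call weights raised to a hypothetical exponent `alpha`, a single fraud score per node, and seed labels that are clamped on every iteration. The names `transition_matrix`, `propagate_labels`, `alpha`, `calls`, and `seeds` are all illustrative.

```python
def transition_matrix(weights, alpha=1.0):
    """Row-normalize edge weights after raising them to the power `alpha`.

    `alpha` is an assumed stand-in for the "weights powering" step;
    the exact form used in the paper is not stated in the abstract.
    """
    T = {}
    for i, nbrs in weights.items():
        powered = {j: w ** alpha for j, w in nbrs.items() if w > 0}
        total = sum(powered.values())
        T[i] = {j: p / total for j, p in powered.items()} if total else {}
    return T


def propagate_labels(weights, seeds, alpha=1.0, max_iter=100, tol=1e-6):
    """Spread fraud scores (1.0 = fraud, 0.0 = normal) from labeled seeds.

    `weights` must be symmetric: every neighbor is also a key.
    Unlabeled nodes start at 0.5 and take the weighted average of
    their neighbors' scores; seed nodes are reset to their label.
    """
    T = transition_matrix(weights, alpha)
    score = {n: float(seeds.get(n, 0.5)) for n in weights}
    for _ in range(max_iter):
        new = {}
        for i in weights:
            if i in seeds:
                new[i] = float(seeds[i])      # clamp known labels
            elif T[i]:
                new[i] = sum(p * score[j] for j, p in T[i].items())
            else:
                new[i] = score[i]             # isolated node: unchanged
        delta = max(abs(new[n] - score[n]) for n in weights)
        score = new
        if delta < tol:
            break
    return score


# Toy call network: edge weight = number of calls between two users.
calls = {
    "f1": {"u1": 9},
    "n1": {"u1": 1, "u2": 4},
    "u1": {"f1": 9, "n1": 1, "u2": 1},
    "u2": {"n1": 4, "u1": 1},
}
seeds = {"f1": 1, "n1": 0}  # one known fraudster, one known normal user

scores = propagate_labels(calls, seeds, alpha=1.0)
flat_scores = propagate_labels(calls, seeds, alpha=0.5)
```

In this toy graph, `u1` calls the known fraudster far more often than the known normal user and ends with a score above 0.5, while `u2` ends below 0.5. An exponent below 1 compresses the differences between heavy and light edges, so `flat_scores["u1"]` is lower than `scores["u1"]`. Whether the paper uses an exponent above or below 1, and how it thresholds scores, is not stated in the abstract.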




Foundation: National Natural Science Foundation of China (91546121); National Social Science Foundation of China (16ZDA055)
Corresponding author: Professor FU Xiangling. E-mail: fuxiangling@bupt.edu.cn
Citation: ZHAO Pengya, FU Xiangling, WU Weiqiang, et al. Collective classification method based on label propagation for fraud detection[J]. Journal of Shenzhen University Science and Engineering, 2020, 37(5): 482-489. (in Chinese)
Last Update: 2020-07-26