[1] ZHAO Pengya, FU Xiangling, WU Weiqiang, et al. Collective classification method based on label propagation for fraud detection[J]. Journal of Shenzhen University Science and Engineering, 2020, 37(5): 482-489. [doi:10.3724/SP.J.1249.2020.05482]

Collective classification method based on label propagation for fraud detection

Journal of Shenzhen University Science and Engineering [ISSN: 1000-2618 / CN: 44-1401/N]

Volume:
Vol. 37
Issue:
No. 5, 2020
Pages:
482-489
Section:
Electronics and Information Science
Publication date:
2020-09-15

Article Info

Title:
Collective classification method based on label propagation for fraud detection
Article ID:
202005005
Author(s):
ZHAO Pengya (赵朋亚) 1,2; FU Xiangling (傅湘玲) 1,2; WU Weiqiang (仵伟强) 2,3; LI Da (李达) 2,3; GAO Songfeng (高嵩峰) 2,3
1) School of Computer Science (National Pilot Software Engineering School), Key Laboratory of Trustworthy Distributed Computing and Service (BUPT), Ministry of Education, Beijing University of Posts and Telecommunications, Beijing 100876, P.R. China
2) BUPT and Huarong Joint Lab of Smart Finance, Beijing 100876, P.R. China
3) Huarong Rongtong (Beijing) Technology Co., Ltd., Beijing 100033, P.R. China
Keywords:
computer software; fraud detection; collective classification; online lending; label propagation; machine learning
CLC number:
TP311; TP312
DOI:
10.3724/SP.J.1249.2020.05482
Document code:
A
Abstract:
In online lending, fraud detection means deciding whether a user is a fraudster or a normal user from collected information such as the user's historical transaction data. Existing methods treat each user as an independent instance and ignore the relational information among users. Fraud, however, is increasingly a group behavior: within a fraud network, links between fraud nodes and non-fraud nodes are sparse, whereas links among fraud nodes are dense. Motivated by this observation, we propose a collective classification fraud detection method based on label propagation. From the call records of users of a real online lending company, we construct a call association network among users, and we use the label propagation algorithm to spread the label information of fraud nodes and determine whether each unlabeled node is a fraudulent user. We also improve the initialization of the transition probability matrix in the label propagation algorithm by raising the weights to a power, so that the algorithm adapts to the imbalance between positive and negative samples in fraud scenarios. Seven tests were conducted on a real lending data set with a very low proportion of labeled samples and an unbalanced training sample distribution. The proposed algorithm detects fraudulent users with a precision of up to 17%, and both its F1 value and its precision are better than those of the classic WvRn algorithm.
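The propagation step described in the abstract can be illustrated with a minimal sketch. It is not the authors' implementation: the abstract does not give the exact form of the power operation or of the transition matrix, so the code assumes a standard clamped label propagation scheme in which each weight is raised element-wise to a hypothetical exponent `power` (assumed positive) before row normalization. The function names, the label encoding (1 = fraud, 0 = normal, -1 = unknown), and the toy weights are all illustrative assumptions.

```python
import numpy as np


def transition_matrix(weights, power=1.0):
    """Raise each weight to `power` (assumed > 0), then row-normalize."""
    w = np.asarray(weights, dtype=float) ** power
    row_sums = w.sum(axis=1, keepdims=True)
    row_sums[row_sums == 0.0] = 1.0  # isolated nodes keep an all-zero row
    return w / row_sums


def label_propagation(weights, labels, power=1.0, max_iter=1000, tol=1e-9):
    """Return a fraud score in [0, 1] for every node.

    labels: 1 = known fraud, 0 = known normal, -1 = unknown (assumed encoding).
    """
    labels = np.asarray(labels)
    known = labels >= 0
    t = transition_matrix(weights, power)
    # Unknown nodes start from a neutral score of 0.5.
    scores = np.where(known, labels, 0.5).astype(float)
    for _ in range(max_iter):
        updated = t @ scores              # each node averages its neighbors
        updated[known] = labels[known]    # clamp the labeled nodes
        converged = np.max(np.abs(updated - scores)) < tol
        scores = updated
        if converged:
            break
    return scores


# Toy call network: node 0 is a known fraudster, node 2 a known normal user,
# node 1 is unlabeled and has a heavier call weight toward node 0.
W = np.array([[0.0, 4.0, 0.0],
              [4.0, 0.0, 1.0],
              [0.0, 1.0, 0.0]])
y = np.array([1, -1, 0])

print(label_propagation(W, y, power=1.0)[1])  # 4/5 = 0.8
print(label_propagation(W, y, power=2.0)[1])  # 16/17, about 0.941
```

In this toy graph a larger exponent shifts more probability mass toward the heavier edge, which shows how a power transform reshapes the transition probabilities. Which exponent suits an imbalanced fraud data set is an empirical question that this sketch does not answer.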

Memo:
Received: 2019-04-28; Revised: 2019-06-04; Accepted: 2019-06-06
Foundation: National Natural Science Foundation of China (91546121); National Social Science Foundation of China (16ZDA055)
Corresponding author: Professor FU Xiangling. E-mail: fuxiangling@bupt.edu.cn
Citation: ZHAO Pengya, FU Xiangling, WU Weiqiang, et al. Collective classification method based on label propagation for fraud detection[J]. Journal of Shenzhen University Science and Engineering, 2020, 37(5): 482-489. (in Chinese)
About the author: ZHAO Pengya (born 1995), master's student at Beijing University of Posts and Telecommunications. Research interests: network representation learning, fraud detection. E-mail: zpy1101936864@bupt.edu.cn
Last Update: 2020-07-26