[1] LIU Runqi, HE Xingshi, NAN Yifei, et al. Mining method of public opinion related topic in network multimedia data[J]. Journal of Shenzhen University Science and Engineering, 2020, 37(1): 72-78. [doi:10.3724/SP.J.1249.2020.01072]

Mining method of public opinion related topic in network multimedia data

Journal of Shenzhen University Science and Engineering [ISSN: 1000-2618 / CN: 44-1401/N]

Volume:
37
Issue:
No. 1, 2020
Pages:
72-78
Section:
Electronics and Information Science
Publication date:
2020-01-08

Article Info

Title:
Mining method of public opinion related topic in network multimedia data
Article ID:
202001011
Author(s):
LIU Runqi1, HE Xingshi1, NAN Yifei2, and WANG Bo2
1) School of Science, Xi’an Polytechnic University, Xi’an 710048, Shaanxi Province, P. R. China
2) Ministry of Education Key Lab for Intelligent Networks and Network Security, Xi’an Jiaotong University, Xi’an 710049, Shaanxi Province, P. R. China
Keywords:
pattern recognition; image processing; microblog; Sina Weibo; multimedia data; text detection; text extraction; subject recognition; public opinion supervision
CLC number:
TP393.3
DOI:
10.3724/SP.J.1249.2020.01072
Document code:
A
Abstract:
Social media has become a platform for the rapid propagation of rumors. More and more users express their opinions through pictures and videos to evade text-based detection, which greatly reduces the efficiency of online public opinion monitoring. To address this problem, this paper studies how to extract the text contained in images and video screenshots and, on that basis, how to detect the topics related to public opinion events. First, three typical events on Sina Weibo are selected as research targets, and a web crawler is designed to collect the text, image, and video data of each event. Second, a text detection algorithm based on the connectionist text proposal network (CTPN) is used for text localization, and a method combining DenseNet with connectionist temporal classification (CTC) is used for text extraction. Finally, a method combining multi-granularity latent Dirichlet allocation (MG-LDA) with jieba word segmentation is proposed to identify the related topics from the extracted text. The experimental results show that the proposed method can accurately extract text of different formats, resolutions, colors, positions, and angles from multimedia data, providing data support for accurately tracking how public opinion evolves.
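The abstract names connectionist temporal classification (CTC) as the step that turns per-frame recognition scores into text. As a rough illustration only, the sketch below shows greedy (best-path) CTC decoding in plain Python. The abstract does not say which decoding strategy the authors used, so the function name `ctc_greedy_decode`, the toy label set, and the convention that index 0 is the blank label are all assumptions made for this example.

```python
from typing import List, Sequence

BLANK = 0  # assumption: index 0 is reserved for the CTC blank label


def ctc_greedy_decode(scores: Sequence[Sequence[float]],
                      labels: Sequence[str]) -> str:
    """Greedy CTC decoding: pick the best label per frame,
    merge consecutive repeats, then drop blanks."""
    decoded: List[str] = []
    previous = BLANK
    for frame in scores:
        # Index of the highest-scoring label in this frame.
        best = max(range(len(frame)), key=lambda k: frame[k])
        # Emit only when the label is not blank and differs from the
        # previous frame's label (repeats are collapsed).
        if best != BLANK and best != previous:
            decoded.append(labels[best])
        previous = best
    return "".join(decoded)


# Hypothetical toy input: 7 frames over the labels [blank, "a", "b"].
toy_labels = ["", "a", "b"]
toy_scores = [
    [0.1, 0.8, 0.1],    # a
    [0.2, 0.7, 0.1],    # a (repeat, merged)
    [0.9, 0.05, 0.05],  # blank (separates the two a's)
    [0.1, 0.6, 0.3],    # a
    [0.1, 0.2, 0.7],    # b
    [0.0, 0.3, 0.7],    # b (repeat, merged)
    [0.8, 0.1, 0.1],    # blank
]
print(ctc_greedy_decode(toy_scores, toy_labels))  # prints "aab"
```

The blank frame between the two runs of "a" is what lets CTC output a doubled character; without it the repeats would collapse into a single "a".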

Memo:
Received:2019-04-06;Accepted:2019-06-17
Foundation:Natural Science Basic Research Program of Shaanxi Province (2014JM1006); Natural Soft Science Research Program of Shaanxi Province (2014KRM2801); Key R & D Program Projects of Shaanxi Province (2018KW-021)
Corresponding author:Professor HE Xingshi. Email: xsh1002@126.com
Citation:LIU Runqi, HE Xingshi, NAN Yifei, et al.Mining method of public opinion related topic in network multimedia data [J]. Journal of Shenzhen University Science and Engineering, 2020, 37(1): 72-78.(in Chinese)
About the author: LIU Runqi (1994-), master's student at Xi’an Polytechnic University. Research interest: methods for mining topics related to online public opinion events. E-mail: rliu164103@163.com
Last Update: 2020-01-30