Li Xia, Li Fusheng, Chen Yuanqin. Saliency detection model based on human visual sensitivity and DCT coefficients[J]. Journal of Shenzhen University Science and Engineering, 2014, 31(5): 464-472. [doi:10.3724/SP.J.1249.2014.05464]





Saliency detection model based on human visual sensitivity and DCT coefficients
Li Xia1, Li Fusheng1, and Chen Yuanqin2
1) College of Information Engineering, Shenzhen University, Shenzhen Key Laboratory of Modern Communication and Information Processing, Shenzhen 518060, P.R.China
2) Information Engineering School, Nanchang University, Nanchang 330031, P.R.China
image processing; saliency detection; discrete cosine transform; spatial distance; human visual sensitivity; eye-tracking dataset; video coding
TP 391
A saliency detection model based on human visual sensitivity and the spatially weighted dissimilarity of discrete cosine transform (DCT) coefficients is proposed. The low-frequency DCT coefficients of each image patch serve as its feature vector, replacing basic features such as color and intensity. The saliency of each patch is computed as the sum of its spatially weighted feature dissimilarities with all other patches in the image, further weighted by human visual sensitivity. Comparative experiments against six typical saliency detection models on three eye-tracking datasets show that the proposed model outperforms all of the compared algorithms. In addition, applying the model to the new-generation high efficiency video coding (HEVC) standard also yields good results.
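The core idea in the abstract can be illustrated with a short sketch: extract the low-frequency DCT coefficients of each patch as its feature vector, accumulate spatially weighted feature dissimilarities against all other patches, then modulate by a center-biased visual-sensitivity weight. All parameter choices below (8×8 patches, a 3×3 low-frequency corner, the inverse-distance spatial weight, and a Gaussian center-bias as a stand-in for the paper's human visual sensitivity model) are illustrative assumptions, not the authors' implementation.

```python
import numpy as np
from scipy.fft import dctn

def dct_saliency(gray, patch=8, sigma=0.25):
    """Patch saliency from spatially weighted DCT-coefficient dissimilarity.

    gray  : 2-D float array (grayscale image)
    patch : patch size in pixels (assumed 8x8 here)
    sigma : width of the Gaussian center-bias sensitivity weight (assumed)
    """
    h, w = gray.shape
    gh, gw = h // patch, w // patch
    feats, centers = [], []
    for i in range(gh):
        for j in range(gw):
            block = gray[i * patch:(i + 1) * patch, j * patch:(j + 1) * patch]
            coeffs = dctn(block, norm='ortho')
            # keep only the low-frequency corner as the feature vector
            feats.append(coeffs[:3, :3].ravel())
            centers.append(((i + 0.5) * patch, (j + 0.5) * patch))
    feats = np.asarray(feats)
    centers = np.asarray(centers)
    diag = np.hypot(h, w)

    # pairwise feature dissimilarity and normalized spatial distance
    fdiff = np.linalg.norm(feats[:, None, :] - feats[None, :, :], axis=2)
    sdist = np.linalg.norm(centers[:, None, :] - centers[None, :, :], axis=2) / diag
    weight = 1.0 / (1.0 + sdist)          # nearer patches contribute more
    sal = (weight * fdiff).sum(axis=1)

    # crude sensitivity weight: patches nearer the image center are
    # assumed more visible (a stand-in for the foveated HVS model)
    d_center = np.linalg.norm(centers - [h / 2.0, w / 2.0], axis=1) / diag
    sal *= np.exp(-(d_center ** 2) / (2 * sigma ** 2))

    sal = sal.reshape(gh, gw)
    return (sal - sal.min()) / (sal.max() - sal.min() + 1e-12)
```

The returned map has one saliency value per patch, normalized to [0, 1]; for display it would be upsampled back to image resolution. The paper's actual sensitivity weighting and spatial-weight form may differ from this sketch.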


[1] Grill-Spector K,Malach R.The human visual cortex[J].Annual Review of Neuroscience,2004,27(1):649-677.
[2] Wang Z,Lu L G,Bovik A C.Foveation scalable video coding with automatic fixation selection[J].IEEE Transactions on Image Processing,2003,12(2):243-254.
[3] Rutishauser U,Walther D,Koch C,et al.Is bottom-up attention useful for object recognition?[C]// Proceedings of the IEEE Computer Society Conference on Computer Vision and Pattern Recognition.Washington:IEEE Press,2004,2:37-44.
[4] Han Junwei,Ngan K N,Li Mingjing,et al.Unsupervised extraction of visual attention objects in color images[J].IEEE Transactions on Circuits and Systems for Video Technology,2006,16(1):141-145.
[5] Treisman A M, Gelade G.A feature-integration theory of attention[J].Cognitive Psychology,1980,12(1):97-136.
[6] Harel J,Koch C,Perona P.Graph-based visual saliency[C]// Advances in Neural Information Processing Systems.Cambridge(USA):MIT Press,2007:545-552.
[7] Bruce N D B,Tsotsos J K.Saliency based on information maximization[C]// Advances in Neural Information Processing Systems.Cambridge(USA):MIT Press,2006,18:155-163.
[8] Goferman S,Zelnik-Manor L,Tal A.Context-aware saliency detection[J].IEEE Transactions on Pattern Analysis and Machine Intelligence,2012,34(10):1915-1926.
[9] Guo Chenlei,Ma Qi,Zhang Liming.Spatio-temporal saliency detection using phase spectrum of quaternion Fourier transform[C]// IEEE Conference on Computer Vision and Pattern Recognition.Alaska(USA):IEEE Press,2008:1-8.
[10] Murray N,Vanrell M,Otazu X,et al.Saliency estimation using a non-parametric low-level vision model[C]// IEEE Conference on Computer Vision and Pattern Recognition (CVPR).Providence(USA):IEEE Press,2011:433-440.
[11] Hou X D,Harel J,Koch C.Image signature: highlighting sparse salient regions[J].IEEE Transactions on Pattern Analysis and Machine Intelligence,2012,34(1):194-201.
[12] Ahmed N,Natarajan T,Rao K R.Discrete cosine transform[J].IEEE Transactions on Computers,1974,23(1):90-93.
[13] Geisler W S,Perry J S.Real-time foveated multiresolution system for low-bandwidth video communication[C]// SPIE Proceedings in Human Vision and Electronic Imaging.[S.l.]:SPIE,1998:293-305.
[14] Itti L,Koch C,Niebur E.A model of saliency-based visual attention for rapid scene analysis[J].IEEE Transactions on Pattern Analysis and Machine Intelligence,1998,20(11):1254-1259.
[15] Abutaleb A,Eloteifi A.Automatic thresholding of gray-level pictures using 2D entropy[J]. Computer Vision,Graphics,and Image Processing,1989,47(1):22-32.
[16] Jain A K.Fundamentals of digital image processing[M].Noida(India):Pearson Education India,2007.
[17] Li J,Levine M D,An X,et al.Visual saliency based on scale-space analysis in the frequency domain[J].IEEE Transactions on Pattern Analysis and Machine Intelligence,2013,35(4): 996-1010.
[18] Xu Juan,Jiang Ming,Wang Shuo,et al.Predicting human gaze beyond pixels[J].Journal of Vision,2014,14(1): 28-1-28-20.
[19] Judd T,Ehinger K,Durand F,et al.Learning to predict where humans look[C]// The 12th International Conference on Computer Vision.Kyoto(Japan):IEEE Press,2009:2106-2113.
[20] Borji A,Tavakoli H R,Sihite D N,et al.Analysis of scores,datasets,and models in visual saliency prediction[C]// Proceedings of International Conference on Computer Vision (ICCV).Sydney(Australia):IEEE Press,2013:921-928.
[21] Sullivan G J,Topiwala P N,Luthra A.The H.264/AVC advanced video coding standard:overview and introduction to the fidelity range extensions[C]// SPIE Conference on Applications of Digital Image Processing XXVII.[S.l.]:SPIE,2004,5558:454-474.
[22] Sullivan G J,Ohm J-R,Han W-J,et al.Overview of the high efficiency video coding (HEVC) standard[J].IEEE Transactions on Circuits and Systems for Video Technology,2012,22(12):1649-1668.
[23] Hadizadeh H,Bajic I V.Saliency-aware video compression[J].IEEE Transactions on Image Processing,2014,23(1):19-33.
[24] Chen Z,Guillemot C.Perceptually-friendly H.264/AVC video coding based on foveated just-noticeable-distortion model[J].IEEE Transactions on Circuits and Systems for Video Technology,2010,20(6):806-819.
[25] Hadizadeh H,Enriquez M J,Bajic I V.Eye-tracking database for a set of standard video sequences[J].IEEE Transactions on Image Processing,2012,21(2):898-903.
[26] Li Z C,Qin S Y,Itti L.Visual attention guided bit allocation in video compression[J].Image and Vision Computing,2011,29(1): 1-14.
[27] Rémi D C,Jean-Baptiste K,Laurent A,et al.x265[CP/OL].Paris:VideoLAN,2014[2014-03-31].http://www.videolan.org/developers/x265.html.


[1] ZHANG Min,RUAN Shuang-chen,YANG Jun,et al.Experimental study of continuous-wave terahertz radiation real-time transmission imaging[J].Journal of Shenzhen University Science and Engineering,2007,24(5):384.
[2] HU Tao,GUO Bao-ping,GUO Xuan.An improved run-based boundary extraction algorithm[J].Journal of Shenzhen University Science and Engineering,2009,26(5):405.
[3] HU Yuan-yuan,NIU Xia-mu.Image quality assessment based on human visibility threshold theory and structural similarity[J].Journal of Shenzhen University Science and Engineering,2010,27(5):185.
[4] SONG Yuan-jia,ZHANG Wei,YANG Zheng-wei,et al.Debond defect detection in shell of solid rocket motor by thermal wave nondestructive testing[J].Journal of Shenzhen University Science and Engineering,2012,29(5):252.[doi:10.3724/SP.J.1249.2012.03252]
[5] HUANG Zong-fu,SUN Gang,CHEN Zeng-ping.A background clutter suppression algorithm for space target detection in wide field-of-view opto-electronic observation[J].Journal of Shenzhen University Science and Engineering,2012,29(5):471.[doi:10.3724/SP.J.1249.2012.06471]
[6] Wu Qingyang,Zeng Xiangjun,Huang Jinhui,et al.Study on digital impression for intraoral 3D scanning[J].Journal of Shenzhen University Science and Engineering,2013,30(5):60.[doi:10.3724/SP.J.1249.2013.01060]
[7] Zhang Min,Quan Runai,Su Hong,et al.Investigation of optically pumped continuous terahertz laser in biological imaging[J].Journal of Shenzhen University Science and Engineering,2014,31(5):160.[doi:10.3724/SP.J.1249.2014.02160]
[8] Li Jing,Ni Dong,Li Shengli,et al.The automatic ultrasound measurement of fetal head circumference[J].Journal of Shenzhen University Science and Engineering,2014,31(5):455.[doi:10.3724/SP.J.1249.2014.05455]
[9] Qiu Wensheng,Niu Lihong,Su Binghua,et al.Design of embedded super-resolution restoration system based on ARM[J].Journal of Shenzhen University Science and Engineering,2015,32(5):311.[doi:10.3724/SP.J.1249.2015.0]
[10] WANG Na,LI Xia.An improved edge detection algorithm based on the Canny operator[J].Journal of Shenzhen University Science and Engineering,2005,22(5):149.


Foundation:Science and Technology Planning Project of Guangdong Province(2011B010200045);Accelerated Programs of Shenzhen Key Laboratory(CXB201105060068A)
Corresponding author:Professor Li Xia.E-mail: lixia@szu.edu.cn
Citation:Li Xia,Li Fusheng,Chen Yuanqin.Saliency detection model based on human visual sensitivity and DCT coefficients[J]. Journal of Shenzhen University Science and Engineering, 2014, 31(5): 464-472.(in Chinese)
Citation (in Chinese): Li Xia, Li Fusheng, Chen Yuanqin. Saliency detection based on visual sensitivity and DCT coefficients[J]. Journal of Shenzhen University Science and Engineering, 2014, 31(5): 464-472.
更新日期/Last Update: 2014-09-11