[1] ZHENG Neng-heng,ZHANG Ya-lei,LI Xia.Music segmentation based on model adaptation and smoothing processing[J].Journal of Shenzhen University Science and Engineering,2011,28(3):271-275.

Music segmentation based on model adaptation and smoothing processing

Journal of Shenzhen University Science and Engineering [ISSN:1000-2618/CN:44-1401/N]

Volume:
28
Issue:
2011, No.3 (pp. 189-282)
Pages:
271-275
Section:
Electronics and Information Science
Publication date:
2011-05-20

Article Info

Title:
Music segmentation based on model adaptation and smoothing processing
Article ID:
1000-2618(2011)03-0271-05
Author(s):
ZHENG Neng-heng, ZHANG Ya-lei, and LI Xia
College of Information Engineering, Shenzhen University, Shenzhen 518060, P.R.China
Keywords:
acoustics; speech processing; music segmentation; Gaussian mixture model; confidence measure; model adaptation; smoothing
CLC number:
TN 912;TP 391
Document code:
A
Abstract:
To address the mismatch between pre-trained models and the signal to be segmented in music segmentation, an online model adaptation algorithm based on a confidence measure is proposed. From the recognition results obtained with the pre-trained models, a confidence measure derived from the recognition likelihoods selects the reliable data, which are then used to adapt the Gaussian mixture models online, yielding vocal/non-vocal models that better match the music signal being segmented. Smoothing is then applied to the recognition output to further remove short-segment fluctuation errors. Experimental results show that, compared with both the initial models and adaptation using all of the data, the proposed algorithm yields Gaussian mixture models that better match the signal to be segmented and achieves better segmentation performance.
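The pipeline described in the abstract can be sketched roughly as follows. This is only an illustrative sketch under stated assumptions, not the authors' implementation: the abstract gives no formula for the confidence measure, so the absolute log-likelihood ratio between the vocal and non-vocal models is assumed here; the adaptation step is a mean-only MAP update in the style of Reynolds et al. (reference [14]), with a hypothetical relevance factor; and the smoothing is a simple sliding-window majority vote. All function and parameter names are hypothetical.

```python
import numpy as np

def select_confident_frames(ll_vocal, ll_nonvocal, threshold):
    """Initial labels (1 = vocal, 0 = non-vocal) plus a mask of reliable frames.

    Assumption: confidence = |log-likelihood ratio| of the two pre-trained models.
    """
    llr = np.asarray(ll_vocal, dtype=float) - np.asarray(ll_nonvocal, dtype=float)
    labels = (llr > 0).astype(int)
    confident = np.abs(llr) > threshold
    return labels, confident

def map_adapt_means(means, weights, variances, frames, relevance=16.0):
    """Mean-only MAP adaptation of a diagonal-covariance GMM (assumed update rule).

    means, variances: (K, D); weights: (K,); frames: (T, D).
    """
    frames = np.asarray(frames, dtype=float)
    # Log-density of every frame under every component, shape (T, K).
    diff = frames[:, None, :] - means[None, :, :]
    log_gauss = -0.5 * (np.sum(diff ** 2 / variances, axis=2)
                        + np.sum(np.log(2 * np.pi * variances), axis=1))
    log_post = np.log(weights) + log_gauss
    log_post -= log_post.max(axis=1, keepdims=True)   # numerical stability
    post = np.exp(log_post)
    post /= post.sum(axis=1, keepdims=True)           # responsibilities
    n_k = post.sum(axis=0)                            # soft counts, shape (K,)
    data_mean = (post.T @ frames) / np.maximum(n_k, 1e-10)[:, None]
    alpha = n_k / (n_k + relevance)                   # data-dependent mixing weight
    return alpha[:, None] * data_mean + (1 - alpha)[:, None] * means

def smooth_labels(labels, window=5):
    """Majority-vote smoothing of a binary label sequence (window must be odd)."""
    labels = np.asarray(labels, dtype=int)
    half = window // 2
    padded = np.pad(labels, half, mode="edge")
    votes = np.convolve(padded, np.ones(window, dtype=int), mode="valid")
    return (votes > half).astype(int)

# Usage: an isolated vocal frame and an isolated non-vocal frame are both removed.
raw = [0, 0, 0, 1, 0, 0, 1, 1, 1, 0, 1, 1]
print(smooth_labels(raw, window=3))  # [0 0 0 0 0 0 1 1 1 1 1 1]
```

In such a sketch, only the frames flagged as confident for each class would be passed to `map_adapt_means` for that class's model; the frames would then be re-scored with the adapted models before `smooth_labels` is applied. Adapting with all frames corresponds to setting the threshold to zero, which is the baseline the abstract compares against.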

References:

[1] Lew M,Sebe N,Djeraba C,et al.Content-based multimedia information retrieval:state of the art and challenges[J].ACM Transactions on Multimedia Computing,Communications and Applications,2006,2(1):1-19.
[2] XIE Lei.Discovering salient prosodic cues and their interactions for automatic story segmentation in Mandarin broadcast news[J].Multimedia Systems,2008,14(4):237-253.
[3] Su J,Yeh H,Yu P,et al.Music recommendation using content and context information mining[J].IEEE Intelligent Systems,2010,25(1):16-26.
[4] Cheng S,Wang H,Fu H.BIC-based audio segmentation by divide-and-conquer[C]//Proceedings of International Conference on Acoustics,Speech and Signal Processing.Las Vegas (USA):IEEE Press,2008:4841-4844.
[5] Tardon L,Sammartino S,Barbancho I.Design of an efficient music-speech discriminator[J].Journal of the Acoustical Society of America,2010,127(1):271-279.
[6] Kos M,Grasic M,Vlaj D,Kacic Z.Online speech/music segmentation for broadcast news domain[C]//Proceedings of the 16th International Conference on Systems,Signals and Image Processing.Chalkida(Greece):IEEE Press,2009:1-4.
[7] Li Y,Wang D.Separation of singing voice from music accompaniment for monaural recordings[J].IEEE Transactions on Audio,Speech and Language Processing,2007,15(4):1475-1487.
[8] Maddage N C,XU Chang-sheng,WANG Ye.A SVM-based classification approach to musical audio[C]//Proceedings of the 4th International Conference on Music Information Retrieval.Maryland (USA):The Johns Hopkins University,2003:26-30.
[9] Du Y,Hu W,Yan Y,Wang T,Zhang Y.Audio segmentation via tri-model Bayesian information criterion[C]//Proceedings of International Conference on Acoustics,Speech and Signal Processing.Hawaii (USA):IEEE Press,2007:205-208.
[10] Chou W,Gu L.Robust singing detection in speech/music discriminator design[C]//Proceedings of International Conference on Acoustics,Speech and Signal Processing.Salt Lake City (USA):IEEE Press,2001:865-868.
[11] Berenzweig A,Ellis D,Lawrence S.Using voice segment to improve artist classification of music[C]//Proceedings of International Conference on Virtual Synthetic Entertainment Audio.Espoo (Finland):Audio Engineering Society,2002:1-8.
[12] Zhang Y,Zhou J.Audio segmentation based on multi-scale audio classification[C]//Proceedings of International Conference on Acoustics,Speech and Signal Processing.Montreal (Canada):IEEE Press,2004:349-353.
[13] Lu L,Zhang H,Li S.Content analysis for audio classification and segmentation[J].IEEE Transactions on Speech and Audio Processing,2002,10(7):504-516.
[14] Reynolds D,Quatieri T,Dunn R.Speaker verification using adapted Gaussian mixture models[J].Digital Signal Processing,2000,10(1):19-41.

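Memo:
Received: 2010-04-25; Revised: 2010-11-03
Funding: National Natural Science Foundation of China (60901061); Natural Science Foundation of Guangdong Province (9151806001000025)
About the author: ZHENG Neng-heng (1974-), male (Han), born in Fuzhou, Fujian Province; associate professor at Shenzhen University, Ph.D. E-mail: nhzheng@szu.edu.cn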
Last Update: 2011-05-23