[1] XUE Li-ping, YIN Jun-xun, JI Zhen. Speaker recognition based on particle swarm optimization and fuzzy clustering analysis[J]. Journal of Shenzhen University Science and Engineering, 2008, 25(2): 178-183.

Speaker recognition based on particle swarm optimization and fuzzy clustering analysis

Journal of Shenzhen University Science and Engineering [ISSN: 1000-2618 / CN: 44-1401/N]

Volume:
25
Issue:
2008, No. 2
Pages:
178-183
Column:
Electronics, Optics and Information Engineering
Publication date:
2008-04-30

Article Info

Title:
Speaker recognition based on particle swarm optimization and fuzzy clustering analysis
Article ID:
1000-2618(2008)02-0178-06
Author(s):
XUE Li-ping (1, 2), YIN Jun-xun (1), and JI Zhen (2)
1) College of Electronics and Information Engineering, South China University of Technology, Guangzhou 510641, P.R. China
2) College of Software, Shenzhen University, Shenzhen 518060, P.R. China
Keywords:
speaker recognition; text-independent; particle swarm optimization; fuzzy C-means clustering; triple-particle swarm
CLC number:
TN 912.3; TP 18
Document code:
A
Abstract:
A speaker recognition algorithm based on particle swarm optimization (PSO), the triple-particle fuzzy C-means clustering algorithm (TP-FCM), is proposed. Three sub-swarms, each consisting of a small group of three particles (a triple-particle), search for the best speaker model. In each iteration, every sub-swarm performs the PSO velocity update, the PSO position update, and the standard fuzzy C-means (FCM) algorithm in sequence. The speaker's training speech data are thereby soft-clustered, and the optimal cluster centers obtained are taken as that speaker's model. The algorithm keeps particles from being trapped at locally optimal cluster centers, and it records and estimates the best moving direction and historical path of each cluster center fairly accurately, so that the cluster centers approach the global optimum. Experimental results show that the algorithm consistently outperforms the LBG, FCM, and FRLVQ-FVQ algorithms in speaker recognition, depends only weakly on the initial cluster centers, and effectively reduces the recognition error rate.
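The iteration described in the abstract (velocity update, position update, then one standard FCM pass) can be illustrated with a minimal Python sketch. The abstract does not give the update equations or parameter values, so everything below is an assumption for illustration only: the textbook inertia-weight PSO update, the parameters `w`, `c1`, `c2`, the use of each sub-swarm's best particle as the social term, and the hypothetical function names `fcm_step`, `fcm_cost`, and `tp_fcm_sketch`. It is not the authors' TP-FCM implementation.

```python
import random

def fcm_step(data, centers, m=2.0, eps=1e-12):
    """One standard FCM pass. Returns (updated centers, cost of the input centers)."""
    dim, k = len(data[0]), len(centers)
    num = [[0.0] * dim for _ in range(k)]
    den = [0.0] * k
    cost = 0.0
    for x in data:
        # Squared Euclidean distance to each center (eps avoids division by zero).
        d2 = [sum((a - b) ** 2 for a, b in zip(x, c)) + eps for c in centers]
        inv = [(1.0 / d) ** (1.0 / (m - 1.0)) for d in d2]
        total = sum(inv)
        for i in range(k):
            u_m = (inv[i] / total) ** m  # fuzzy membership raised to the fuzzifier m
            cost += u_m * d2[i]
            den[i] += u_m
            for j in range(dim):
                num[i][j] += u_m * x[j]
    new_centers = [[num[i][j] / den[i] for j in range(dim)] for i in range(k)]
    return new_centers, cost

def fcm_cost(data, centers):
    """FCM objective for a given set of centers (used as the particle fitness)."""
    return fcm_step(data, centers)[1]

def tp_fcm_sketch(data, k, iters=30, swarms=3, size=3,
                  w=0.7, c1=1.5, c2=1.5, seed=0):
    """Assumed hybrid loop: 3 sub-swarms x 3 particles; each particle is k centers."""
    rng = random.Random(seed)
    dim = len(data[0])
    groups = []
    for _ in range(swarms):
        group = []
        for _ in range(size):
            pos = [list(p) for p in rng.sample(data, k)]  # init centers from data
            group.append({"pos": pos,
                          "vel": [[0.0] * dim for _ in range(k)],
                          "pbest": [row[:] for row in pos],
                          "pcost": fcm_cost(data, pos)})
        groups.append(group)
    for _ in range(iters):
        for group in groups:
            # Assumption: the social term is the best particle of the same sub-swarm.
            leader = min(group, key=lambda p: p["pcost"])
            gbest = [row[:] for row in leader["pbest"]]
            for p in group:
                for i in range(k):
                    for j in range(dim):
                        r1, r2 = rng.random(), rng.random()
                        # Velocity update, then position update.
                        p["vel"][i][j] = (w * p["vel"][i][j]
                                          + c1 * r1 * (p["pbest"][i][j] - p["pos"][i][j])
                                          + c2 * r2 * (gbest[i][j] - p["pos"][i][j]))
                        p["pos"][i][j] += p["vel"][i][j]
                # One standard FCM pass refines the moved centers.
                p["pos"], _ = fcm_step(data, p["pos"])
                cost = fcm_cost(data, p["pos"])
                if cost < p["pcost"]:
                    p["pbest"], p["pcost"] = [row[:] for row in p["pos"]], cost
    best = min((p for g in groups for p in g), key=lambda p: p["pcost"])
    return best["pbest"], best["pcost"]
```

In a speaker recognition setting, `data` would be the feature vectors extracted from one speaker's training speech, and the returned centers would serve as that speaker's codebook. The sketch ignores the fact that cluster indices may be permuted between particles, which a practical implementation would need to handle.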

References:

[1] ZHAO Li. Speech Signal Processing[M]. Beijing: China Machine Press, 2004: 236-253 (in Chinese).
[2] Soong F K, Rosenberg A E, Rabiner L R, et al. A vector quantization approach to speaker recognition[C]// International Conference on Acoustics, Speech, and Signal Processing. IEEE Press, 1985: 387-390.
[3] Tran D, Wagner M, Van L T. A proposed decision rule for speaker recognition based on fuzzy C-Means clustering[C]// 5th International Conference on Spoken Language Processing. Sydney, Australia: Australian Speech Science and Technology Association (ASSTA), 1998: 755-758.
[4] XU Wen-huan, Nandi A K, ZHANG Ji-hong. Novel vector quantiser design using reinforced learning as a pre-process[J]. Signal Processing, 2005, 85(7): 1315-1333.
[5] HU Heng-tao, LONG Jian-zhong. Speaker identification using fuzzy C-means clustering algorithm based on improved ant colony algorithm[J]. Journal of Sichuan University (Natural Science Edition), 2007, 44(3): 543-547 (in Chinese).
[6] Kennedy J, Eberhart R. Particle swarm optimization[C]// Proceedings of IEEE International Conference on Neural Networks. Piscataway: IEEE Service Center, 1995: 1942-1948.
[7] XUE Li-ping, YIN Jun-xun, JI Zhen, et al. A particle swarm optimization for hidden Markov model training[C]// 8th International Conference on Signal Processing. Guilin: IEEE Press, 2006(1-4): 791-794.
[8] JIANG Lai, HUANG Cai-ling, JI Zhen. A new PSO-based image compression method[J]. Journal of Shenzhen University Science and Engineering, 2006, 23(3): 268-271.
[9] JI Zhen, LIAO Hui-lian, XU Wen-huan, et al. A strategy of particle-pair for vector quantization in image coding[J]. Acta Electronica Sinica, 2007, 38(10): 1916-1920 (in Chinese).
[10] Garofolo J S, Lamel L F. TIMIT Acoustic-Phonetic Continuous Speech Corpus[DB/CD]. Philadelphia, USA: Linguistic Data Consortium. [2007-12-20] http://www.ldc.upenn.edu/Catalog/.

Similar articles:

[1] XIE Yan-lu, ZHANG Jing-song, LIU Ming-hui, et al. Robust speaker recognition based on level-building voice activity detection[J]. Journal of Shenzhen University Science and Engineering, 2012, 29(4): 328. [doi:10.3724/SP.J.1249.2012.04328]

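Memo:
Received: 2007-12-24; revised: 2008-03-17
Funding: National Natural Science Foundation of China (60572100); Shenzhen University Research Startup Fund (200637)
About the author: XUE Li-ping (1962-), female (Han nationality), from Tongchuan, Shaanxi Province; associate professor at Shenzhen University and doctoral candidate at South China University of Technology. E-mail: xuelp@szu.edu.cn
Last Update: 2008-05-10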