
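1) College of Information Engineering, Shenzhen University, Shenzhen 518060, Guangdong, China; 2) College of Computer Science and Software Engineering, Shenzhen University, Shenzhen 518060, Guangdong, China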

omics; particle swarm optimization; sample balancing; feature selection; classification model; model selection; data mining

Model selection based on particle swarm optimization for omics data classification
Yang Junshan1, Ji Zhen1, Xie Weixin1, and Zhu Zexuan2

1) College of Information Engineering, Shenzhen University, Shenzhen 518060, Guangdong Province, P.R. China; 2) College of Computer Science and Software Engineering, Shenzhen University, Shenzhen 518060, Guangdong Province, P.R. China

omics dataset; particle swarm optimization; data sampling; feature selection; classification model; model selection; data mining

DOI: 10.3724/SP.J.1249.2016.03264



A new model selection algorithm based on particle swarm optimization is proposed for omics data classification. The algorithm is designed to handle the high dimensionality, small sample size, and class imbalance that are inherent in omics data. Each particle encodes a candidate combination of data sampling method, feature selection method, classification model, and the corresponding parameter settings. The swarm is optimized for classification performance: particle velocities and positions are updated iteratively until a stopping criterion is met, and the best model combination found is then output. Simulation results on eight real-world omics datasets show that the proposed algorithm avoids the bias introduced by manual settings and achieves accurate and reliable classification performance.
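
The particle encoding and update loop described in the abstract can be illustrated with a minimal Python sketch. It is not the authors' implementation. The option lists (`SAMPLERS`, `SELECTORS`, `CLASSIFIERS`), the four-gene encoding, the `toy_fitness` function, and the coefficients `w`, `c1`, `c2` are all assumptions made for illustration. In practice the fitness would be a cross-validated classification score on the omics dataset.

```python
import random

# Hypothetical candidate lists; the actual candidates are not given in this section.
SAMPLERS = ["none", "oversample", "undersample"]
SELECTORS = ["none", "filter", "wrapper"]
CLASSIFIERS = ["knn", "svm", "naive_bayes"]


def decode(position):
    """Map a position in [0, 1]^4 to a discrete model combination."""
    def pick(options, gene):
        # Bin the continuous gene into one of len(options) equal intervals.
        return options[min(int(gene * len(options)), len(options) - 1)]

    return {
        "sampler": pick(SAMPLERS, position[0]),
        "selector": pick(SELECTORS, position[1]),
        "classifier": pick(CLASSIFIERS, position[2]),
        "param": round(position[3], 3),  # one normalized hyperparameter
    }


def toy_fitness(model):
    """Stand-in for classification performance (assumption, not real data)."""
    score = 0.0
    score += 0.3 if model["sampler"] == "oversample" else 0.0
    score += 0.3 if model["selector"] == "filter" else 0.0
    score += 0.3 if model["classifier"] == "svm" else 0.0
    score += 0.1 * (1.0 - abs(model["param"] - 0.5) * 2.0)
    return score


def pso_model_selection(fitness, n_particles=20, n_iters=50,
                        w=0.7, c1=1.5, c2=1.5, seed=0):
    """Standard global-best PSO over the encoded model combinations."""
    rng = random.Random(seed)
    dim = 4
    pos = [[rng.random() for _ in range(dim)] for _ in range(n_particles)]
    vel = [[0.0] * dim for _ in range(n_particles)]
    pbest = [p[:] for p in pos]
    pbest_fit = [fitness(decode(p)) for p in pos]
    g = max(range(n_particles), key=lambda i: pbest_fit[i])
    gbest, gbest_fit = pbest[g][:], pbest_fit[g]

    # Stopping criterion here is a fixed iteration budget.
    for _ in range(n_iters):
        for i in range(n_particles):
            for d in range(dim):
                r1, r2 = rng.random(), rng.random()
                vel[i][d] = (w * vel[i][d]
                             + c1 * r1 * (pbest[i][d] - pos[i][d])
                             + c2 * r2 * (gbest[d] - pos[i][d]))
                # Keep the position inside the unit hypercube.
                pos[i][d] = min(1.0, max(0.0, pos[i][d] + vel[i][d]))
            fit = fitness(decode(pos[i]))
            if fit > pbest_fit[i]:
                pbest[i], pbest_fit[i] = pos[i][:], fit
                if fit > gbest_fit:
                    gbest, gbest_fit = pos[i][:], fit
    return decode(gbest), gbest_fit


if __name__ == "__main__":
    best_model, best_fit = pso_model_selection(toy_fitness)
    print(best_model, best_fit)
```

Binning a continuous position into categorical choices is only one possible encoding. It keeps the standard velocity and position update unchanged, at the cost of a piecewise-constant fitness landscape.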
