Journal of Shenzhen University Science and Engineering

Mining and preserving data distribution information is the core problem of unsupervised dimensionality reduction. Most traditional unsupervised dimensionality reduction methods consider only the local or only the global information of the data distribution, so the distribution information is difficult to preserve in the low-dimensional space. To address this shortcoming, we propose an adaptive sparse representation guided unsupervised dimensionality reduction (ASR_UDR) method that considers the global and local information of the data distribution simultaneously. Sparse representation is used to mine the global information of the data distribution in the high-dimensional space, and the local information is mined by constraining the projected data to remain smooth on a graph that is represented by the sparse representation coefficient matrix. The two processes are integrated into a single framework so that they guide each other, which achieves adaptive mining of the data distribution information together with dimensionality reduction. Experimental results on the WarpAR10P, USPS, MultiB, DLBCLA and DLBCLB data sets show that, compared with related unsupervised dimensionality reduction methods, the proposed method better improves the performance of subsequent learning algorithms while significantly reducing the data dimensionality.
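
The alternating scheme outlined in the abstract can be illustrated with a small sketch. The abstract does not give the objective function, so the code below assumes one plausible form: minimize ||X - XS||_F^2 + alpha*||S||_1 + beta*sum_ij ||P^T x_i - P^T x_j||^2 |S_ij| subject to P^T P = I and diag(S) = 0. The function names, the parameters `alpha` and `beta`, the orthogonality constraint, and the ISTA solver are all assumptions made for illustration, and this is not the authors' algorithm.

```python
import numpy as np

def soft_threshold(A, T):
    # Entrywise proximal operator of a weighted L1 norm.
    return np.sign(A) * np.maximum(np.abs(A) - T, 0.0)

def pairwise_sq_dists(Y):
    # Y has shape (k, n); columns are samples. Returns an (n, n) matrix.
    sq = np.sum(Y * Y, axis=0)
    D = sq[:, None] + sq[None, :] - 2.0 * (Y.T @ Y)
    return np.maximum(D, 0.0)

def update_S(X, S, D, alpha, beta, n_steps=50):
    # ISTA on ||X - XS||_F^2 + sum_ij (alpha + beta * D_ij) |S_ij|, diag(S) = 0.
    G = X.T @ X
    step = 1.0 / (2.0 * np.linalg.norm(X, 2) ** 2 + 1e-12)  # 1 / Lipschitz constant
    T = step * (alpha + beta * D)
    for _ in range(n_steps):
        grad = 2.0 * (G @ S - G)
        S = soft_threshold(S - step * grad, T)
        np.fill_diagonal(S, 0.0)  # a sample may not represent itself
    return S

def update_P(X, S, k):
    # Graph built from the sparse coefficients (assumed symmetrization).
    W = 0.5 * (np.abs(S) + np.abs(S).T)
    L = np.diag(W.sum(axis=1)) - W          # graph Laplacian
    M = X @ L @ X.T
    M = 0.5 * (M + M.T)                     # guard against round-off asymmetry
    _, vecs = np.linalg.eigh(M)             # eigenvalues in ascending order
    return vecs[:, :k]                      # k smoothest directions, P^T P = I

def asr_udr_sketch(X, k, alpha=0.1, beta=0.1, n_iter=10):
    # X has shape (d, n); columns are samples. Hypothetical helper, not the paper's code.
    d, n = X.shape
    S = np.zeros((n, n))
    P = np.eye(d)[:, :k]                    # arbitrary orthonormal starting projection
    for _ in range(n_iter):
        D = pairwise_sq_dists(P.T @ X)      # local structure of the projected data
        S = update_S(X, S, D, alpha, beta)  # global structure via sparse coding
        P = update_P(X, S, k)               # projection that keeps the graph smooth
    return P, S

# Example call on random data (values are for illustration only).
rng = np.random.default_rng(0)
X_demo = rng.standard_normal((20, 30))
P_demo, S_demo = asr_udr_sketch(X_demo, k=3)
Y_demo = P_demo.T @ X_demo                  # 3-dimensional representation
```

In this sketch each step feeds the other: distances in the projected space reweight the sparsity penalty, and the resulting coefficients define the graph used for the next projection, which is one way to read the "mutual guidance" described in the abstract.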

Introduction
1 Preliminaries
2 Sparse representation guided unsupervised dimensionality reduction
3 Experiments
4 Conclusion

Fig.1 Algorithm description of adaptive sparse representation guided unsupervised dimensionality reduction

Table 1 Basic information of the data sets used in the experiments (unit: count)

Table 2 Clustering accuracy of seven methods on five data sets ($\bar{x} \pm s$, %)

Table 3 Normalized mutual information of seven methods on five data sets ($\bar{x} \pm s$, %)
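
Tables 2 and 3 report clustering accuracy (ACC) and normalized mutual information (NMI). The page does not give the exact definitions used, so the following is a minimal sketch of the common ones. ACC takes the best one-to-one matching between cluster ids and class ids (found here by brute force, which is only practical for a small number of clusters), and NMI is normalized by the geometric mean of the two entropies. Both the normalization choice and the function names are assumptions.

```python
from collections import Counter
from itertools import permutations
from math import log, sqrt

def clustering_accuracy(y_true, y_pred):
    # Fraction of samples that are correct under the best one-to-one
    # mapping between cluster ids and class ids (exhaustive search).
    counts = Counter(zip(y_pred, y_true))
    clusters = sorted(set(y_pred))
    classes = sorted(set(y_true))
    if len(clusters) <= len(classes):
        best = max(
            sum(counts[(c, t)] for c, t in zip(clusters, perm))
            for perm in permutations(classes, len(clusters))
        )
    else:
        best = max(
            sum(counts[(c, t)] for c, t in zip(perm, classes))
            for perm in permutations(clusters, len(classes))
        )
    return best / len(y_true)

def normalized_mutual_information(y_true, y_pred):
    # NMI = I(T; P) / sqrt(H(T) * H(P)), using natural logarithms.
    n = len(y_true)
    joint = Counter(zip(y_true, y_pred))
    n_true = Counter(y_true)
    n_pred = Counter(y_pred)
    mi = sum(
        (c / n) * log(c * n / (n_true[t] * n_pred[p]))
        for (t, p), c in joint.items()
    )
    h_true = -sum((c / n) * log(c / n) for c in n_true.values())
    h_pred = -sum((c / n) * log(c / n) for c in n_pred.values())
    if h_true == 0.0 or h_pred == 0.0:
        # Degenerate case: at least one labeling has a single group.
        return 1.0 if h_true == h_pred else 0.0
    return mi / sqrt(h_true * h_pred)

# Example call; multiply by 100 to express the scores as percentages.
labels = [0, 0, 1, 1, 2, 2]
assignments = [1, 1, 0, 0, 2, 2]
acc = clustering_accuracy(labels, assignments)
nmi = normalized_mutual_information(labels, assignments)
```

Both scores are invariant to how the cluster ids are numbered, which is why they suit the evaluation of unsupervised methods whose cluster labels carry no inherent meaning.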

Fig.2 (Color online) Clustering ACC of ASR_UDR on five data sets under different parameters

Fig.3 (Color online) Clustering NMI of ASR_UDR on five data sets under different parameters

