Divide-and-conquer based Large-Scale Spectral Clustering

Hongmin Li, Xiucai Ye, Akira Imakura, Tetsuya Sakurai

2021-04-30Image/Document Clustering Clustering

Abstract

Spectral clustering is one of the most popular clustering methods. However, how to balance the efficiency and effectiveness of the large-scale spectral clustering with limited computing resources has not been properly solved for a long time. In this paper, we propose a divide-and-conquer based large-scale spectral clustering method to strike a good balance between efficiency and effectiveness. In the proposed method, a divide-and-conquer based landmark selection algorithm and a novel approximate similarity matrix approach are designed to construct a sparse similarity matrix within low computational complexities. Then clustering results can be computed quickly through a bipartite graph partition process. The proposed method achieves a lower computational complexity than most existing large-scale spectral clustering methods. Experimental results on ten large-scale datasets have demonstrated the efficiency and effectiveness of the proposed method. The MATLAB code of the proposed method and experimental datasets are available at https://github.com/Li-Hongmin/MyPaperWithCode.

Results

Task	Dataset	Metric	Value	Model
Image/Document Clustering	pendigits	Accuracy (%)	81.55	LSC-R
Image/Document Clustering	pendigits	NMI	79.15	LSC-R
Image/Document Clustering	pendigits	runtime (s)	0.77	LSC-R
Image/Document Clustering	pendigits	Accuracy (%)	74.02	LSC-K
Image/Document Clustering	pendigits	NMI	81.37	LSC-K
Image/Document Clustering	pendigits	runtime (s)	1.2	LSC-K
Image/Document Clustering	pendigits	Accuracy (%)	81.68	U-SPEC
Image/Document Clustering	pendigits	NMI	81.68	U-SPEC
Image/Document Clustering	pendigits	runtime (s)	2.07	U-SPEC

Divide-and-conquer based Large-Scale Spectral Clustering

Abstract

Results

Related Papers

Divide-and-conquer based Large-Scale Spectral Clustering

Abstract

Results

Related Papers