Wei Zhang, Xiaogang Wang, Deli Zhao, Xiaoou Tang
This paper proposes a simple but effective graph-based agglomerative algorithm, for clustering high-dimensional data. We explore the different roles of two fundamental concepts in graph theory, indegree and outdegree, in the context of clustering. The average indegree reflects the density near a sample, and the average outdegree characterizes the local geometry around a sample. Based on such insights, we define the affinity measure of clusters via the product of average indegree and average outdegree. The product-based affinity makes our algorithm robust to noise. The algorithm has three main advantages: good performance, easy implementation, and high computational efficiency. We test the algorithm on two fundamental computer vision problems: image clustering and object matching. Extensive experiments demonstrate that it outperforms the state-of-the-arts in both applications.
| Task | Dataset | Metric | Value | Model |
|---|---|---|---|---|
| Image Clustering | Fashion-MNIST | Accuracy | 0.627 | GDL |
| Image Clustering | Fashion-MNIST | NMI | 0.66 | GDL |
| Image Clustering | MNIST-full | Accuracy | 0.965 | GDL |
| Image Clustering | MNIST-full | NMI | 0.913 | GDL |
| Image Clustering | USPS | NMI | 0.824 | AGDL |
| Image Clustering | Coil-20 | Accuracy | 0.858 | AGDL |
| Image Clustering | Coil-20 | NMI | 0.937 | AGDL |
| Image Clustering | Coil-20 | NMI | 0.746 | GDL-U |
| Image Clustering | Coil-20 | Accuracy | 0.858 | GDL |
| Image Clustering | Extended Yale-B | NMI | 0.91 | GDL-U |
| Image Clustering | Extended Yale-B | NMI | 0.91 | AGDL |
| Image Clustering | coil-100 | NMI | 0.929 | GDL-U |
| Image Clustering | coil-100 | Accuracy | 0.731 | GDL |
| Image Clustering | MNIST-test | NMI | 0.91 | GDL |
| Image Clustering | MNIST-test | NMI | 0.844 | AGDL |