TasksSotADatasetsPapersMethodsSubmitAbout
Papers With Code 2

A community resource for machine learning research: papers, code, benchmarks, and state-of-the-art results.

Explore

Notable BenchmarksAll SotADatasetsPapersMethods

Community

Submit ResultsAbout

Data sourced from the PWC Archive (CC-BY-SA 4.0). Built by the community, for the community.

Papers/Selective Pseudo-label Clustering

Selective Pseudo-label Clustering

Louis Mahon, Thomas Lukasiewicz

2021-07-22Image ClusteringClustering
PaperPDFCode(official)

Abstract

Deep neural networks (DNNs) offer a means of addressing the challenging task of clustering high-dimensional data. DNNs can extract useful features, and so produce a lower dimensional representation, which is more amenable to clustering techniques. As clustering is typically performed in a purely unsupervised setting, where no training labels are available, the question then arises as to how the DNN feature extractor can be trained. The most accurate existing approaches combine the training of the DNN with the clustering objective, so that information from the clustering process can be used to update the DNN to produce better features for clustering. One problem with this approach is that these ``pseudo-labels'' produced by the clustering algorithm are noisy, and any errors that they contain will hurt the training of the DNN. In this paper, we propose selective pseudo-label clustering, which uses only the most confident pseudo-labels for training the~DNN. We formally prove the performance gains under certain conditions. Applied to the task of image clustering, the new approach achieves a state-of-the-art performance on three popular image datasets. Code is available at https://github.com/Lou1sM/clustering.

Results

TaskDatasetMetricValueModel
Image ClusteringFashion-MNISTAccuracy0.679SPC
Image ClusteringFashion-MNISTNMI0.735SPC
Image ClusteringMNIST-fullAccuracy0.992SPC
Image ClusteringMNIST-fullNMI0.975SPC
Image ClusteringUSPSAccuracy0.984SPC
Image ClusteringUSPSNMI0.954SPC

Related Papers

Tri-Learn Graph Fusion Network for Attributed Graph Clustering2025-07-18Ranking Vectors Clustering: Theory and Applications2025-07-16Car Object Counting and Position Estimation via Extension of the CLIP-EBC Framework2025-07-11GNN-ViTCap: GNN-Enhanced Multiple Instance Learning with Vision Transformers for Whole Slide Image Classification and Captioning2025-07-09Consistency and Inconsistency in $K$-Means Clustering2025-07-08MC-INR: Efficient Encoding of Multivariate Scientific Simulation Data using Meta-Learning and Clustered Implicit Neural Representations2025-07-03Supercm: Revisiting Clustering for Semi-Supervised Learning2025-06-30Temporal Rate Reduction Clustering for Human Motion Segmentation2025-06-26