Data sourced from the PWC Archive (CC-BY-SA 4.0). Built by the community, for the community.

Exploring the Limits of Deep Image Clustering using Pretrained Models

Nikolas Adaloglou, Felix Michels, Hamza Kalisch, Markus Kollmann

2023-03-31 | Image Clustering | Clustering

Abstract

We present a general methodology that learns to classify images without labels by leveraging pretrained feature extractors. Our approach involves self-distillation training of clustering heads based on the fact that nearest neighbours in the pretrained feature space are likely to share the same label. We propose a novel objective that learns associations between image features by introducing a variant of pointwise mutual information together with instance weighting. We demonstrate that the proposed objective is able to attenuate the effect of false positive pairs while efficiently exploiting the structure in the pretrained feature space. As a result, we improve the clustering accuracy over k-means on 17 different pretrained models by 6.1% and 12.2% on ImageNet and CIFAR100, respectively. Finally, using self-supervised vision transformers, we achieve a clustering accuracy of 61.6% on ImageNet. The code is available at https://github.com/HHU-MMBS/TEMI-official-BMVC2023.
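The two ingredients named in the abstract, nearest-neighbour pairs in a pretrained feature space and a pointwise-mutual-information score between cluster assignments, can be illustrated with a small dependency-free sketch. This is not the authors' implementation: the function names are hypothetical, the PMI form log Σ_c q(c|x) q(c|x') / q(c) is the textbook expression for two images that are conditionally independent given the cluster (the paper uses its own variant), and the pair weight shown is only one plausible reading of "instance weighting".

```python
import math


def cosine(u, v):
    """Cosine similarity between two feature vectors."""
    dot = sum(a * b for a, b in zip(u, v))
    norm_u = math.sqrt(sum(a * a for a in u))
    norm_v = math.sqrt(sum(b * b for b in v))
    return dot / (norm_u * norm_v)


def nearest_neighbours(features, k):
    """For each feature vector, indices of its k most similar other vectors."""
    result = []
    for i, f in enumerate(features):
        others = [j for j in range(len(features)) if j != i]
        others.sort(key=lambda j: cosine(f, features[j]), reverse=True)
        result.append(others[:k])
    return result


def pmi(q_x, q_y, q_marginal, eps=1e-12):
    """Textbook PMI of a pair from soft cluster assignments (assumed form)."""
    total = sum(a * b / (m + eps) for a, b, m in zip(q_x, q_y, q_marginal))
    return math.log(total + eps)


def pair_weight(q_x, q_y):
    """Hypothetical instance weight: probability both images share a cluster."""
    return sum(a * b for a, b in zip(q_x, q_y))


# Toy example: four 2-D "features" forming two obvious groups.
features = [[1.0, 0.0], [0.9, 0.1], [0.0, 1.0], [0.1, 0.9]]
pairs = nearest_neighbours(features, k=1)

# Confident, agreeing assignments score high; disagreeing ones score low.
agree = pmi([1.0, 0.0], [1.0, 0.0], [0.5, 0.5])
disagree = pmi([1.0, 0.0], [0.0, 1.0], [0.5, 0.5])
```

Under these assumptions, a pair whose assignments disagree receives both a low PMI and a weight near zero, which is one way a false positive neighbour pair could be down-weighted during training.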

Results

ImageNet rows are reported in percent; all other rows are fractions in [0, 1].

| Task | Dataset | Metric | Value | Model |
|---|---|---|---|---|
| Image Clustering | ImageNet-100 (TEMI Split) | Accuracy | 0.8343 | TEMI CLIP ViT-L (openai) |
| Image Clustering | ImageNet-100 (TEMI Split) | ARI | 0.7581 | TEMI CLIP ViT-L (openai) |
| Image Clustering | ImageNet-100 (TEMI Split) | NMI | 0.9006 | TEMI CLIP ViT-L (openai) |
| Image Clustering | ImageNet-100 (TEMI Split) | Accuracy | 0.8286 | TEMI MSN ViT-L |
| Image Clustering | ImageNet-100 (TEMI Split) | ARI | 0.7408 | TEMI MSN ViT-L |
| Image Clustering | ImageNet-100 (TEMI Split) | NMI | 0.8853 | TEMI MSN ViT-L |
| Image Clustering | ImageNet-100 (TEMI Split) | Accuracy | 0.7505 | TEMI DINO ViT-B |
| Image Clustering | ImageNet-100 (TEMI Split) | ARI | 0.6545 | TEMI DINO ViT-B |
| Image Clustering | ImageNet-100 (TEMI Split) | NMI | 0.8565 | TEMI DINO ViT-B |
| Image Clustering | CIFAR-10 | ARI | 0.932 | TEMI CLIP ViT-L (openai) |
| Image Clustering | CIFAR-10 | Accuracy | 0.969 | TEMI CLIP ViT-L (openai) |
| Image Clustering | CIFAR-10 | NMI | 0.926 | TEMI CLIP ViT-L (openai) |
| Image Clustering | CIFAR-10 | ARI | 0.885 | TEMI DINO ViT-B |
| Image Clustering | CIFAR-10 | NMI | 0.886 | TEMI DINO ViT-B |
| Image Clustering | CIFAR-100 | ARI | 0.612 | TEMI CLIP ViT-L (openai) |
| Image Clustering | CIFAR-100 | Accuracy | 0.737 | TEMI CLIP ViT-L (openai) |
| Image Clustering | CIFAR-100 | NMI | 0.799 | TEMI CLIP ViT-L (openai) |
| Image Clustering | CIFAR-100 | ARI | 0.533 | TEMI DINO ViT-B |
| Image Clustering | CIFAR-100 | Accuracy | 0.671 | TEMI DINO ViT-B |
| Image Clustering | CIFAR-100 | NMI | 0.769 | TEMI DINO ViT-B |
| Image Clustering | ImageNet-200 | Accuracy | 0.7776 | TEMI CLIP ViT-L (openai) |
| Image Clustering | ImageNet-200 | ARI | 0.6941 | TEMI CLIP ViT-L (openai) |
| Image Clustering | ImageNet-200 | NMI | 0.8839 | TEMI CLIP ViT-L (openai) |
| Image Clustering | ImageNet-200 | ARI | 0.667 | TEMI MSN ViT-L |
| Image Clustering | ImageNet-200 | NMI | 0.8665 | TEMI MSN ViT-L |
| Image Clustering | ImageNet-200 | Accuracy | 0.7312 | TEMI DINO ViT-B |
| Image Clustering | ImageNet-200 | ARI | 0.6231 | TEMI DINO ViT-B |
| Image Clustering | ImageNet-200 | NMI | 0.852 | TEMI DINO ViT-B |
| Image Clustering | ImageNet-50 (TEMI Split) | Accuracy | 0.8827 | TEMI CLIP ViT-L (openai) |
| Image Clustering | ImageNet-50 (TEMI Split) | ARI | 0.8272 | TEMI CLIP ViT-L (openai) |
| Image Clustering | ImageNet-50 (TEMI Split) | NMI | 0.9232 | TEMI CLIP ViT-L (openai) |
| Image Clustering | ImageNet-50 (TEMI Split) | Accuracy | 0.8487 | TEMI MSN ViT-L |
| Image Clustering | ImageNet-50 (TEMI Split) | ARI | 0.7646 | TEMI MSN ViT-L |
| Image Clustering | ImageNet-50 (TEMI Split) | NMI | 0.8814 | TEMI MSN ViT-L |
| Image Clustering | ImageNet-50 (TEMI Split) | Accuracy | 0.801 | TEMI DINO ViT-B |
| Image Clustering | ImageNet-50 (TEMI Split) | ARI | 0.7093 | TEMI DINO ViT-B |
| Image Clustering | ImageNet-50 (TEMI Split) | NMI | 0.861 | TEMI DINO ViT-B |
| Image Clustering | STL-10 | ARI | 0.968 | TEMI DINO ViT-B |
| Image Clustering | STL-10 | Accuracy | 0.985 | TEMI DINO ViT-B |
| Image Clustering | STL-10 | NMI | 0.965 | TEMI DINO ViT-B |
| Image Clustering | ImageNet | ARI | 48.4 | TEMI MSN (ViT-L) |
| Image Clustering | ImageNet | Accuracy | 61.6 | TEMI MSN (ViT-L) |
| Image Clustering | ImageNet | NMI | 82.5 | TEMI MSN (ViT-L) |
| Image Clustering | ImageNet | ARI | 45.9 | TEMI DINO (ViT-B) |
| Image Clustering | ImageNet | Accuracy | 58 | TEMI DINO (ViT-B) |
| Image Clustering | ImageNet | NMI | 81.4 | TEMI DINO (ViT-B) |
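
The three metrics reported above (clustering accuracy, ARI, NMI) compare predicted cluster ids against ground-truth labels. The sketch below shows how they are commonly defined, using only the Python standard library. It is not the paper's evaluation code: the function names are hypothetical, the accuracy routine brute-forces the cluster-to-label matching (in practice a Hungarian solver such as `scipy.optimize.linear_sum_assignment` is the usual choice) and assumes no more clusters than classes, and the NMI normalisation by the arithmetic mean of the two entropies is an assumption, since several conventions exist.

```python
import math
from collections import Counter
from itertools import permutations


def clustering_accuracy(labels, clusters):
    """Best accuracy over one-to-one mappings of cluster ids to label ids."""
    label_ids = sorted(set(labels))
    cluster_ids = sorted(set(clusters))
    counts = Counter(zip(clusters, labels))
    best = 0
    # Brute force: only feasible for a handful of clusters.
    for perm in permutations(label_ids, len(cluster_ids)):
        hits = sum(counts[(c, l)] for c, l in zip(cluster_ids, perm))
        best = max(best, hits)
    return best / len(labels)


def adjusted_rand_index(labels, clusters):
    """Adjusted Rand index from pair counts (assumes at least two items)."""
    n = len(labels)
    joint = sum(math.comb(v, 2) for v in Counter(zip(labels, clusters)).values())
    pairs_a = sum(math.comb(v, 2) for v in Counter(labels).values())
    pairs_b = sum(math.comb(v, 2) for v in Counter(clusters).values())
    expected = pairs_a * pairs_b / math.comb(n, 2)
    maximum = (pairs_a + pairs_b) / 2
    if maximum == expected:  # degenerate case, e.g. both partitions trivial
        return 1.0
    return (joint - expected) / (maximum - expected)


def entropy(items):
    """Shannon entropy (in nats) of the empirical distribution of items."""
    n = len(items)
    return -sum(v / n * math.log(v / n) for v in Counter(items).values())


def normalized_mutual_info(labels, clusters):
    """Mutual information divided by the mean of the two entropies."""
    n = len(labels)
    pa, pb = Counter(labels), Counter(clusters)
    mi = sum(v / n * math.log(v * n / (pa[a] * pb[b]))
             for (a, b), v in Counter(zip(labels, clusters)).items())
    denom = (entropy(labels) + entropy(clusters)) / 2
    return mi / denom if denom > 0 else 1.0


# A perfect clustering with permuted ids, and an imperfect one.
perfect = ([0, 0, 1, 1, 2, 2], [1, 1, 0, 0, 2, 2])
imperfect = ([0, 0, 1, 1], [0, 0, 0, 1])
```

All three functions return fractions; multiplying by 100 gives the percent scale used for the ImageNet rows. Accuracy depends on the matching between cluster ids and labels, whereas ARI and NMI are invariant to how cluster ids are numbered.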
