Unsupervised Semantic Segmentation by Contrasting Object Mask Proposals

Wouter Van Gansbeke, Simon Vandenhende, Stamatios Georgoulis, Luc van Gool

2021-02-11 | ICCV 2021 10 | Unsupervised Pre-training, Unsupervised Semantic Segmentation, Semantic Segmentation, Clustering

Abstract

Being able to learn dense semantic representations of images without supervision is an important problem in computer vision. However, despite its significance, this problem remains rather unexplored, with a few exceptions that considered unsupervised semantic segmentation on small-scale datasets with a narrow visual domain. In this paper, we make a first attempt to tackle the problem on datasets that have been traditionally utilized for the supervised case. To achieve this, we introduce a two-step framework that adopts a predetermined mid-level prior in a contrastive optimization objective to learn pixel embeddings. This marks a large deviation from existing works that relied on proxy tasks or end-to-end clustering. Additionally, we argue about the importance of having a prior that contains information about objects, or their parts, and discuss several possibilities to obtain such a prior in an unsupervised manner. Experimental evaluation shows that our method comes with key advantages over existing works. First, the learned pixel embeddings can be directly clustered in semantic groups using K-Means on PASCAL. Under the fully unsupervised setting, there is no precedent in solving the semantic segmentation task on such a challenging benchmark. Second, our representations can improve over strong baselines when transferred to new datasets, e.g. COCO and DAVIS. The code is available.
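The abstract describes a contrastive objective in which a mask prior guides the learning of pixel embeddings. The sketch below is one simplified reading of that idea and not the authors' implementation: each pixel is pulled toward the mean embedding (here called a prototype) of the mask it belongs to and pushed away from the prototypes of other masks, using an InfoNCE-style loss. The function names, the `temperature` value, and the use of per-mask mean embeddings are assumptions made for illustration only.

```python
import math


def normalize(v):
    """Scale a vector to unit L2 norm (zero vectors are returned unchanged)."""
    n = math.sqrt(sum(x * x for x in v))
    return [x / n for x in v] if n > 0 else list(v)


def mask_prototypes(embeddings, mask_ids):
    """Mean embedding per mask id, re-normalized to unit length."""
    sums, counts = {}, {}
    for z, m in zip(embeddings, mask_ids):
        if m not in sums:
            sums[m] = [0.0] * len(z)
            counts[m] = 0
        sums[m] = [s + x for s, x in zip(sums[m], z)]
        counts[m] += 1
    return {m: normalize([s / counts[m] for s in sums[m]]) for m in sums}


def mask_contrastive_loss(embeddings, mask_ids, temperature=0.5):
    """InfoNCE-style loss: each pixel should match its own mask prototype.

    embeddings: list of per-pixel feature vectors
    mask_ids:   list of mask labels, one per pixel (the unsupervised prior)
    temperature: hypothetical value chosen for this example
    """
    embeddings = [normalize(z) for z in embeddings]
    protos = mask_prototypes(embeddings, mask_ids)
    total = 0.0
    for z, m in zip(embeddings, mask_ids):
        # Cosine similarity to every prototype, scaled by the temperature.
        logits = {
            k: sum(a * b for a, b in zip(z, p)) / temperature
            for k, p in protos.items()
        }
        # Numerically stable log-sum-exp over all prototypes.
        mx = max(logits.values())
        log_denom = mx + math.log(sum(math.exp(l - mx) for l in logits.values()))
        total += log_denom - logits[m]
    return total / len(embeddings)


if __name__ == "__main__":
    pixels = [[1.0, 0.0], [1.0, 0.0], [0.0, 1.0], [0.0, 1.0]]
    # Masks that agree with the embedding structure give a low loss ...
    print(mask_contrastive_loss(pixels, [0, 0, 1, 1]))
    # ... while masks that cut across it give a higher one.
    print(mask_contrastive_loss(pixels, [0, 1, 0, 1]))
```

In a real training setup the embeddings would come from a network and the loss would be minimized by gradient descent over many images; the toy example only shows that embeddings consistent with the mask prior score a lower loss than inconsistent ones.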

Results

Dataset | Metric | Value | Model
COCO-Stuff-81 | Pixel Accuracy | 8.8 | MaskContrast (ResNet-50)
COCO-Stuff-81 | mIoU | 3.7 | MaskContrast (ResNet-50)
ImageNet-S-50 | mIoU (test) | 24.2 | MaskContrast (+Saliency map)
ImageNet-S-50 | mIoU (val) | 24.6 | MaskContrast (+Saliency map)
PASCAL VOC 2012 val | Clustering [mIoU] | 44.2 | MaskContrast (Saliency)
PASCAL VOC 2012 val | Linear Classifier [mIoU] | 63.9 | MaskContrast (Saliency)
PASCAL VOC 2012 val | Clustering [mIoU] | 35 | MaskContrast
PASCAL VOC 2012 val | Linear Classifier [mIoU] | 58.4 | MaskContrast

The page lists these same eight rows under three task labels: Semantic Segmentation, Unsupervised Semantic Segmentation, and 10-shot image generation.
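Several results above are reported as Clustering [mIoU]. Cluster ids produced by K-Means carry no class meaning, so they have to be mapped to ground-truth classes before mean intersection-over-union can be computed. The sketch below illustrates that step under stated assumptions: it brute-forces every one-to-one mapping, which is only feasible for a handful of classes, and it picks the mapping that maximizes mIoU. Evaluations at benchmark scale typically solve the matching with the Hungarian algorithm (for example `scipy.optimize.linear_sum_assignment`); the exact protocol behind the numbers on this page is not stated here, and the function names are hypothetical.

```python
from itertools import permutations


def confusion(pred, gt, k):
    """k x k matrix with rows indexed by predicted class, columns by true class."""
    c = [[0] * k for _ in range(k)]
    for p, g in zip(pred, gt):
        c[p][g] += 1
    return c


def mean_iou(pred, gt, k):
    """Mean IoU over the classes that occur in either pred or gt."""
    c = confusion(pred, gt, k)
    ious = []
    for cls in range(k):
        inter = c[cls][cls]
        union = sum(c[cls]) + sum(row[cls] for row in c) - inter
        if union > 0:
            ious.append(inter / union)
    return sum(ious) / len(ious)


def clustering_miou(cluster_ids, gt, k):
    """Best mIoU over all one-to-one mappings from cluster ids to classes.

    Brute force over k! permutations: fine for a toy example, not for k = 21.
    """
    best = 0.0
    for perm in permutations(range(k)):
        remapped = [perm[c] for c in cluster_ids]
        best = max(best, mean_iou(remapped, gt, k))
    return best


if __name__ == "__main__":
    gt = [0, 0, 1, 1, 2, 2]
    # Same grouping as gt, only with different cluster names.
    print(clustering_miou([2, 2, 0, 0, 1, 1], gt, 3))
    # One pixel assigned to the wrong cluster.
    print(clustering_miou([2, 2, 0, 1, 1, 1], gt, 3))
```

The Linear Classifier [mIoU] rows need no such matching, since a classifier trained on labels already predicts class ids directly; there `mean_iou` alone would apply.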

Related Papers

SeC: Advancing Complex Video Object Segmentation via Progressive Concept Construction (2025-07-21)
Tri-Learn Graph Fusion Network for Attributed Graph Clustering (2025-07-18)
DiffOSeg: Omni Medical Image Segmentation via Multi-Expert Collaboration Diffusion Model (2025-07-17)
SCORE: Scene Context Matters in Open-Vocabulary Remote Sensing Instance Segmentation (2025-07-17)
Unified Medical Image Segmentation with State Space Modeling Snake (2025-07-17)
A Privacy-Preserving Semantic-Segmentation Method Using Domain-Adaptation Technique (2025-07-17)
SAMST: A Transformer framework based on SAM pseudo label filtering for remote sensing semi-supervised semantic segmentation (2025-07-16)
Ranking Vectors Clustering: Theory and Applications (2025-07-16)