Papers With Code 2

A community resource for machine learning research: papers, code, benchmarks, and state-of-the-art results.

Data sourced from the PWC Archive (CC-BY-SA 4.0). Built by the community, for the community.

ReCo: Retrieve and Co-segment for Zero-shot Transfer

Gyungin Shin, Weidi Xie, Samuel Albanie

2022-06-14
Unsupervised Semantic Segmentation with Language-image Pre-training, Unsupervised Semantic Segmentation, Segmentation, Semantic Segmentation, Retrieval

Abstract

Semantic segmentation has a broad range of applications, but its real-world impact has been significantly limited by the prohibitive annotation costs necessary to enable deployment. Segmentation methods that forgo supervision can side-step these costs, but exhibit the inconvenient requirement to provide labelled examples from the target distribution to assign concept names to predictions. An alternative line of work in language-image pre-training has recently demonstrated the potential to produce models that can both assign names across large vocabularies of concepts and enable zero-shot transfer for classification, but do not demonstrate commensurate segmentation abilities. In this work, we strive to achieve a synthesis of these two approaches that combines their strengths. We leverage the retrieval abilities of one such language-image pre-trained model, CLIP, to dynamically curate training sets from unlabelled images for arbitrary collections of concept names, and leverage the robust correspondences offered by modern image representations to co-segment entities among the resulting collections. The synthetic segment collections are then employed to construct a segmentation model (without requiring pixel labels) whose knowledge of concepts is inherited from the scalable pre-training process of CLIP. We demonstrate that our approach, termed Retrieve and Co-segment (ReCo), performs favourably to unsupervised segmentation approaches while inheriting the convenience of nameable predictions and zero-shot transfer. We also demonstrate ReCo's ability to generate specialist segmenters for extremely rare objects.
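
The retrieval step described in the abstract (using a language-image model to pick, from an unlabelled pool, the images that best match a concept name) can be illustrated with a minimal sketch. This is not the authors' implementation: the function name `retrieve_top_k` is hypothetical, and the tiny hand-written vectors stand in for embeddings that a model such as CLIP would produce. Only the ranking by cosine similarity is shown; the co-segmentation stage is not covered.

```python
import numpy as np

def retrieve_top_k(image_embeddings, text_embedding, k):
    """Rank images by cosine similarity to a text embedding and keep the top k.

    image_embeddings: (num_images, dim) array, one row per unlabelled image.
    text_embedding:   (dim,) array for a concept name (e.g. a prompt for "cat").
    Both are assumed to come from the same language-image model.
    """
    # L2-normalise so the dot product equals cosine similarity.
    img = image_embeddings / np.linalg.norm(image_embeddings, axis=1, keepdims=True)
    txt = text_embedding / np.linalg.norm(text_embedding)
    scores = img @ txt
    # Sort by descending similarity and keep the k best indices.
    order = np.argsort(-scores)[:k]
    return order, scores[order]

# Toy stand-ins for real embeddings (assumption: 2-D vectors for readability).
image_embeddings = np.array([[1.0, 0.0], [0.0, 1.0], [1.0, 1.0], [-1.0, 0.0]])
text_embedding = np.array([2.0, 0.0])

indices, scores = retrieve_top_k(image_embeddings, text_embedding, k=2)
print(indices, scores)
```

In this toy case the first image points in the same direction as the text vector, so it ranks first; the retrieved indices would then form the per-concept collection that a later stage co-segments.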

Results

Task                                Dataset            Metric          Value  Model
Semantic Segmentation               COCO-Stuff-171     mIoU            14.8   ReCo
Semantic Segmentation               COCO-Object        mIoU            15.7   ReCo
Semantic Segmentation               ADE20K             Mean IoU (val)  11.2   ReCo
Semantic Segmentation               Cityscapes val     mIoU            24.2   ReCo+
Semantic Segmentation               Cityscapes val     pixel accuracy  83.7   ReCo+
Semantic Segmentation               Cityscapes val     mIoU            19.3   ReCo
Semantic Segmentation               Cityscapes val     pixel accuracy  74.6   ReCo
Semantic Segmentation               PASCAL Context-59  mIoU            22.3   ReCo
Semantic Segmentation               PascalVOC-20       mIoU            57.7   ReCo
Semantic Segmentation               KITTI-STEP         mIoU            31.9   ReCo+
Semantic Segmentation               KITTI-STEP         pixel accuracy  75.3   ReCo+
Semantic Segmentation               KITTI-STEP         mIoU            29.8   ReCo
Semantic Segmentation               KITTI-STEP         pixel accuracy  70.6   ReCo
Semantic Segmentation               COCO-Stuff-27      mIoU            32.6   ReCo+
Semantic Segmentation               COCO-Stuff-27      pixel accuracy  54.1   ReCo+
Semantic Segmentation               COCO-Stuff-27      mIoU            26.3   ReCo
Semantic Segmentation               COCO-Stuff-27      pixel accuracy  46.1   ReCo
Unsupervised Semantic Segmentation  COCO-Stuff-171     mIoU            14.8   ReCo
Unsupervised Semantic Segmentation  COCO-Object        mIoU            15.7   ReCo
Unsupervised Semantic Segmentation  ADE20K             Mean IoU (val)  11.2   ReCo
Unsupervised Semantic Segmentation  Cityscapes val     mIoU            24.2   ReCo+
Unsupervised Semantic Segmentation  Cityscapes val     pixel accuracy  83.7   ReCo+
Unsupervised Semantic Segmentation  Cityscapes val     mIoU            19.3   ReCo
Unsupervised Semantic Segmentation  Cityscapes val     pixel accuracy  74.6   ReCo
Unsupervised Semantic Segmentation  PASCAL Context-59  mIoU            22.3   ReCo
Unsupervised Semantic Segmentation  PascalVOC-20       mIoU            57.7   ReCo
Unsupervised Semantic Segmentation  KITTI-STEP         mIoU            31.9   ReCo+
Unsupervised Semantic Segmentation  KITTI-STEP         pixel accuracy  75.3   ReCo+
Unsupervised Semantic Segmentation  KITTI-STEP         mIoU            29.8   ReCo
Unsupervised Semantic Segmentation  KITTI-STEP         pixel accuracy  70.6   ReCo
Unsupervised Semantic Segmentation  COCO-Stuff-27      mIoU            32.6   ReCo+
Unsupervised Semantic Segmentation  COCO-Stuff-27      pixel accuracy  54.1   ReCo+
Unsupervised Semantic Segmentation  COCO-Stuff-27      mIoU            26.3   ReCo
Unsupervised Semantic Segmentation  COCO-Stuff-27      pixel accuracy  46.1   ReCo
10-shot image generation            COCO-Stuff-171     mIoU            14.8   ReCo
10-shot image generation            COCO-Object        mIoU            15.7   ReCo
10-shot image generation            ADE20K             Mean IoU (val)  11.2   ReCo
10-shot image generation            Cityscapes val     mIoU            24.2   ReCo+
10-shot image generation            Cityscapes val     pixel accuracy  83.7   ReCo+
10-shot image generation            Cityscapes val     mIoU            19.3   ReCo
10-shot image generation            Cityscapes val     pixel accuracy  74.6   ReCo
10-shot image generation            PASCAL Context-59  mIoU            22.3   ReCo
10-shot image generation            PascalVOC-20       mIoU            57.7   ReCo
10-shot image generation            KITTI-STEP         mIoU            31.9   ReCo+
10-shot image generation            KITTI-STEP         pixel accuracy  75.3   ReCo+
10-shot image generation            KITTI-STEP         mIoU            29.8   ReCo
10-shot image generation            KITTI-STEP         pixel accuracy  70.6   ReCo
10-shot image generation            COCO-Stuff-27      mIoU            32.6   ReCo+
10-shot image generation            COCO-Stuff-27      pixel accuracy  54.1   ReCo+
10-shot image generation            COCO-Stuff-27      mIoU            26.3   ReCo
10-shot image generation            COCO-Stuff-27      pixel accuracy  46.1   ReCo
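
The two metrics reported above, mIoU and pixel accuracy, are commonly computed from a class confusion matrix. The sketch below shows the standard definitions on a toy label map. It is illustrative only: this page does not state the evaluation protocol used for each benchmark (ignored labels, class matching for unsupervised methods, and similar details), and the function names are hypothetical.

```python
import numpy as np

def confusion_matrix(pred, target, num_classes):
    """Count pixels per (true class, predicted class) pair."""
    pred = np.asarray(pred).ravel()
    target = np.asarray(target).ravel()
    # Encode each (target, pred) pair as one integer, then histogram the codes.
    codes = target * num_classes + pred
    counts = np.bincount(codes, minlength=num_classes ** 2)
    return counts.reshape(num_classes, num_classes)

def pixel_accuracy(cm):
    """Fraction of pixels whose predicted class equals the true class."""
    return np.trace(cm) / cm.sum()

def mean_iou(cm):
    """Mean over classes of intersection / union, skipping absent classes."""
    intersection = np.diag(cm)
    union = cm.sum(axis=0) + cm.sum(axis=1) - intersection
    present = union > 0
    return (intersection[present] / union[present]).mean()

# Toy 2x3 label maps with three classes (made-up values, not benchmark data).
target = np.array([[0, 0, 1], [1, 2, 2]])
pred = np.array([[0, 1, 1], [1, 2, 0]])

cm = confusion_matrix(pred, target, num_classes=3)
print(pixel_accuracy(cm), mean_iou(cm))
```

Pixel accuracy is dominated by large classes, whereas mIoU weights every class equally, which is one reason the two values in the table can differ widely on the same dataset.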

Related Papers

SeC: Advancing Complex Video Object Segmentation via Progressive Concept Construction (2025-07-21)
Deep Learning-Based Fetal Lung Segmentation from Diffusion-weighted MRI Images and Lung Maturity Evaluation for Fetal Growth Restriction (2025-07-17)
DiffOSeg: Omni Medical Image Segmentation via Multi-Expert Collaboration Diffusion Model (2025-07-17)
From Variability To Accuracy: Conditional Bernoulli Diffusion Models with Consensus-Driven Correction for Thin Structure Segmentation (2025-07-17)
Unleashing Vision Foundation Models for Coronary Artery Segmentation: Parallel ViT-CNN Encoding and Variational Fusion (2025-07-17)
SCORE: Scene Context Matters in Open-Vocabulary Remote Sensing Instance Segmentation (2025-07-17)
Unified Medical Image Segmentation with State Space Modeling Snake (2025-07-17)
A Privacy-Preserving Semantic-Segmentation Method Using Domain-Adaptation Technique (2025-07-17)