Data sourced from the PWC Archive (CC-BY-SA 4.0). Built by the community, for the community.

Scaling up Multi-domain Semantic Segmentation with Sentence Embeddings

Wei Yin, Yifan Liu, Chunhua Shen, Baichuan Sun, Anton Van Den Hengel

2022-02-04
Segmentation, Semantic Segmentation, Sentence Embeddings, Depth Estimation, Instance Segmentation, Monocular Depth Estimation

Abstract

We propose an approach to semantic segmentation that achieves state-of-the-art supervised performance when applied in a zero-shot setting. It thus achieves results equivalent to those of the supervised methods, on each of the major semantic segmentation datasets, without training on those datasets. This is achieved by replacing each class label with a vector-valued embedding of a short paragraph that describes the class. The generality and simplicity of this approach enable merging multiple datasets from different domains, each with varying class labels and semantics. The resulting merged semantic segmentation dataset of over 2 million images enables training a model that achieves performance equal to that of state-of-the-art supervised methods on 7 benchmark datasets, despite not using any images therefrom. By fine-tuning the model on standard semantic segmentation datasets, we also achieve a significant improvement over the state-of-the-art supervised segmentation on NYUD-V2 and PASCAL-Context, reaching 60% and 65% mIoU, respectively. Based on the closeness of language embeddings, our method can even segment unseen labels. Extensive experiments demonstrate strong generalization to unseen image domains and unseen labels, and that the method enables impressive performance improvements in downstream applications, including depth estimation and instance segmentation.
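The core idea in the abstract, replacing each class label with an embedding of a description and classifying by closeness in that embedding space, can be sketched as follows. This is a minimal illustration and not the authors' implementation: the function names are hypothetical, the three-dimensional vectors are toy stand-ins for real sentence embeddings, and the per-pixel features are assumed to come from some segmentation backbone that is not shown.

```python
import numpy as np

def l2_normalize(x, axis=-1, eps=1e-8):
    """Scale vectors to unit length so a dot product equals cosine similarity."""
    return x / (np.linalg.norm(x, axis=axis, keepdims=True) + eps)

def segment_by_embedding(pixel_emb, class_emb):
    """Assign each pixel the class whose description embedding is closest.

    pixel_emb: (H, W, D) per-pixel features (assumed output of a backbone).
    class_emb: (C, D) one embedding per class description.
    Returns an (H, W) array of class indices.
    """
    p = l2_normalize(pixel_emb)
    c = l2_normalize(class_emb)
    similarity = p @ c.T  # (H, W, C) cosine similarities
    return similarity.argmax(axis=-1)

# Toy stand-ins for embeddings of class descriptions (hypothetical values).
class_names = ["road", "car", "sky"]
class_emb = np.array([[1.0, 0.0, 0.0],
                      [0.0, 1.0, 0.0],
                      [0.0, 0.0, 1.0]])

# Toy 2x2 "image" of per-pixel embeddings.
pixel_emb = np.array([[[0.9, 0.1, 0.0], [0.1, 0.8, 0.1]],
                      [[0.0, 0.2, 0.7], [0.6, 0.3, 0.1]]])

pred = segment_by_embedding(pixel_emb, class_emb)
labels = [[class_names[i] for i in row] for row in pred]
print(labels)
```

Because the label set enters only through the rows of `class_emb`, datasets with different label sets can share one model in this sketch, and a new label can be queried at inference time by appending the embedding of its description as an extra row.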

Results

Task | Dataset | Metric | Value | Model
Depth Estimation | KITTI Eigen split | absolute relative error | 0.14 | SIW
Semantic Segmentation | CamVid | Mean IoU | 83.7 | SIW
Semantic Segmentation | KITTI Semantic Segmentation | Mean IoU (class) | 68.9 | SIW
Semantic Segmentation | WildDash | Mean IoU | 69.7 | SIW
Semantic Segmentation | PASCAL Context | mIoU | 54.2 | SIW(Segformer-B5)
Semantic Segmentation | PASCAL VOC 2010 test | Mean IoU | 81.1 | SIW
3D | KITTI Eigen split | absolute relative error | 0.14 | SIW
Instance Segmentation | COCO minival | mask AP | 41.4 | SIW
10-shot image generation | CamVid | Mean IoU | 83.7 | SIW
10-shot image generation | KITTI Semantic Segmentation | Mean IoU (class) | 68.9 | SIW
10-shot image generation | WildDash | Mean IoU | 69.7 | SIW
10-shot image generation | PASCAL Context | mIoU | 54.2 | SIW(Segformer-B5)
10-shot image generation | PASCAL VOC 2010 test | Mean IoU | 81.1 | SIW
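For readers unfamiliar with the two main metrics in the results above, the sketch below shows one common way to compute mean IoU and absolute relative depth error. It is an illustration under stated assumptions, not the evaluation code behind these numbers: the function names are hypothetical, mean IoU is computed on a single label map (benchmarks typically accumulate counts over the whole test set and report percentages), and pixels with non-positive ground-truth depth are assumed to be invalid.

```python
import numpy as np

def mean_iou(pred, target, num_classes):
    """Average intersection-over-union across classes present in pred or target."""
    ious = []
    for c in range(num_classes):
        p = pred == c
        t = target == c
        union = np.logical_or(p, t).sum()
        if union == 0:
            continue  # class absent from both maps; skip it
        ious.append(np.logical_and(p, t).sum() / union)
    return float(np.mean(ious))

def abs_rel_error(pred_depth, gt_depth):
    """Mean of |prediction - ground truth| / ground truth over valid pixels."""
    valid = gt_depth > 0  # assumption: non-positive depth marks missing data
    diff = np.abs(pred_depth[valid] - gt_depth[valid])
    return float(np.mean(diff / gt_depth[valid]))

# Toy label maps: class 0 has IoU 1/2, class 1 has IoU 2/3.
pred = np.array([[0, 0], [1, 1]])
target = np.array([[0, 1], [1, 1]])
miou = mean_iou(pred, target, num_classes=2)

# Toy depths: the third pixel has no ground truth and is ignored.
pred_depth = np.array([2.0, 4.0, 10.0])
gt_depth = np.array([2.0, 5.0, 0.0])
rel = abs_rel_error(pred_depth, gt_depth)

print(miou, rel)
```

Higher is better for mean IoU, and lower is better for absolute relative error.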

Related Papers

SeC: Advancing Complex Video Object Segmentation via Progressive Concept Construction (2025-07-21)
From Neurons to Semantics: Evaluating Cross-Linguistic Alignment Capabilities of Large Language Models via Neurons Alignment (2025-07-20)
Deep Learning-Based Fetal Lung Segmentation from Diffusion-weighted MRI Images and Lung Maturity Evaluation for Fetal Growth Restriction (2025-07-17)
DiffOSeg: Omni Medical Image Segmentation via Multi-Expert Collaboration Diffusion Model (2025-07-17)
From Variability To Accuracy: Conditional Bernoulli Diffusion Models with Consensus-Driven Correction for Thin Structure Segmentation (2025-07-17)
Unleashing Vision Foundation Models for Coronary Artery Segmentation: Parallel ViT-CNN Encoding and Variational Fusion (2025-07-17)
SCORE: Scene Context Matters in Open-Vocabulary Remote Sensing Instance Segmentation (2025-07-17)
Unified Medical Image Segmentation with State Space Modeling Snake (2025-07-17)