Data sourced from the PWC Archive (CC-BY-SA 4.0). Built by the community, for the community.

SemiVL: Semi-Supervised Semantic Segmentation with Vision-Language Guidance

Lukas Hoyer, David Joseph Tan, Muhammad Ferjad Naeem, Luc van Gool, Federico Tombari

2023-11-27 | Semi-Supervised Semantic Segmentation, Segmentation, Semantic Segmentation

Abstract

In semi-supervised semantic segmentation, a model is trained with a limited number of labeled images along with a large corpus of unlabeled images to reduce the high annotation effort. While previous methods are able to learn good segmentation boundaries, they are prone to confuse classes with similar visual appearance due to the limited supervision. On the other hand, vision-language models (VLMs) are able to learn diverse semantic knowledge from image-caption datasets but produce noisy segmentation due to the image-level training. In SemiVL, we propose to integrate rich priors from VLM pre-training into semi-supervised semantic segmentation to learn better semantic decision boundaries. To adapt the VLM from global to local reasoning, we introduce a spatial fine-tuning strategy for label-efficient learning. Further, we design a language-guided decoder to jointly reason over vision and language. Finally, we propose to handle inherent ambiguities in class labels by providing the model with language guidance in the form of class definitions. We evaluate SemiVL on 4 semantic segmentation datasets, where it significantly outperforms previous semi-supervised methods. For instance, SemiVL improves the state-of-the-art by +13.5 mIoU on COCO with 232 annotated images and by +6.1 mIoU on Pascal VOC with 92 labels. Project page: https://github.com/google-research/semivl
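The abstract notes that a VLM can be used for segmentation but gives noisy results because it was trained at image level. As a rough illustration of that baseline idea only, the sketch below labels each pixel with the class whose text embedding is most similar to the pixel's visual embedding. This is not SemiVL's language-guided decoder or its spatial fine-tuning; the function names and the toy 2-D embeddings are hypothetical, and in practice the embeddings would come from a VLM's image and text encoders.

```python
import math


def cosine(u, v):
    """Cosine similarity between two equal-length vectors."""
    dot = sum(a * b for a, b in zip(u, v))
    norm = math.sqrt(sum(a * a for a in u)) * math.sqrt(sum(b * b for b in v))
    return dot / norm if norm > 0 else 0.0


def similarity_segmentation(pixel_embeddings, text_embeddings):
    """Label every pixel with the index of its most similar class text embedding.

    pixel_embeddings: H x W grid (nested lists) of feature vectors.
    text_embeddings:  one feature vector per class (hypothetical stand-ins
                      for encoded class names or class definitions).
    """
    num_classes = len(text_embeddings)
    return [
        [
            max(range(num_classes), key=lambda c: cosine(px, text_embeddings[c]))
            for px in row
        ]
        for row in pixel_embeddings
    ]


# Toy example with made-up 2-D embeddings for two classes.
text_embeddings = [[1.0, 0.0], [0.0, 1.0]]
pixel_embeddings = [
    [[0.9, 0.1], [0.2, 0.8]],
    [[1.0, 0.0], [0.4, 0.6]],
]
segmentation = similarity_segmentation(pixel_embeddings, text_embeddings)
```

Because each pixel is classified independently from features learned with image-level supervision, maps of this kind tend to be noisy, which is the gap the abstract says SemiVL addresses.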

Results

| Task | Dataset | Metric | Value | Model |
|---|---|---|---|---|
| Semantic Segmentation | COCO 1/512 labeled | Validation mIoU | 50.1 | SemiVL |
| Semantic Segmentation | COCO 1/256 labeled | Validation mIoU | 52.8 | SemiVL |
| Semantic Segmentation | ADE20K 1/16 labeled | Validation mIoU | 37.2 | SemiVL |
| Semantic Segmentation | PASCAL VOC 2012 92 labeled | Validation mIoU | 84 | SemiVL (ViT-B/16) |
| Semantic Segmentation | PASCAL VOC 2012 92 labeled | Validation mIoU | 77.9 | UniMatch (ViT-B/16) |
| Semantic Segmentation | ADE20K 1/32 labeled | Validation mIoU | 35.1 | SemiVL |
| Semantic Segmentation | PASCAL VOC 2012 732 labeled | Validation mIoU | 86.7 | SemiVL (ViT-B/16) |
| Semantic Segmentation | PASCAL VOC 2012 732 labeled | Validation mIoU | 83.3 | UniMatch (ViT-B/16) |
| Semantic Segmentation | PASCAL VOC 2012 1464 labels | Validation mIoU | 87.3 | SemiVL (ViT-B/16) |
| Semantic Segmentation | PASCAL VOC 2012 1464 labels | Validation mIoU | 84 | UniMatch (ViT-B/16) |
| Semantic Segmentation | COCO 1/128 labeled | Validation mIoU | 53.6 | SemiVL |
| Semantic Segmentation | COCO 1/64 labeled | Validation mIoU | 55.4 | SemiVL |
| Semantic Segmentation | Cityscapes 100 samples labeled | Validation mIoU | 76.2 | SemiVL (ViT-B/16) |
| Semantic Segmentation | PASCAL VOC 2012 366 labeled | Validation mIoU | 86 | SemiVL (ViT-B/16) |
| Semantic Segmentation | PASCAL VOC 2012 366 labeled | Validation mIoU | 82 | UniMatch (ViT-B/16) |
| Semantic Segmentation | Cityscapes 6.25% labeled | Validation mIoU | 77.9 | SemiVL (ViT-B/16) |
| Semantic Segmentation | COCO 1/32 labeled | Validation mIoU | 56.5 | SemiVL |
| Semantic Segmentation | PASCAL VOC 2012 183 labeled | Validation mIoU | 85.6 | SemiVL (ViT-B/16) |
| Semantic Segmentation | PASCAL VOC 2012 183 labeled | Validation mIoU | 80.1 | UniMatch (ViT-B/16) |
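Every result above is reported as validation mIoU (mean intersection-over-union, in percent). For readers unfamiliar with the metric, here is a minimal sketch of how it can be computed from predicted and ground-truth label maps. The function name, the flat-list input format, and the `ignore_index=255` default are assumptions for illustration; benchmark evaluation scripts typically accumulate the counts over the whole validation set rather than per image, and this is not the SemiVL evaluation code.

```python
from collections import Counter


def mean_iou(pred, target, num_classes, ignore_index=255):
    """Mean intersection-over-union over classes, in percent.

    pred, target: flat sequences of integer class ids, one per pixel.
    Pixels whose target equals ignore_index are skipped.
    Classes that appear in neither pred nor target are left out of the mean.
    """
    intersection = Counter()
    pred_area = Counter()
    target_area = Counter()
    for p, t in zip(pred, target):
        if t == ignore_index:
            continue
        pred_area[p] += 1
        target_area[t] += 1
        if p == t:
            intersection[p] += 1

    ious = []
    for c in range(num_classes):
        union = pred_area[c] + target_area[c] - intersection[c]
        if union > 0:
            ious.append(intersection[c] / union)
    return 100.0 * sum(ious) / len(ious) if ious else float("nan")


# Toy example: class 0 has IoU 1/2, class 1 has IoU 2/3.
score = mean_iou(pred=[0, 0, 1, 1], target=[0, 1, 1, 1], num_classes=2)

# Gain quoted in the abstract for Pascal VOC with 92 labels,
# recomputed from the two table entries (SemiVL 84 vs. UniMatch 77.9).
gain_voc_92 = round(84 - 77.9, 1)
```

The recomputed gain of 6.1 matches the +6.1 mIoU the abstract reports for Pascal VOC with 92 labels.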

Related Papers

SeC: Advancing Complex Video Object Segmentation via Progressive Concept Construction (2025-07-21)
Deep Learning-Based Fetal Lung Segmentation from Diffusion-weighted MRI Images and Lung Maturity Evaluation for Fetal Growth Restriction (2025-07-17)
DiffOSeg: Omni Medical Image Segmentation via Multi-Expert Collaboration Diffusion Model (2025-07-17)
From Variability To Accuracy: Conditional Bernoulli Diffusion Models with Consensus-Driven Correction for Thin Structure Segmentation (2025-07-17)
Unleashing Vision Foundation Models for Coronary Artery Segmentation: Parallel ViT-CNN Encoding and Variational Fusion (2025-07-17)
SCORE: Scene Context Matters in Open-Vocabulary Remote Sensing Instance Segmentation (2025-07-17)
Unified Medical Image Segmentation with State Space Modeling Snake (2025-07-17)
A Privacy-Preserving Semantic-Segmentation Method Using Domain-Adaptation Technique (2025-07-17)