Open-Vocabulary Segmentation with Semantic-Assisted Calibration

Yong Liu, Sule Bai, Guanbin Li, Yitong Wang, Yansong Tang

Published: 2023-12-07
Venue: CVPR 2024
Tasks: Attribute, Open Vocabulary Semantic Segmentation

Abstract

This paper studies open-vocabulary segmentation (OVS) by calibrating the in-vocabulary and domain-biased embedding space with the generalized contextual prior of CLIP. As the core of open-vocabulary understanding, the alignment of visual content with the semantics of unbounded text has become the bottleneck of this field. To address this challenge, recent works propose to use CLIP as an additional classifier and aggregate model predictions with CLIP classification results. Despite their remarkable progress, the performance of OVS methods in relevant scenarios is still unsatisfactory compared with supervised counterparts. We attribute this to the in-vocabulary embedding and domain-biased CLIP prediction. To this end, we present a Semantic-assisted CAlibration Network (SCAN). In SCAN, we incorporate the generalized semantic prior of CLIP into the proposal embedding to avoid collapsing on known categories. In addition, a contextual shift strategy is applied to mitigate the lack of global context and unnatural background noise. With these designs, SCAN achieves state-of-the-art performance on all popular open-vocabulary segmentation benchmarks. Furthermore, we address a problem in the existing evaluation system, which ignores semantic duplication across categories, and propose a new metric called Semantic-Guided IoU (SG-IoU).
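The abstract mentions that recent works aggregate a segmentation model's predictions with CLIP classification results. As a rough illustration of that general idea only, the sketch below fuses two per-class probability vectors with a weighted geometric mean. This is not SCAN's calibration method, and the function name `fuse_predictions` and the weight `alpha` are hypothetical choices made for this example.

```python
from typing import List, Sequence


def fuse_predictions(
    model_probs: Sequence[float],
    clip_probs: Sequence[float],
    alpha: float = 0.5,
) -> List[float]:
    """Fuse two class-probability vectors with a weighted geometric mean.

    alpha (hypothetical parameter) is the weight given to the CLIP
    prediction; 1 - alpha goes to the segmentation model's prediction.
    """
    if len(model_probs) != len(clip_probs):
        raise ValueError("probability vectors must have the same length")
    fused = [
        (m ** (1.0 - alpha)) * (c ** alpha)
        for m, c in zip(model_probs, clip_probs)
    ]
    total = sum(fused)
    if total == 0.0:
        # Degenerate case: fall back to a uniform distribution.
        return [1.0 / len(fused)] * len(fused)
    # Renormalize so the fused scores sum to 1.
    return [f / total for f in fused]


# The model is undecided between two classes; CLIP prefers the first.
fused = fuse_predictions([0.5, 0.5], [0.9, 0.1], alpha=0.5)
```

With `alpha = 0` the result reduces to the model's own prediction, and with `alpha = 1` to the CLIP prediction, so the weight controls how much the fused score leans on CLIP.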

Results

Task                                    Dataset              Metric  Value  Model
Open Vocabulary Semantic Segmentation   ADE20K-847           mIoU    14     SCAN
Open Vocabulary Semantic Segmentation   PASCAL Context-459   mIoU    16.7   SCAN
Open Vocabulary Semantic Segmentation   PascalVOC-20         mIoU    97.2   SCAN
Open Vocabulary Semantic Segmentation   PASCAL Context-59    mIoU    59.3   SCAN
Open Vocabulary Semantic Segmentation   ADE20K-150           mIoU    33.5   SCAN
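
The results above are reported as mIoU (mean intersection over union). For readers unfamiliar with the metric, here is a minimal sketch of the standard mIoU computation over flattened label maps. It assumes predictions are valid class indices, uses a hypothetical `ignore_index` of 255 for unlabeled pixels, and averages over classes that appear in either the prediction or the ground truth; benchmark evaluation scripts may differ in these details. It does not implement the SG-IoU metric proposed in the paper.

```python
from typing import Sequence


def mean_iou(
    pred: Sequence[int],
    target: Sequence[int],
    num_classes: int,
    ignore_index: int = 255,
) -> float:
    """Mean IoU over classes present in the prediction or ground truth."""
    intersection = [0] * num_classes
    pred_area = [0] * num_classes
    target_area = [0] * num_classes

    for p, t in zip(pred, target):
        if t == ignore_index:
            continue  # skip unlabeled pixels
        pred_area[p] += 1
        target_area[t] += 1
        if p == t:
            intersection[p] += 1

    ious = []
    for c in range(num_classes):
        union = pred_area[c] + target_area[c] - intersection[c]
        if union > 0:  # class absent everywhere is left out of the mean
            ious.append(intersection[c] / union)
    return sum(ious) / len(ious) if ious else 0.0


# Four pixels, two classes: class 0 IoU = 1/2, class 1 IoU = 2/3.
score = mean_iou(pred=[0, 0, 1, 1], target=[0, 1, 1, 1], num_classes=2)
```

Multiplying the returned value by 100 gives the percentage scale used in the table.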
