
Data sourced from the PWC Archive (CC-BY-SA 4.0). Built by the community, for the community.


CAT-Seg: Cost Aggregation for Open-Vocabulary Semantic Segmentation

Seokju Cho, Heeseong Shin, Sunghwan Hong, Anurag Arnab, Paul Hongsuck Seo, Seungryong Kim

2023-03-21 | CVPR 2024 1 | Open Vocabulary Semantic Segmentation | Segmentation | text similarity | Semantic Segmentation | Open-Vocabulary Semantic Segmentation | Image Segmentation

Abstract

Open-vocabulary semantic segmentation presents the challenge of labeling each pixel within an image based on a wide range of text descriptions. In this work, we introduce a novel cost-based approach to adapt vision-language foundation models, notably CLIP, for the intricate task of semantic segmentation. Through aggregating the cosine similarity score, i.e., the cost volume between image and text embeddings, our method potently adapts CLIP for segmenting seen and unseen classes by fine-tuning its encoders, addressing the challenges faced by existing methods in handling unseen classes. Building upon this, we explore methods to effectively aggregate the cost volume considering its multi-modal nature of being established between image and text embeddings. Furthermore, we examine various methods for efficiently fine-tuning CLIP.
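
The cost volume described in the abstract can be illustrated with a minimal sketch: for every pixel embedding and every class text embedding, compute the cosine similarity, giving an H x W x N volume. The function names, the nested-list layout, and the toy embeddings below are assumptions made for illustration and are not taken from the CAT-Seg code. The per-pixel argmax at the end is only a naive baseline for reading labels off the raw volume; the paper's contribution is a learned aggregation of this volume, which is not reproduced here.

```python
import math


def l2_normalize(vec):
    """Scale a vector to unit length; a zero vector is returned unchanged."""
    norm = math.sqrt(sum(x * x for x in vec))
    if norm == 0.0:
        return list(vec)
    return [x / norm for x in vec]


def cost_volume(image_embeddings, text_embeddings):
    """Cosine similarity between each pixel embedding and each text embedding.

    image_embeddings: H x W x D nested lists (one D-dim vector per pixel).
    text_embeddings:  N x D nested lists (one D-dim vector per class prompt).
    Returns an H x W x N nested list of similarities in [-1, 1].
    """
    text_unit = [l2_normalize(t) for t in text_embeddings]
    volume = []
    for row in image_embeddings:
        out_row = []
        for pixel in row:
            p = l2_normalize(pixel)
            out_row.append([sum(a * b for a, b in zip(p, t)) for t in text_unit])
        volume.append(out_row)
    return volume


def argmax_labels(volume):
    """Naive baseline (not the paper's aggregation): best-scoring class per pixel."""
    return [
        [max(range(len(scores)), key=scores.__getitem__) for scores in row]
        for row in volume
    ]


# Toy example: a 1 x 2 "image" with 2-dim embeddings and two class prompts.
image = [[[1.0, 0.0], [0.0, 2.0]]]
text = [[3.0, 0.0], [0.0, 1.0]]
volume = cost_volume(image, text)
labels = argmax_labels(volume)
```

In this toy case the first pixel is aligned with the first text embedding and the second pixel with the second, so the vector lengths do not matter, only the directions.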

Results

Task | Dataset | Metric | Value | Model
Open Vocabulary Semantic Segmentation | ADE20K-847 | mIoU | 16 | CAT-Seg
Open Vocabulary Semantic Segmentation | PascalVOC-20b | mIoU | 82.5 | CAT-Seg
Open Vocabulary Semantic Segmentation | PASCAL Context-459 | mIoU | 23.8 | CAT-Seg
Open Vocabulary Semantic Segmentation | PascalVOC-20 | mIoU | 97 | CAT-Seg
Open Vocabulary Semantic Segmentation | PASCAL Context-59 | mIoU | 63.3 | CAT-Seg
Open Vocabulary Semantic Segmentation | ADE20K-150 | mIoU | 37.9 | CAT-Seg
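
For readers unfamiliar with the metric in the table, mIoU (mean intersection over union) can be sketched as follows. This is a generic illustration, not the evaluation code behind the reported numbers: the function name, the flat-list input format, and the choice to average only over classes that appear in the prediction or the ground truth are assumptions, and benchmark scripts may differ (for example in how ignored pixels are handled). The table values appear to be percentages, so the fraction returned here would be multiplied by 100.

```python
def mean_iou(pred, target, num_classes):
    """Mean intersection-over-union over flat lists of per-pixel class indices.

    Assumption: classes absent from both pred and target are skipped
    rather than counted as zero.
    """
    intersection = [0] * num_classes
    union = [0] * num_classes
    for p, t in zip(pred, target):
        if p == t:
            # Pixel counts once toward both intersection and union of its class.
            intersection[p] += 1
            union[p] += 1
        else:
            # Mismatch: pixel belongs to the union of both classes involved.
            union[p] += 1
            union[t] += 1
    ious = [i / u for i, u in zip(intersection, union) if u > 0]
    return sum(ious) / len(ious) if ious else 0.0


# Toy example: four pixels, two classes.
score = mean_iou([0, 0, 1, 1], [0, 1, 1, 1], num_classes=2)
```

Here class 0 has IoU 1/2 and class 1 has IoU 2/3, so the mean is their average.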

Related Papers

SeC: Advancing Complex Video Object Segmentation via Progressive Concept Construction (2025-07-21)
Deep Learning-Based Fetal Lung Segmentation from Diffusion-weighted MRI Images and Lung Maturity Evaluation for Fetal Growth Restriction (2025-07-17)
DiffOSeg: Omni Medical Image Segmentation via Multi-Expert Collaboration Diffusion Model (2025-07-17)
From Variability To Accuracy: Conditional Bernoulli Diffusion Models with Consensus-Driven Correction for Thin Structure Segmentation (2025-07-17)
Unleashing Vision Foundation Models for Coronary Artery Segmentation: Parallel ViT-CNN Encoding and Variational Fusion (2025-07-17)
SCORE: Scene Context Matters in Open-Vocabulary Remote Sensing Instance Segmentation (2025-07-17)
Unified Medical Image Segmentation with State Space Modeling Snake (2025-07-17)
A Privacy-Preserving Semantic-Segmentation Method Using Domain-Adaptation Technique (2025-07-17)