Data sourced from the PWC Archive (CC-BY-SA 4.0). Built by the community, for the community.

Foundation Model Assisted Weakly Supervised Semantic Segmentation

Xiaobo Yang, Xiaojin Gong

2023-12-06 | Weakly-Supervised Semantic Segmentation, Image Classification, Segmentation, Semantic Segmentation

Abstract

This work aims to leverage pre-trained foundation models, such as contrastive language-image pre-training (CLIP) and the segment anything model (SAM), to address weakly supervised semantic segmentation (WSSS) using image-level labels. To this end, we propose a coarse-to-fine framework based on CLIP and SAM for generating high-quality segmentation seeds. Specifically, we construct an image classification task and a seed segmentation task, which are jointly performed by CLIP with frozen weights and two sets of learnable task-specific prompts. A SAM-based seeding (SAMS) module is designed and applied to each task to produce either coarse or fine seed maps. Moreover, we design a multi-label contrastive loss supervised by image-level labels and a CAM activation loss supervised by the generated coarse seed map. These losses are used to learn the prompts, which are the only parts that need to be learned in our framework. Once the prompts are learned, we input each image along with the learned segmentation-specific prompts into CLIP and the SAMS module to produce high-quality segmentation seeds. As in other two-stage WSSS methods, these seeds serve as pseudo labels to train an off-the-shelf segmentation network. Experiments show that our method achieves state-of-the-art performance on PASCAL VOC 2012 and competitive results on MS COCO 2014. Code is available at https://github.com/HAL-42/FMA-WSSS.git.
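To make the notion of a segmentation seed under image-level supervision more concrete, here is a minimal generic sketch. It is not the paper's SAMS module or its prompt-learning procedure, neither of which is specified in the abstract. The function name `cam_to_seed`, the background threshold `bg_thresh`, and the toy score maps are all assumptions made for illustration. The sketch shows only a step common to many WSSS pipelines: classes absent from the image-level label are masked out, and each pixel takes the highest-scoring remaining class, falling back to background when no score clears the threshold.

```python
from typing import Dict, List

BACKGROUND = 0  # conventional background index (assumption for this sketch)


def cam_to_seed(
    cams: Dict[int, List[List[float]]],
    image_labels: List[int],
    bg_thresh: float = 0.3,
) -> List[List[int]]:
    """Convert per-class activation maps into a seed (pseudo-label) map.

    cams maps a foreground class id to an H x W grid of scores in [0, 1].
    Only classes listed in image_labels are allowed to appear in the seed.
    """
    present = [c for c in image_labels if c in cams]
    if not present:
        raise ValueError("no activation map for any labelled class")
    height = len(cams[present[0]])
    width = len(cams[present[0]][0])
    seed = [[BACKGROUND] * width for _ in range(height)]
    for y in range(height):
        for x in range(width):
            # A pixel stays background unless some labelled class beats bg_thresh.
            best_class, best_score = BACKGROUND, bg_thresh
            for c in present:
                score = cams[c][y][x]
                if score > best_score:
                    best_class, best_score = c, score
            seed[y][x] = best_class
    return seed


# Toy example: maps exist for classes 1 and 2, but only class 1 is labelled.
toy_cams = {
    1: [[0.9, 0.2], [0.6, 0.1]],
    2: [[0.1, 0.8], [0.1, 0.1]],
}
toy_seed = cam_to_seed(toy_cams, image_labels=[1])
print(toy_seed)  # [[1, 0], [1, 0]]
```

In the toy example, the strong class-2 response in the top-right pixel is discarded because class 2 is not in the image-level label, which is how image-level supervision constrains the seed.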

Results

| Task | Dataset | Metric | Value (%) | Model |
| Semantic Segmentation | COCO 2014 val | mIoU | 55.4 | FMA-WSSS (Swin-L) |
| Semantic Segmentation | PASCAL VOC 2012 train | Mean IoU | 80.4 | FMA-WSSS |
| Semantic Segmentation | PASCAL VOC 2012 val | Mean IoU | 82.6 | FMA-WSSS (Swin-L) |
| Semantic Segmentation | PASCAL VOC 2012 test | Mean IoU | 81.6 | FMA-WSSS (Swin-L) |
| 10-shot image generation | COCO 2014 val | mIoU | 55.4 | FMA-WSSS (Swin-L) |
| 10-shot image generation | PASCAL VOC 2012 train | Mean IoU | 80.4 | FMA-WSSS |
| 10-shot image generation | PASCAL VOC 2012 val | Mean IoU | 82.6 | FMA-WSSS (Swin-L) |
| 10-shot image generation | PASCAL VOC 2012 test | Mean IoU | 81.6 | FMA-WSSS (Swin-L) |
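
The results above are reported as mean intersection-over-union (mIoU). As a reminder of what that metric measures, the following is a minimal pure-Python sketch, not the official evaluation code of either benchmark. The function name `mean_iou`, the flat label lists, and the choice to skip pixels labelled `255` are assumptions for illustration. Counts are accumulated over every pixel passed in, so dataset-level mIoU would be obtained by concatenating the pixels of all images before the call.

```python
from typing import Sequence

IGNORE_LABEL = 255  # ground-truth value skipped during scoring (assumption)


def mean_iou(pred: Sequence[int], target: Sequence[int], num_classes: int) -> float:
    """Mean IoU over classes that occur in the prediction or the ground truth."""
    if len(pred) != len(target):
        raise ValueError("pred and target must have the same length")
    intersection = [0] * num_classes
    pred_count = [0] * num_classes
    target_count = [0] * num_classes
    for p, t in zip(pred, target):
        if t == IGNORE_LABEL:
            continue  # unlabeled pixels do not affect the score
        pred_count[p] += 1
        target_count[t] += 1
        if p == t:
            intersection[p] += 1
    ious = []
    for c in range(num_classes):
        # |A union B| = |A| + |B| - |A intersect B|
        union = pred_count[c] + target_count[c] - intersection[c]
        if union > 0:
            ious.append(intersection[c] / union)
    return sum(ious) / len(ious) if ious else 0.0


# Toy example with three classes; the last pixel is ignored.
toy_pred = [0, 0, 1, 1, 2, 2]
toy_target = [0, 1, 1, 1, 2, 255]
toy_score = mean_iou(toy_pred, toy_target, num_classes=3)
print(f"{100 * toy_score:.1f}")  # 72.2
```

Here the per-class IoUs are 1/2, 2/3, and 1, so the mean is about 0.722, which would be reported as 72.2 on the percentage scale used in the table.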
