Data sourced from the PWC Archive (CC-BY-SA 4.0). Built by the community, for the community.

Simpler is Better: Few-shot Semantic Segmentation with Classifier Weight Transformer

Zhihe Lu, Sen He, Xiatian Zhu, Li Zhang, Yi-Zhe Song, Tao Xiang

2021-08-06 | ICCV 2021 10 | Tasks: Meta-Learning, Few-Shot Semantic Segmentation, Semantic Segmentation

Abstract

A few-shot semantic segmentation model is typically composed of a CNN encoder, a CNN decoder and a simple classifier (separating foreground and background pixels). Most existing methods meta-learn all three model components for fast adaptation to a new class. However, given that as few as a single support-set image is available, effective adaptation of all three components to the new class is extremely challenging. In this work we propose to simplify the meta-learning task by focusing solely on the simplest component, the classifier, whilst leaving the encoder and decoder to pre-training. We hypothesize that if we pre-train an off-the-shelf segmentation model over a set of diverse training classes with sufficient annotations, the encoder and decoder can capture rich discriminative features applicable to any unseen class, rendering a subsequent meta-learning stage for them unnecessary. For the classifier meta-learning, we introduce a Classifier Weight Transformer (CWT) designed to dynamically adapt the weights of the classifier trained on the support set to each query image in an inductive way. Extensive experiments on two standard benchmarks show that, despite its simplicity, our method outperforms the state-of-the-art alternatives, often by a large margin. Code is available at https://github.com/zhiheLu/CWT-for-FSS.
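The idea of adapting the classifier weights to each query image can be illustrated with a small attention step. The sketch below is not the authors' implementation: it assumes a single attention head with identity projections (a real transformer would use learned query/key/value projections), and the names `adapt_classifier` and `classify_pixels` are hypothetical. The classifier weights act as attention queries over the query-image features, and the attended feature summary is added back to the weights as a residual.

```python
import math


def softmax(row):
    """Numerically stable softmax over a list of floats."""
    m = max(row)
    exps = [math.exp(x - m) for x in row]
    total = sum(exps)
    return [e / total for e in exps]


def adapt_classifier(weights, query_feats):
    """Adapt classifier weights to one query image.

    weights:     C x d list (e.g. C = 2 for background/foreground),
                 assumed to come from a classifier fitted on the support set.
    query_feats: N x d list of pixel features from the frozen encoder/decoder.
    Returns a new C x d list: w + softmax(w F^T / sqrt(d)) F.
    Assumption: identity projections instead of learned ones.
    """
    d = len(weights[0])
    scale = math.sqrt(d)
    adapted = []
    for w in weights:
        # Scaled dot-product score between this weight vector and every pixel feature.
        scores = [sum(wi * fi for wi, fi in zip(w, f)) / scale for f in query_feats]
        attn = softmax(scores)
        # Attention-weighted average of the query features.
        context = [sum(a * f[j] for a, f in zip(attn, query_feats)) for j in range(d)]
        # Residual update keeps the support-set classifier as the starting point.
        adapted.append([wi + ci for wi, ci in zip(w, context)])
    return adapted


def classify_pixels(weights, feats):
    """Assign each pixel feature to the class with the highest linear score."""
    labels = []
    for f in feats:
        logits = [sum(wi * fi for wi, fi in zip(w, f)) for w in weights]
        labels.append(max(range(len(logits)), key=logits.__getitem__))
    return labels


if __name__ == "__main__":
    support_weights = [[1.0, 0.0], [0.0, 1.0]]      # toy background/foreground weights
    query = [[2.0, 0.0], [0.0, 2.0], [1.0, 1.0]]    # toy query-image pixel features
    w_star = adapt_classifier(support_weights, query)
    print(classify_pixels(w_star, query))
```

Because the update depends only on the features of the single query image at hand, the adaptation stays inductive in the sense used in the abstract: no other query images are needed at test time.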

Results

The same ten results are listed under each of three tasks: Few-Shot Learning, Few-Shot Semantic Segmentation, and Meta-Learning.

| Dataset | Metric | Value | Model |
|---|---|---|---|
| COCO-20i (5-shot) | Mean IoU | 42 | CWT (ResNet-101) |
| COCO-20i (5-shot) | Mean IoU | 41.3 | CWT (ResNet-50) |
| COCO-20i -> Pascal VOC (1-shot) | Mean IoU | 59.5 | CWT (ResNet-50) |
| PASCAL-5i (1-shot) | Mean IoU | 58 | CWT (ResNet-101) |
| PASCAL-5i (1-shot) | Mean IoU | 56.4 | CWT (ResNet-50) |
| COCO-20i (1-shot) | Mean IoU | 32.9 | CWT (ResNet-50) |
| COCO-20i (1-shot) | Mean IoU | 32.4 | CWT (ResNet-101) |
| PASCAL-5i (5-shot) | Mean IoU | 64.7 | CWT (ResNet-101) |
| PASCAL-5i (5-shot) | Mean IoU | 63.7 | CWT (ResNet-50) |
| COCO-20i -> Pascal VOC (5-shot) | Mean IoU | 66.5 | CWT (ResNet-50) |
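
For readers unfamiliar with the Mean IoU metric reported above, here is a minimal sketch of one common way to compute it for few-shot segmentation: foreground intersection and union are accumulated per class over all test episodes, and the per-class IoUs are then averaged. Whether the values listed here were computed with exactly this protocol is an assumption; the function name `mean_iou` and the episode format are hypothetical.

```python
from collections import defaultdict


def mean_iou(episodes):
    """Mean foreground IoU over classes, as a percentage.

    episodes: iterable of (class_id, pred_mask, true_mask), where each mask
              is a flat list of 0/1 pixel labels of equal length.
    """
    inter = defaultdict(int)
    union = defaultdict(int)
    for cls, pred, true in episodes:
        for p, t in zip(pred, true):
            inter[cls] += int(p == 1 and t == 1)   # pixel is foreground in both masks
            union[cls] += int(p == 1 or t == 1)    # pixel is foreground in either mask
    # Skip classes that never show any foreground pixel in prediction or ground truth.
    ious = [inter[c] / union[c] for c in union if union[c] > 0]
    if not ious:
        raise ValueError("no class with a non-empty union")
    return 100.0 * sum(ious) / len(ious)


if __name__ == "__main__":
    toy_episodes = [
        ("cat", [1, 1, 0, 0], [1, 0, 0, 0]),   # IoU = 1/2
        ("dog", [0, 1, 1, 0], [0, 1, 1, 0]),   # IoU = 2/2
    ]
    print(mean_iou(toy_episodes))
```

Accumulating per class before averaging prevents classes with many episodes from dominating the score.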

Related Papers

SeC: Advancing Complex Video Object Segmentation via Progressive Concept Construction (2025-07-21)
DiffOSeg: Omni Medical Image Segmentation via Multi-Expert Collaboration Diffusion Model (2025-07-17)
SCORE: Scene Context Matters in Open-Vocabulary Remote Sensing Instance Segmentation (2025-07-17)
Unified Medical Image Segmentation with State Space Modeling Snake (2025-07-17)
A Privacy-Preserving Semantic-Segmentation Method Using Domain-Adaptation Technique (2025-07-17)
Are encoders able to learn landmarkers for warm-starting of Hyperparameter Optimization? (2025-07-16)
Imbalanced Regression Pipeline Recommendation (2025-07-16)
CLID-MU: Cross-Layer Information Divergence Based Meta Update Strategy for Learning with Noisy Labels (2025-07-16)