MSDNet: Multi-Scale Decoder for Few-Shot Semantic Segmentation via Transformer-Guided Prototyping

Amirreza Fateh, Mohammad Reza Mohammadi, Mohammad Reza Jahed Motlagh

2024-09-17 | Segmentation, Few-Shot Semantic Segmentation, Semantic Segmentation

Abstract

Few-shot Semantic Segmentation addresses the challenge of segmenting objects in query images with only a handful of annotated examples. However, many previous state-of-the-art methods either have to discard intricate local semantic features or suffer from high computational complexity. To address these challenges, we propose a new Few-shot Semantic Segmentation framework based on the transformer architecture. Our approach introduces the spatial transformer decoder and the contextual mask generation module to improve the relational understanding between support and query images. Moreover, we introduce a multi-scale decoder to refine the segmentation mask by incorporating features from different resolutions in a hierarchical manner. Additionally, our approach integrates global features from intermediate encoder stages to improve contextual understanding, while maintaining a lightweight structure to reduce complexity. This balance between performance and efficiency enables our method to achieve state-of-the-art results on benchmark datasets such as PASCAL-$5^i$ and COCO-$20^i$ in both 1-shot and 5-shot settings. Notably, our model with only 1.5 million parameters demonstrates competitive performance while overcoming limitations of existing methodologies.

Code: https://github.com/amirrezafateh/MSDNet

Results

The same values are listed under three task labels: Few-Shot Learning, Few-Shot Semantic Segmentation, and Meta-Learning. A dash marks a metric that is not listed for that dataset.

| Dataset | Model | Mean IoU | FB-IoU | Learnable parameters (million) |
|---|---|---|---|---|
| COCO-20i (1-shot) | MSDNet (ResNet-50) | 46.5 | 70.4 | 1.5 |
| COCO-20i (1-shot) | MSDNet (ResNet-101) | 48.5 | 71.3 | 1.5 |
| COCO-20i (5-shot) | MSDNet (ResNet-50) | 54.5 | 74.5 | 1.5 |
| COCO-20i (5-shot) | MSDNet (ResNet-101) | 55.3 | 75.1 | 1.5 |
| PASCAL-5i (1-shot) | MSDNet (ResNet-50) | 64.3 | 77.1 | 1.5 |
| PASCAL-5i (1-shot) | MSDNet (ResNet-101) | 64.7 | 77.3 | 1.5 |
| PASCAL-5i (5-shot) | MSDNet (ResNet-50) | 68.7 | 82.1 | 1.5 |
| PASCAL-5i (5-shot) | MSDNet (ResNet-101) | 70.8 | 85 | 1.5 |
| COCO-20i -> Pascal VOC (1-shot) | MSDNet (ResNet-50) | 72.1 | - | - |
| COCO-20i -> Pascal VOC (1-shot) | MSDNet (ResNet-101) | 73.9 | - | - |
| COCO-20i -> Pascal VOC (5-shot) | MSDNet (ResNet-50) | 74.2 | - | - |
| COCO-20i -> Pascal VOC (5-shot) | MSDNet (ResNet-101) | 76.4 | - | - |
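
The Mean IoU and FB-IoU values reported above follow metric names that are common in few-shot segmentation. This page does not state the exact evaluation protocol, so the sketch below is only an illustration under a widely used convention, which is an assumption here: Mean IoU accumulates foreground intersection and union per class over all test episodes and averages the per-class ratios, while FB-IoU ignores class identity and averages the foreground IoU and the background IoU. The function names `area_counts` and `evaluate` and the toy masks are hypothetical and are not taken from the MSDNet repository.

```python
from collections import defaultdict


def area_counts(pred, target, label):
    """Return (intersection, union) pixel counts for `label` in two 2D masks."""
    inter = union = 0
    for pred_row, target_row in zip(pred, target):
        for p, t in zip(pred_row, target_row):
            p_hit, t_hit = p == label, t == label
            inter += int(p_hit and t_hit)
            union += int(p_hit or t_hit)
    return inter, union


def evaluate(episodes):
    """Compute (mean_iou, fb_iou) in percent.

    `episodes` is an iterable of (class_id, pred_mask, target_mask), where the
    masks are equally sized 2D lists with 1 = foreground and 0 = background.
    The aggregation scheme is an assumed convention, not the paper's code.
    """
    per_class = defaultdict(lambda: [0, 0])  # class_id -> [intersection, union]
    fg = [0, 0]  # class-agnostic foreground totals
    bg = [0, 0]  # class-agnostic background totals
    for class_id, pred, target in episodes:
        fg_inter, fg_union = area_counts(pred, target, 1)
        bg_inter, bg_union = area_counts(pred, target, 0)
        per_class[class_id][0] += fg_inter
        per_class[class_id][1] += fg_union
        fg[0] += fg_inter
        fg[1] += fg_union
        bg[0] += bg_inter
        bg[1] += bg_union

    # Skip classes whose union is empty to avoid dividing by zero.
    class_ious = [i / u for i, u in per_class.values() if u > 0]
    mean_iou = 100.0 * sum(class_ious) / len(class_ious) if class_ious else 0.0

    fg_iou = fg[0] / fg[1] if fg[1] else 0.0
    bg_iou = bg[0] / bg[1] if bg[1] else 0.0
    fb_iou = 100.0 * (fg_iou + bg_iou) / 2
    return mean_iou, fb_iou


# Two toy episodes from two different classes.
toy_episodes = [
    (1, [[1, 1], [0, 0]], [[1, 0], [0, 0]]),  # one false-positive pixel
    (2, [[1, 1], [0, 0]], [[1, 1], [0, 0]]),  # perfect prediction
]
mean_iou, fb_iou = evaluate(toy_episodes)
print(f"Mean IoU: {mean_iou:.1f}, FB-IoU: {fb_iou:.1f}")
```

On the toy data the two metrics differ because Mean IoU weights each class equally, whereas FB-IoU pools all pixels and also rewards correctly labelled background. This is one reason FB-IoU values are typically higher than Mean IoU values for the same model.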

Related Papers

SeC: Advancing Complex Video Object Segmentation via Progressive Concept Construction (2025-07-21)
Deep Learning-Based Fetal Lung Segmentation from Diffusion-weighted MRI Images and Lung Maturity Evaluation for Fetal Growth Restriction (2025-07-17)
DiffOSeg: Omni Medical Image Segmentation via Multi-Expert Collaboration Diffusion Model (2025-07-17)
From Variability To Accuracy: Conditional Bernoulli Diffusion Models with Consensus-Driven Correction for Thin Structure Segmentation (2025-07-17)
Unleashing Vision Foundation Models for Coronary Artery Segmentation: Parallel ViT-CNN Encoding and Variational Fusion (2025-07-17)
SCORE: Scene Context Matters in Open-Vocabulary Remote Sensing Instance Segmentation (2025-07-17)
Unified Medical Image Segmentation with State Space Modeling Snake (2025-07-17)
A Privacy-Preserving Semantic-Segmentation Method Using Domain-Adaptation Technique (2025-07-17)