Papers With Code 2

A community resource for machine learning research: papers, code, benchmarks, and state-of-the-art results.


Data sourced from the PWC Archive (CC-BY-SA 4.0). Built by the community, for the community.

Few-Shot Segmentation via Cycle-Consistent Transformer

Gengwei Zhang, Guoliang Kang, Yi Yang, Yunchao Wei

Published: 2021-06-04 · NeurIPS 2021
Tasks: Segmentation · Few-Shot Semantic Segmentation · Semantic Segmentation
Links: Paper · PDF · Code (official) · Code

Abstract

Few-shot segmentation aims to train a segmentation model that can quickly adapt to novel classes with few exemplars. The conventional training paradigm is to learn to make predictions on query images conditioned on the features from support images. Previous methods only utilized the semantic-level prototypes of support images as conditional information. These methods cannot utilize all pixel-wise support information for the query predictions, which is, however, critical for the segmentation task. In this paper, we focus on utilizing pixel-wise relationships between support and query images to facilitate the few-shot segmentation task. We design a novel Cycle-Consistent TRansformer (CyCTR) module to aggregate pixel-wise support features into query ones. CyCTR performs cross-attention between features from different images, i.e., support and query images. We observe that there may exist unexpected irrelevant pixel-level support features. Directly performing cross-attention may aggregate these features from support to query and bias the query features. Thus, we propose a novel cycle-consistent attention mechanism to filter out possibly harmful support features and encourage query features to attend to the most informative pixels from support images. Experiments on all few-shot segmentation benchmarks demonstrate that our proposed CyCTR leads to remarkable improvements over previous state-of-the-art methods. Specifically, on the Pascal-$5^i$ and COCO-$20^i$ datasets, we achieve 67.5% and 45.6% mIoU for 5-shot segmentation, outperforming previous state-of-the-art methods by 5.6% and 7.1%, respectively.
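The cycle-consistency idea above can be sketched in a few lines. The following is a minimal NumPy illustration, not the paper's implementation: for each support pixel, find its most similar query pixel, then that query pixel's most similar support pixel; a support pixel is kept only if this round trip lands on a pixel with the same foreground/background label, and inconsistent pixels are masked out of the cross-attention. All names (`cycle_consistent_attention`, the single-head dot-product form) are illustrative simplifications.

```python
import numpy as np

def cycle_consistent_attention(q_feat, s_feat, s_mask, scale=None):
    """Illustrative single-head cycle-consistent cross-attention.

    q_feat: (Nq, d) query-image pixel features
    s_feat: (Ns, d) support-image pixel features
    s_mask: (Ns,)   binary fg/bg label for each support pixel
    Returns (Nq, d) support features aggregated onto the query pixels.
    """
    d = q_feat.shape[1]
    scale = scale or 1.0 / np.sqrt(d)
    affinity = (q_feat @ s_feat.T) * scale          # (Nq, Ns) pairwise similarity

    # Cycle: support pixel k -> most similar query i* -> that query's
    # most similar support pixel k'. Keep k only if its label survives the trip.
    i_star = affinity.argmax(axis=0)                # (Ns,) best query per support pixel
    k_prime = affinity.argmax(axis=1)[i_star]       # (Ns,) cycle endpoint in support
    consistent = s_mask[k_prime] == s_mask          # (Ns,) label-preserving cycles

    # Bias inconsistent support pixels to -inf before the softmax.
    affinity = np.where(consistent[None, :], affinity, -1e9)
    weights = np.exp(affinity - affinity.max(axis=1, keepdims=True))
    weights /= weights.sum(axis=1, keepdims=True)   # row-wise softmax over support pixels
    return weights @ s_feat
```

In the paper this filtering sits inside a transformer block with learned query/key/value projections and multiple heads; the sketch keeps only the consistency check that distinguishes CyCTR from plain cross-attention.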

Results

| Task | Dataset | Metric | Value | Model |
| --- | --- | --- | --- | --- |
| Few-Shot Learning | COCO-20i (5-shot) | Mean IoU | 45.6 | CyCTR (ResNet-50) |
| Few-Shot Learning | COCO-20i (5-shot) | Learnable parameters (million) | 15.4 | CyCTR (ResNet-50) |
| Few-Shot Learning | PASCAL-5i (1-shot) | Mean IoU | 64.3 | CyCTR (ResNet-101) |
| Few-Shot Learning | PASCAL-5i (1-shot) | Learnable parameters (million) | 15.4 | CyCTR (ResNet-101) |
| Few-Shot Learning | COCO-20i (1-shot) | Mean IoU | 40.3 | CyCTR (ResNet-50) |
| Few-Shot Learning | COCO-20i (1-shot) | Learnable parameters (million) | 15.4 | CyCTR (ResNet-50) |
| Few-Shot Learning | PASCAL-5i (5-shot) | Mean IoU | 66.6 | CyCTR (ResNet-101) |
| Few-Shot Learning | PASCAL-5i (5-shot) | Learnable parameters (million) | 15.4 | CyCTR (ResNet-101) |
| Few-Shot Semantic Segmentation | COCO-20i (5-shot) | Mean IoU | 45.6 | CyCTR (ResNet-50) |
| Few-Shot Semantic Segmentation | COCO-20i (5-shot) | Learnable parameters (million) | 15.4 | CyCTR (ResNet-50) |
| Few-Shot Semantic Segmentation | PASCAL-5i (1-shot) | Mean IoU | 64.3 | CyCTR (ResNet-101) |
| Few-Shot Semantic Segmentation | PASCAL-5i (1-shot) | Learnable parameters (million) | 15.4 | CyCTR (ResNet-101) |
| Few-Shot Semantic Segmentation | COCO-20i (1-shot) | Mean IoU | 40.3 | CyCTR (ResNet-50) |
| Few-Shot Semantic Segmentation | COCO-20i (1-shot) | Learnable parameters (million) | 15.4 | CyCTR (ResNet-50) |
| Few-Shot Semantic Segmentation | PASCAL-5i (5-shot) | Mean IoU | 66.6 | CyCTR (ResNet-101) |
| Few-Shot Semantic Segmentation | PASCAL-5i (5-shot) | Learnable parameters (million) | 15.4 | CyCTR (ResNet-101) |
| Meta-Learning | COCO-20i (5-shot) | Mean IoU | 45.6 | CyCTR (ResNet-50) |
| Meta-Learning | COCO-20i (5-shot) | Learnable parameters (million) | 15.4 | CyCTR (ResNet-50) |
| Meta-Learning | PASCAL-5i (1-shot) | Mean IoU | 64.3 | CyCTR (ResNet-101) |
| Meta-Learning | PASCAL-5i (1-shot) | Learnable parameters (million) | 15.4 | CyCTR (ResNet-101) |
| Meta-Learning | COCO-20i (1-shot) | Mean IoU | 40.3 | CyCTR (ResNet-50) |
| Meta-Learning | COCO-20i (1-shot) | Learnable parameters (million) | 15.4 | CyCTR (ResNet-50) |
| Meta-Learning | PASCAL-5i (5-shot) | Mean IoU | 66.6 | CyCTR (ResNet-101) |
| Meta-Learning | PASCAL-5i (5-shot) | Learnable parameters (million) | 15.4 | CyCTR (ResNet-101) |

Related Papers

SeC: Advancing Complex Video Object Segmentation via Progressive Concept Construction (2025-07-21)
Deep Learning-Based Fetal Lung Segmentation from Diffusion-weighted MRI Images and Lung Maturity Evaluation for Fetal Growth Restriction (2025-07-17)
DiffOSeg: Omni Medical Image Segmentation via Multi-Expert Collaboration Diffusion Model (2025-07-17)
From Variability To Accuracy: Conditional Bernoulli Diffusion Models with Consensus-Driven Correction for Thin Structure Segmentation (2025-07-17)
Unleashing Vision Foundation Models for Coronary Artery Segmentation: Parallel ViT-CNN Encoding and Variational Fusion (2025-07-17)
SCORE: Scene Context Matters in Open-Vocabulary Remote Sensing Instance Segmentation (2025-07-17)
Unified Medical Image Segmentation with State Space Modeling Snake (2025-07-17)
A Privacy-Preserving Semantic-Segmentation Method Using Domain-Adaptation Technique (2025-07-17)