Masked Cross-image Encoding for Few-shot Segmentation

Wenbo Xu, Huaxi Huang, Ming Cheng, Litao Yu, Qiang Wu, Jian Zhang

2023-08-22 · Few-Shot Semantic Segmentation

Abstract

Few-shot segmentation (FSS) is a dense prediction task that aims to infer the pixel-wise labels of unseen classes using only a limited number of annotated images. The key challenge in FSS is to classify the labels of query pixels using class prototypes learned from the few labeled support exemplars. Prior approaches to FSS have typically focused on learning class-wise descriptors independently from support images, thereby ignoring the rich contextual information and mutual dependencies among support-query features. To address this limitation, we propose a joint learning method termed Masked Cross-Image Encoding (MCE), which is designed to capture common visual properties that describe object details and to learn bidirectional inter-image dependencies that enhance feature interaction. MCE is more than a visual representation enrichment module; it also considers cross-image mutual dependencies and implicit guidance. Experiments on FSS benchmarks PASCAL-$5^i$ and COCO-$20^i$ demonstrate the advanced meta-learning ability of the proposed method.
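
For orientation, the prototype-based matching that the abstract describes as the prior approach can be sketched as follows. This is not MCE itself: it is a minimal illustration of learning a class descriptor from the support image alone (masked average pooling) and classifying query pixels by cosine similarity. The function names, the toy 2x2 feature grids, and the 0.5 threshold are all assumptions made for the example.

```python
import math

def masked_average_pooling(features, mask):
    """Average the feature vectors at positions where mask == 1.

    features: H x W grid of C-dimensional vectors (nested lists).
    mask:     H x W grid of 0/1 support labels.
    """
    channels = len(features[0][0])
    total = [0.0] * channels
    count = 0
    for feat_row, mask_row in zip(features, mask):
        for vec, m in zip(feat_row, mask_row):
            if m:
                count += 1
                for c in range(channels):
                    total[c] += vec[c]
    if count == 0:
        raise ValueError("support mask has no foreground pixels")
    return [t / count for t in total]

def cosine(a, b):
    """Cosine similarity between two vectors; 0.0 if either has zero norm."""
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    return dot / (norm_a * norm_b) if norm_a > 0 and norm_b > 0 else 0.0

def segment_query(query_features, prototype, threshold=0.5):
    """Label a query pixel as foreground when it is close to the prototype."""
    return [[1 if cosine(vec, prototype) > threshold else 0 for vec in row]
            for row in query_features]

# Toy 2x2 feature maps with 2 channels (hypothetical values).
support_features = [[[1.0, 0.0], [0.9, 0.1]],
                    [[0.0, 1.0], [0.1, 0.9]]]
support_mask = [[1, 1],
                [0, 0]]
query_features = [[[0.0, 1.0], [1.0, 0.1]],
                  [[0.8, 0.0], [0.2, 1.0]]]

prototype = masked_average_pooling(support_features, support_mask)
prediction = segment_query(query_features, prototype)
print(prediction)
```

Note that the prototype here depends only on the support image, so the query features never influence it. That one-directional flow is the limitation the abstract says MCE targets with bidirectional inter-image dependencies.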

Results

Tasks: Few-Shot Learning, Few-Shot Semantic Segmentation, Meta-Learning (the same results are listed under each task)

Dataset	Metric	Value (%)	Model
PASCAL-5i (1-shot)	Mean IoU	65.93	MCE (ResNet-50)
PASCAL-5i (1-shot)	FB-IoU	78.1	MCE (ResNet-50)
PASCAL-5i (1-shot)	Mean IoU	62.87	MCE (VGG-16)
PASCAL-5i (1-shot)	FB-IoU	74.51	MCE (VGG-16)
PASCAL-5i (5-shot)	Mean IoU	70.03	MCE (ResNet-50)
PASCAL-5i (5-shot)	FB-IoU	81.33	MCE (ResNet-50)
PASCAL-5i (5-shot)	Mean IoU	68.21	MCE (VGG-16)
PASCAL-5i (5-shot)	FB-IoU	78.2	MCE (VGG-16)
COCO-20i (1-shot)	Mean IoU	44.22	MCE (ResNet-50)
COCO-20i (5-shot)	Mean IoU	51.04	MCE (ResNet-50)
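
The two metrics in the table differ in how they aggregate. The sketch below follows one common convention in few-shot segmentation evaluation: Mean IoU averages the foreground IoU per class, while FB-IoU ignores class identity and averages the foreground IoU and the background IoU. It is an illustration under that assumption, not the evaluation code behind the reported numbers; the `evaluate` and `area_counts` helpers and the toy episodes are hypothetical.

```python
from collections import defaultdict

def area_counts(pred, target, label):
    """Return (intersection, union) pixel counts for one label in one mask pair."""
    inter = union = 0
    for p_row, t_row in zip(pred, target):
        for p, t in zip(p_row, t_row):
            p_hit = p == label
            t_hit = t == label
            inter += int(p_hit and t_hit)
            union += int(p_hit or t_hit)
    return inter, union

def evaluate(episodes):
    """episodes: iterable of (class_name, pred_mask, target_mask) with 0/1 masks.

    Returns (mean_iou, fb_iou) as percentages.
    """
    per_class = defaultdict(lambda: [0, 0])  # class -> [intersection, union]
    fg = [0, 0]  # foreground counts pooled over all episodes
    bg = [0, 0]  # background counts pooled over all episodes
    for cls, pred, target in episodes:
        i_fg, u_fg = area_counts(pred, target, 1)
        i_bg, u_bg = area_counts(pred, target, 0)
        per_class[cls][0] += i_fg
        per_class[cls][1] += u_fg
        fg[0] += i_fg
        fg[1] += u_fg
        bg[0] += i_bg
        bg[1] += u_bg
    class_ious = [i / u for i, u in per_class.values() if u > 0]
    mean_iou = 100.0 * sum(class_ious) / len(class_ious)
    fg_iou = fg[0] / fg[1] if fg[1] else 0.0
    bg_iou = bg[0] / bg[1] if bg[1] else 0.0
    fb_iou = 100.0 * (fg_iou + bg_iou) / 2
    return mean_iou, fb_iou

# Two toy episodes with 2x2 masks (hypothetical data).
episodes = [
    ("cat", [[1, 1], [0, 0]], [[1, 0], [0, 0]]),
    ("dog", [[1, 1], [1, 0]], [[1, 1], [1, 1]]),
]
mean_iou, fb_iou = evaluate(episodes)
print(f"Mean IoU: {mean_iou:.2f}  FB-IoU: {fb_iou:.2f}")
```

Because FB-IoU also credits correctly predicted background pixels, it usually comes out higher than Mean IoU on real images where background dominates, which matches the pattern in the table.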
