Yongfei Liu, Xiangyi Zhang, Songyang Zhang, Xuming He
Few-shot semantic segmentation aims to learn to segment new object classes from only a few annotated examples, a task with a wide range of real-world applications. Most existing methods either focus on the restrictive setting of one-way few-shot segmentation or suffer from incomplete coverage of object regions. In this paper, we propose a novel few-shot semantic segmentation framework based on the prototype representation. Our key idea is to decompose the holistic class representation into a set of part-aware prototypes, capable of capturing diverse and fine-grained object features. In addition, we propose to leverage unlabeled data to enrich our part-aware prototypes, resulting in better modeling of intra-class variations of semantic objects. We develop a novel graph neural network model to generate and enhance the proposed part-aware prototypes based on labeled and unlabeled images. Extensive experimental evaluations on two benchmarks show that our method outperforms the prior art by a sizable margin.
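The contrast between a holistic class prototype and a set of part-aware prototypes can be illustrated with a small sketch. It is not the method described above: the abstract does not say how the parts are obtained, so this example assumes plain k-means clustering over foreground feature vectors, and it omits the graph neural network and the unlabeled data entirely. The function names `holistic_prototype` and `part_prototypes` and the toy 2-D features are hypothetical.

```python
from math import dist

def holistic_prototype(features, mask):
    """Single class prototype: the mean of all foreground feature vectors."""
    fg = [f for f, m in zip(features, mask) if m]
    return [sum(col) / len(fg) for col in zip(*fg)]

def part_prototypes(features, mask, num_parts, iters=10):
    """Several prototypes per class, obtained here by k-means (an assumption)."""
    fg = [f for f, m in zip(features, mask) if m]
    # Deterministic farthest-point initialisation of the cluster centres.
    centers = [fg[0]]
    while len(centers) < num_parts:
        centers.append(max(fg, key=lambda f: min(dist(f, c) for c in centers)))
    for _ in range(iters):
        # Assign every foreground feature to its nearest centre.
        groups = [[] for _ in centers]
        for f in fg:
            nearest = min(range(len(centers)), key=lambda i: dist(f, centers[i]))
            groups[nearest].append(f)
        # Move each centre to the mean of its group; keep it if the group is empty.
        centers = [
            [sum(col) / len(g) for col in zip(*g)] if g else c
            for g, c in zip(groups, centers)
        ]
    return centers

# Toy support "image": five pixels with 2-D features; the last one is background.
features = [[0, 0], [0, 1], [10, 10], [10, 11], [5, 5]]
mask = [1, 1, 1, 1, 0]

whole = holistic_prototype(features, mask)
parts = part_prototypes(features, mask, num_parts=2)
print(whole)
print(parts)
```

In this toy case the foreground features form two well-separated groups. The single averaged prototype falls between them, far from every foreground pixel, while the two part prototypes each sit inside one group, which is the intuition behind representing a class by several parts rather than one mean.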
| Task | Dataset | Metric | Value | Model |
|---|---|---|---|---|
| Few-Shot Learning | COCO-20i (5-shot) | Mean IoU | 38.5 | PPNet (ResNet-50) |
| Few-Shot Learning | COCO-20i (5-shot) | learnable parameters (million) | 31.5 | PPNet (ResNet-50) |
| Few-Shot Learning | COCO-20i (2-way 1-shot) | mIoU | 20.4 | PPNet (ResNet-50) |
| Few-Shot Learning | PASCAL-5i (1-Shot) | Mean IoU | 51.5 | PPNet (ResNet-50) |
| Few-Shot Learning | PASCAL-5i (1-Shot) | learnable parameters (million) | 31.5 | PPNet (ResNet-50) |
| Few-Shot Learning | COCO-20i (1-shot) | Mean IoU | 29 | PPNet (ResNet-50) |
| Few-Shot Learning | COCO-20i (1-shot) | learnable parameters (million) | 31.5 | PPNet (ResNet-50) |
| Few-Shot Learning | Pascal5i | meanIOU | 55.16 | PPNet |
| Few-Shot Learning | PASCAL-5i (5-Shot) | Mean IoU | 62 | PPNet (ResNet-50) |
| Few-Shot Learning | PASCAL-5i (5-Shot) | learnable parameters (million) | 31.5 | PPNet (ResNet-50) |
| Few-Shot Semantic Segmentation | COCO-20i (5-shot) | Mean IoU | 38.5 | PPNet (ResNet-50) |
| Few-Shot Semantic Segmentation | COCO-20i (5-shot) | learnable parameters (million) | 31.5 | PPNet (ResNet-50) |
| Few-Shot Semantic Segmentation | COCO-20i (2-way 1-shot) | mIoU | 20.4 | PPNet (ResNet-50) |
| Few-Shot Semantic Segmentation | PASCAL-5i (1-Shot) | Mean IoU | 51.5 | PPNet (ResNet-50) |
| Few-Shot Semantic Segmentation | PASCAL-5i (1-Shot) | learnable parameters (million) | 31.5 | PPNet (ResNet-50) |
| Few-Shot Semantic Segmentation | COCO-20i (1-shot) | Mean IoU | 29 | PPNet (ResNet-50) |
| Few-Shot Semantic Segmentation | COCO-20i (1-shot) | learnable parameters (million) | 31.5 | PPNet (ResNet-50) |
| Few-Shot Semantic Segmentation | Pascal5i | meanIOU | 55.16 | PPNet |
| Few-Shot Semantic Segmentation | PASCAL-5i (5-Shot) | Mean IoU | 62 | PPNet (ResNet-50) |
| Few-Shot Semantic Segmentation | PASCAL-5i (5-Shot) | learnable parameters (million) | 31.5 | PPNet (ResNet-50) |
| Meta-Learning | COCO-20i (5-shot) | Mean IoU | 38.5 | PPNet (ResNet-50) |
| Meta-Learning | COCO-20i (5-shot) | learnable parameters (million) | 31.5 | PPNet (ResNet-50) |
| Meta-Learning | COCO-20i (2-way 1-shot) | mIoU | 20.4 | PPNet (ResNet-50) |
| Meta-Learning | PASCAL-5i (1-Shot) | Mean IoU | 51.5 | PPNet (ResNet-50) |
| Meta-Learning | PASCAL-5i (1-Shot) | learnable parameters (million) | 31.5 | PPNet (ResNet-50) |
| Meta-Learning | COCO-20i (1-shot) | Mean IoU | 29 | PPNet (ResNet-50) |
| Meta-Learning | COCO-20i (1-shot) | learnable parameters (million) | 31.5 | PPNet (ResNet-50) |
| Meta-Learning | Pascal5i | meanIOU | 55.16 | PPNet |
| Meta-Learning | PASCAL-5i (5-Shot) | Mean IoU | 62 | PPNet (ResNet-50) |
| Meta-Learning | PASCAL-5i (5-Shot) | learnable parameters (million) | 31.5 | PPNet (ResNet-50) |