Boyuan Sun, YuQi Yang, Le Zhang, Ming-Ming Cheng, Qibin Hou
This paper presents a simple but performant semi-supervised semantic segmentation approach, called CorrMatch. Previous approaches mostly employ complicated training strategies to leverage unlabeled data but overlook the role of correlation maps in modeling the relationships between pairs of locations. We observe that correlation maps not only make it easy to cluster pixels of the same category but also contain good shape information, which previous works have omitted. Motivated by these observations, we aim to use unlabeled data more efficiently by designing two novel label propagation strategies. First, we propose pixel propagation, which models the pairwise similarities of pixels to spread high-confidence pixels and uncover more of them. Then, we perform region propagation to enhance the pseudo labels with accurate class-agnostic masks extracted from the correlation maps. CorrMatch achieves strong performance on popular segmentation benchmarks. Taking DeepLabV3+ with a ResNet-101 backbone as our segmentation model, we achieve an mIoU above 76% on the Pascal VOC 2012 dataset with only 92 annotated images. Code is available at https://github.com/BBBBchan/CorrMatch.
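
The pixel propagation idea described in the abstract can be illustrated with a small NumPy sketch. This is not the authors' implementation: the scaled dot-product similarity, the softmax normalization, the function names (`correlation_map`, `propagate_pixels`, `pseudo_labels`), the threshold of 0.75, and the toy data are all assumptions made for illustration only.

```python
import numpy as np

def softmax(x, axis=-1):
    # Numerically stable softmax along the given axis.
    x = x - x.max(axis=axis, keepdims=True)
    e = np.exp(x)
    return e / e.sum(axis=axis, keepdims=True)

def correlation_map(features):
    # features: (N, D), one D-dimensional feature per pixel location.
    # Returns an (N, N) matrix whose row i holds the normalized similarity
    # of pixel i to every pixel (assumed form: scaled dot product + softmax).
    n, d = features.shape
    sim = features @ features.T / np.sqrt(d)
    return softmax(sim, axis=1)

def propagate_pixels(probs, corr):
    # probs: (N, C) class probabilities. Each pixel's new prediction is a
    # similarity-weighted average of the predictions of all pixels.
    return corr @ probs

def pseudo_labels(probs, threshold=0.75, ignore_index=255):
    # Keep the argmax label only where the confidence reaches the threshold.
    conf = probs.max(axis=1)
    labels = probs.argmax(axis=1)
    return np.where(conf >= threshold, labels, ignore_index)

# Toy example: 4 pixels forming two feature clusters, 2 classes.
features = np.array([[5.0, 0.0], [5.0, 0.0], [0.0, 5.0], [0.0, 5.0]])
probs = np.array([[0.99, 0.01],   # confident, class 0
                  [0.60, 0.40],   # uncertain
                  [0.01, 0.99],   # confident, class 1
                  [0.40, 0.60]])  # uncertain

corr = correlation_map(features)
before = pseudo_labels(probs)
after = pseudo_labels(propagate_pixels(probs, corr))
print(before)
print(after)
```

In this toy setting the two uncertain pixels are ignored before propagation and receive the label of their confident, feature-similar neighbor afterwards. Region propagation with class-agnostic masks is not covered by this sketch.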
| Task | Dataset | Metric | Value | Model |
|---|---|---|---|---|
| Semantic Segmentation | PASCAL VOC 2012 6.25% labeled | Validation mIoU | 81.3 | CorrMatch (DeepLabV3+ with ResNet-101) |
| Semantic Segmentation | PASCAL VOC 2012 92 labeled | Validation mIoU | 76.4 | CorrMatch (DeepLabV3+ with ResNet-101) |
| Semantic Segmentation | PASCAL VOC 2012 732 labeled | Validation mIoU | 80.6 | CorrMatch (DeepLabV3+ with ResNet-101) |
| Semantic Segmentation | PASCAL VOC 2012 1464 labeled | Validation mIoU | 81.8 | CorrMatch (DeepLabV3+ with ResNet-101) |
| Semantic Segmentation | PASCAL VOC 2012 25% labeled | Validation mIoU | 80.9 | CorrMatch (DeepLabV3+ with ResNet-101) |
| Semantic Segmentation | PASCAL VOC 2012 366 labeled | Validation mIoU | 79.4 | CorrMatch (DeepLabV3+ with ResNet-101) |
| Semantic Segmentation | Cityscapes 6.25% labeled | Validation mIoU | 77.3 | CorrMatch (DeepLabV3+ with ResNet-101) |
| Semantic Segmentation | PASCAL VOC 2012 183 labeled | Validation mIoU | 78.5 | CorrMatch (DeepLabV3+ with ResNet-101) |
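
For reference, the mIoU metric reported in the table can be computed from a confusion matrix as sketched below. This is a generic illustration, not the evaluation code of the CorrMatch repository; the function name `mean_iou`, the `ignore_index` value of 255, and the toy arrays are assumptions. Benchmark scores are typically accumulated over the whole validation set, whereas this function scores only the arrays it is given, and it assumes every prediction lies in `[0, num_classes)`.

```python
import numpy as np

def mean_iou(pred, target, num_classes, ignore_index=255):
    # Flatten and drop locations whose ground truth is marked as ignored.
    pred = np.asarray(pred).ravel()
    target = np.asarray(target).ravel()
    valid = target != ignore_index
    pred, target = pred[valid], target[valid]

    # Confusion matrix: rows are ground-truth classes, columns are predictions.
    cm = np.bincount(target * num_classes + pred,
                     minlength=num_classes ** 2).reshape(num_classes, num_classes)

    inter = np.diag(cm)                        # correctly labeled pixels per class
    union = cm.sum(axis=0) + cm.sum(axis=1) - inter
    present = union > 0                        # skip classes absent from both arrays
    return float((inter[present] / union[present]).mean())

# Toy example with 2 classes; the last pixel is ignored.
target = [0, 0, 1, 1, 255]
pred = [0, 1, 1, 1, 0]
score = mean_iou(pred, target, num_classes=2)
print(round(100 * score, 1))
```

Here class 0 has IoU 1/2 and class 1 has IoU 2/3, so the mean is 7/12.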