Railroad is not a Train: Saliency as Pseudo-pixel Supervision for Weakly Supervised Semantic Segmentation

Seungho Lee, Minhyun Lee, Jongwuk Lee, Hyunjung Shim

2021-05-19, CVPR 2021
Tasks: Weakly-Supervised Semantic Segmentation, Semantic Segmentation, Saliency Detection


Abstract

Existing studies in weakly-supervised semantic segmentation (WSSS) using image-level weak supervision have several limitations: sparse object coverage, inaccurate object boundaries, and co-occurring pixels from non-target objects. To overcome these challenges, we propose a novel framework, namely Explicit Pseudo-pixel Supervision (EPS), which learns from pixel-level feedback by combining two forms of weak supervision: the image-level label provides the object identity via the localization map, and the saliency map from an off-the-shelf saliency detection model offers rich boundaries. We devise a joint training strategy to fully utilize the complementary relationship between the two sources of information. Our method can obtain accurate object boundaries and discard co-occurring pixels, thereby significantly improving the quality of pseudo-masks. Experimental results show that the proposed method remarkably outperforms existing methods by resolving key challenges of WSSS and achieves new state-of-the-art performance on both the PASCAL VOC 2012 and MS COCO 2014 datasets.
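To make the two signals in the abstract concrete, the sketch below builds a pseudo-mask from per-class localization maps and a saliency map using a simple post-hoc rule: salient pixels take the highest-scoring class among those in the image-level label, and everything else becomes background. This is only a toy illustration of how the two cues complement each other, not the EPS joint training strategy, which the abstract does not specify in detail. The function name `toy_pseudo_mask`, its parameters, and the 0.5 threshold are all assumptions made for this example.

```python
import numpy as np

def toy_pseudo_mask(cams, image_labels, saliency, sal_threshold=0.5):
    """Toy pseudo-mask from localization maps and a saliency map.

    cams:         (C, H, W) localization scores for C foreground classes
    image_labels: (C,) binary image-level labels
    saliency:     (H, W) saliency values in [0, 1]
    Returns an (H, W) integer mask: 0 = background, c + 1 = foreground class c.
    NOTE: hypothetical heuristic for illustration, not the EPS method.
    """
    cams = np.asarray(cams, dtype=np.float64)
    labels = np.asarray(image_labels, dtype=bool)
    saliency = np.asarray(saliency, dtype=np.float64)

    # With no foreground label, every pixel is background.
    if not labels.any():
        return np.zeros(saliency.shape, dtype=np.int64)

    # Image-level label gives identity: ignore classes absent from the image.
    masked = np.where(labels[:, None, None], cams, -np.inf)
    fg_class = masked.argmax(axis=0) + 1

    # Saliency gives extent: non-salient pixels fall back to background.
    return np.where(saliency > sal_threshold, fg_class, 0)
```

In this toy rule, a pixel with a strong localization response but low saliency is labeled background, which loosely mirrors the abstract's point about discarding co-occurring pixels.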

Results

Task	Dataset	Metric	Value	Model
Semantic Segmentation	COCO 2014 val	Mean IoU	35.7	EPS
Semantic Segmentation	PASCAL VOC 2012 val	Mean IoU	71.0	EPS(DeepLabV1-ResNet101)
Semantic Segmentation	PASCAL VOC 2012 val	Mean IoU	70.9	EPS(DeepLabV2-ResNet101)
Semantic Segmentation	PASCAL VOC 2012 test	Mean IoU	71.8	EPS(DeepLabV1-ResNet101)
Semantic Segmentation	PASCAL VOC 2012 test	Mean IoU	70.8	EPS(DeepLabV2-ResNet101)
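The metric reported above, mean intersection-over-union (mIoU), can be sketched as follows. This is a minimal NumPy version, assuming integer label maps and an ignore value of 255 for unlabeled pixels, as is common for PASCAL VOC annotations. The function name `mean_iou` and its parameters are chosen for this example. Benchmark scores are normally computed from a confusion matrix accumulated over the whole dataset, whereas this sketch handles a single prediction/target pair.

```python
import numpy as np

def mean_iou(pred, target, num_classes, ignore_index=255):
    """Mean IoU over classes that occur in the prediction or the target."""
    pred = np.asarray(pred).astype(np.int64).ravel()
    target = np.asarray(target).astype(np.int64).ravel()

    # Drop pixels marked as "ignore" in the ground truth.
    valid = target != ignore_index
    pred, target = pred[valid], target[valid]

    # Confusion matrix: rows are ground-truth classes, columns are predictions.
    conf = np.bincount(
        target * num_classes + pred, minlength=num_classes ** 2
    ).reshape(num_classes, num_classes)

    intersection = np.diag(conf)
    union = conf.sum(axis=0) + conf.sum(axis=1) - intersection

    # Average only over classes that actually appear.
    present = union > 0
    return float((intersection[present] / union[present]).mean())
```

For example, with `pred = [[0, 1], [1, 1]]` and `target = [[0, 1], [0, 1]]`, class 0 has IoU 1/2 and class 1 has IoU 2/3, so the mean is 7/12.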
