Self-supervised Equivariant Attention Mechanism for Weakly Supervised Semantic Segmentation

Yude Wang, Jie Zhang, Meina Kan, Shiguang Shan, Xilin Chen

2020-04-09 | CVPR 2020 | Weakly-Supervised Semantic Segmentation, Data Augmentation, Semantic Segmentation


Abstract

Image-level weakly supervised semantic segmentation is a challenging problem that has been studied extensively in recent years. Most advanced solutions exploit class activation maps (CAMs). However, CAMs can hardly serve as object masks because of the gap between full and weak supervision. In this paper, we propose a self-supervised equivariant attention mechanism (SEAM) to discover additional supervision and narrow that gap. Our method is based on the observation that equivariance is an implicit constraint in fully supervised semantic segmentation, where the pixel-level labels undergo the same spatial transformation as the input images during data augmentation. This constraint is lost, however, when CAMs are trained with image-level supervision. We therefore propose consistency regularization on the CAMs predicted from variously transformed images to provide self-supervision for network learning. Moreover, we propose a pixel correlation module (PCM), which exploits contextual appearance information and refines the prediction for the current pixel using its similar neighbors, further improving CAM consistency. Extensive experiments on the PASCAL VOC 2012 dataset demonstrate that our method outperforms state-of-the-art methods using the same level of supervision. The code is released online.
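The two ideas in the abstract can be illustrated with a minimal NumPy sketch. This is not the authors' released code: the function names, the use of a horizontal flip as the spatial transformation, the mean L1 distance as the consistency measure, and the ReLU-ed cosine similarity used for pixel affinity are all assumptions chosen for illustration. `cam_fn` is a hypothetical stand-in for a network that maps an image of shape (channels, H, W) to a CAM of shape (classes, H, W).

```python
import numpy as np

def hflip(x):
    """Horizontal flip along the last (width) axis."""
    return x[..., ::-1]

def equivariance_loss(cam_fn, image, transform):
    """Mean L1 gap between CAM(transform(image)) and transform(CAM(image)).

    The gap is zero when cam_fn is equivariant to `transform`.
    """
    cam_of_transformed = cam_fn(transform(image))
    transformed_cam = transform(cam_fn(image))
    return float(np.mean(np.abs(cam_of_transformed - transformed_cam)))

def pixel_correlation_refine(cam, features, eps=1e-8):
    """Refine each pixel's CAM value with a similarity-weighted average.

    cam:      (C, H, W) class activation maps
    features: (D, H, W) per-pixel appearance features (assumed given)
    """
    c, h, w = cam.shape
    f = features.reshape(features.shape[0], -1)                 # (D, N)
    f = f / (np.linalg.norm(f, axis=0, keepdims=True) + eps)    # unit-norm columns
    affinity = np.maximum(f.T @ f, 0.0)                         # (N, N), negatives clipped
    affinity = affinity / (affinity.sum(axis=1, keepdims=True) + eps)
    refined = cam.reshape(c, -1) @ affinity.T                   # (C, N)
    return refined.reshape(c, h, w)

# Toy stand-ins for a CAM network (hypothetical, for illustration only).
def equivariant_cam(img):
    # Purely per-pixel operation, so it commutes with flipping.
    return img.mean(axis=0, keepdims=True)

def position_biased_cam(img):
    # Multiplies by a left-to-right ramp, which breaks flip equivariance.
    ramp = np.linspace(0.0, 1.0, img.shape[-1])
    return img.mean(axis=0, keepdims=True) * ramp

rng = np.random.default_rng(0)
image = rng.random((3, 8, 8)) + 0.1   # strictly positive toy image

loss_equivariant = equivariance_loss(equivariant_cam, image, hflip)
loss_biased = equivariance_loss(position_biased_cam, image, hflip)

cam = position_biased_cam(image)
refined = pixel_correlation_refine(cam, features=image)
```

In this sketch, the loss vanishes for the per-pixel toy function and is positive for the position-biased one; in training, such a term would be added to the classification loss so that the network is penalized when its CAMs do not follow the input transformation.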

Results

Task	Dataset	Metric	Value	Model
Semantic Segmentation	PASCAL VOC 2012 val	Mean IoU	64.5	SEAM-ResNet-38
10-shot image generation	PASCAL VOC 2012 val	Mean IoU	64.5	SEAM-ResNet-38
