InfoSeg: Unsupervised Semantic Image Segmentation with Mutual Information Maximization

Robert Harb, Patrick Knöbelreiter

2021-10-07
Representation Learning, Unsupervised Semantic Segmentation, Semantic Segmentation, Image Segmentation

Abstract

We propose a novel method for unsupervised semantic image segmentation based on mutual information maximization between local and global high-level image features. The core idea of our work is to leverage recent progress in self-supervised image representation learning. Representation learning methods compute a single high-level feature capturing an entire image. In contrast, we compute multiple high-level features, each capturing the image segments of one particular semantic class. To this end, we propose a novel two-step learning procedure comprising a segmentation step and a mutual information maximization step. In the first step, we segment images based on local and global features. In the second step, we maximize the mutual information between local features and the high-level features of their respective class. For training, we provide solely unlabeled images and start from random network initialization. For quantitative and qualitative evaluation, we use established benchmarks as well as COCO-Persons, which we introduce in this paper as a challenging new benchmark. InfoSeg significantly outperforms the current state of the art; for example, we achieve a relative increase of 26% in the Pixel Accuracy metric on the COCO-Stuff dataset.
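
The two-step procedure described in the abstract can be illustrated with a toy sketch. This is not the authors' implementation: the abstract does not specify the network, the scoring function, or the mutual information estimator. The sketch assumes dot-product scores between local and per-class global features, and uses an InfoNCE-style lower bound as a common stand-in estimator. The names `segment` and `infonce_bound` are hypothetical.

```python
import math


def dot(u, v):
    """Dot product of two equal-length feature vectors."""
    return sum(a * b for a, b in zip(u, v))


def segment(local_feats, global_feats):
    """Step 1 (assumed form): label each local feature with the class
    whose global feature gives the highest dot-product score."""
    return [
        max(range(len(global_feats)), key=lambda k: dot(f, global_feats[k]))
        for f in local_feats
    ]


def infonce_bound(local_feats, global_feats, labels):
    """Step 2 (stand-in estimator): InfoNCE-style lower bound on the mutual
    information between local features and the global feature of their
    assigned class. The other classes' global features act as negatives."""
    total = 0.0
    for f, k in zip(local_feats, labels):
        scores = [dot(f, g) for g in global_feats]
        m = max(scores)  # subtract the max for numerical stability
        log_norm = m + math.log(sum(math.exp(s - m) for s in scores))
        total += scores[k] - log_norm  # log-softmax of the positive pair
    return total / len(local_feats) + math.log(len(global_feats))


# Toy data: three local features (e.g. three pixels) and two class features.
local_feats = [[1.0, 0.0], [0.0, 1.0], [0.9, 0.1]]
global_feats = [[1.0, 0.0], [0.0, 1.0]]

labels = segment(local_feats, global_feats)
bound = infonce_bound(local_feats, global_feats, labels)
```

In a training loop, the features would come from a network and the bound would be maximized by gradient ascent; here the features are fixed so that the two steps can be read in isolation.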

Results

Task	Dataset	Metric	Value	Model
Semantic Segmentation	COCO-Stuff-15	Pixel Accuracy	38.8	InfoSeg
Semantic Segmentation	COCO-Stuff-3	Pixel Accuracy	73.8	InfoSeg
Semantic Segmentation	Potsdam-3	Pixel Accuracy	71.6	InfoSeg
Semantic Segmentation	COCO-Persons	Pixel Accuracy	69.6	InfoSeg
Unsupervised Semantic Segmentation	COCO-Stuff-15	Pixel Accuracy	38.8	InfoSeg
Unsupervised Semantic Segmentation	COCO-Stuff-3	Pixel Accuracy	73.8	InfoSeg
Unsupervised Semantic Segmentation	Potsdam-3	Pixel Accuracy	71.6	InfoSeg
Unsupervised Semantic Segmentation	COCO-Persons	Pixel Accuracy	69.6	InfoSeg
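
For reference, Pixel Accuracy is the fraction of pixels whose predicted label equals the ground-truth label. The sketch below is a minimal illustration, not the evaluation code behind the table. It assumes the predicted segment IDs have already been aligned with the ground-truth class IDs, a step the source does not describe, and the function name `pixel_accuracy` is hypothetical.

```python
def pixel_accuracy(pred, target):
    """Fraction of pixels where the predicted label equals the target label.

    `pred` and `target` are 2-D label maps (lists of rows) of equal shape.
    Assumes predicted IDs are already aligned with ground-truth class IDs.
    """
    if len(pred) != len(target):
        raise ValueError("label maps must have the same number of rows")
    correct = 0
    total = 0
    for pred_row, target_row in zip(pred, target):
        if len(pred_row) != len(target_row):
            raise ValueError("label maps must have the same row lengths")
        correct += sum(p == t for p, t in zip(pred_row, target_row))
        total += len(pred_row)
    return correct / total


# Toy 2x3 label maps: 4 of the 6 pixels agree.
pred = [[0, 0, 1], [1, 2, 2]]
target = [[0, 1, 1], [1, 2, 0]]
acc = pixel_accuracy(pred, target)
```

The values in the table appear to be percentages, so the fraction returned here would be multiplied by 100 to compare against them.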
