InfoSeg: Unsupervised Semantic Image Segmentation with Mutual Information Maximization

Robert Harb, Patrick Knöbelreiter

2021-10-07 · Representation Learning · Unsupervised Semantic Segmentation · Semantic Segmentation · Image Segmentation

Abstract

We propose a novel method for unsupervised semantic image segmentation based on mutual information maximization between local and global high-level image features. The core idea of our work is to leverage recent progress in self-supervised image representation learning. Representation learning methods compute a single high-level feature capturing an entire image. In contrast, we compute multiple high-level features, each capturing the image segments of one particular semantic class. To this end, we propose a novel two-step learning procedure comprising a segmentation step and a mutual information maximization step. In the first step, we segment images based on local and global features. In the second step, we maximize the mutual information between local features and the high-level features of their respective class. For training, we provide solely unlabeled images and start from random network initialization. For quantitative and qualitative evaluation, we use established benchmarks as well as COCO-Persons, a challenging new benchmark that we introduce in this paper. InfoSeg significantly outperforms the current state-of-the-art, e.g., we achieve a relative increase of 26% in the Pixel Accuracy metric on the COCO-Stuff dataset.
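The two-step procedure described in the abstract can be illustrated with a small NumPy sketch. This is not the authors' implementation: the function names `segment` and `infonce_bound`, the dot-product scoring, and the InfoNCE-style bound are all assumptions chosen for illustration, since the abstract does not specify the network, the scoring function, or the mutual information estimator.

```python
import numpy as np

def softmax(x, axis=-1):
    # Numerically stable softmax along the given axis.
    x = x - x.max(axis=axis, keepdims=True)
    e = np.exp(x)
    return e / e.sum(axis=axis, keepdims=True)

def segment(local_feats, global_feats):
    """Step 1 (assumed form): soft-assign each pixel to a class.

    local_feats:  (H, W, D) array, one feature per pixel.
    global_feats: (K, D) array, one high-level feature per class.
    Returns an (H, W, K) array of per-pixel class probabilities.
    """
    scores = local_feats @ global_feats.T  # dot-product scoring is an assumption
    return softmax(scores, axis=-1)

def infonce_bound(local_feats, global_feats, seg):
    """Step 2 objective (assumed form): InfoNCE-style lower bound on the
    mutual information between local features and the global feature of
    their assigned class. The other classes' global features act as
    negatives, which is a simplification made for this sketch.
    """
    d = local_feats.shape[-1]
    k = global_feats.shape[0]
    flat = local_feats.reshape(-1, d)
    labels = seg.reshape(-1, k).argmax(axis=1)        # hard class per pixel
    log_probs = np.log(softmax(flat @ global_feats.T, axis=1))
    picked = log_probs[np.arange(flat.shape[0]), labels]
    return picked.mean() + np.log(k)                  # bound lies in [0, log K]

# Toy usage with random features (hypothetical sizes).
rng = np.random.default_rng(0)
local = rng.normal(size=(4, 5, 8))   # H=4, W=5, D=8
glob = rng.normal(size=(3, 8))       # K=3 classes
seg = segment(local, glob)
bound = infonce_bound(local, glob, seg)
```

In a training loop, one would alternate between computing `seg` and taking a gradient step that increases the bound. The sketch only evaluates the two quantities once and omits the network and the optimizer.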

Results

Task | Dataset | Metric | Value | Model
Semantic Segmentation | COCO-Stuff-15 | Pixel Accuracy | 38.8 | InfoSeg
Semantic Segmentation | COCO-Stuff-3 | Pixel Accuracy | 73.8 | InfoSeg
Semantic Segmentation | Potsdam-3 | Pixel Accuracy | 71.6 | InfoSeg
Semantic Segmentation | COCO-Persons | Pixel Accuracy | 69.6 | InfoSeg
Unsupervised Semantic Segmentation | COCO-Stuff-15 | Pixel Accuracy | 38.8 | InfoSeg
Unsupervised Semantic Segmentation | COCO-Stuff-3 | Pixel Accuracy | 73.8 | InfoSeg
Unsupervised Semantic Segmentation | Potsdam-3 | Pixel Accuracy | 71.6 | InfoSeg
Unsupervised Semantic Segmentation | COCO-Persons | Pixel Accuracy | 69.6 | InfoSeg
10-shot image generation | COCO-Stuff-15 | Pixel Accuracy | 38.8 | InfoSeg
10-shot image generation | COCO-Stuff-3 | Pixel Accuracy | 73.8 | InfoSeg
10-shot image generation | Potsdam-3 | Pixel Accuracy | 71.6 | InfoSeg
10-shot image generation | COCO-Persons | Pixel Accuracy | 69.6 | InfoSeg
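For readers unfamiliar with the Pixel Accuracy metric, here is a minimal sketch of how it might be computed for an unsupervised method. It assumes the common protocol in which predicted cluster ids carry no inherent meaning and are first matched one-to-one to ground-truth classes so that the number of correctly labeled pixels is maximal; the page itself does not state the exact protocol. The function name `pixel_accuracy` and the toy labels are hypothetical.

```python
from itertools import permutations

def pixel_accuracy(pred, target, num_classes):
    """Fraction of pixels labeled correctly under the best one-to-one
    mapping from predicted cluster ids to ground-truth class ids.

    pred, target: flat sequences of integer labels in [0, num_classes).
    """
    # confusion[p][t] counts pixels predicted as cluster p with true class t.
    confusion = [[0] * num_classes for _ in range(num_classes)]
    for p, t in zip(pred, target):
        confusion[p][t] += 1

    # Brute-force search over all cluster-to-class assignments.
    # This is only practical for a handful of classes.
    best = max(
        sum(confusion[p][perm[p]] for p in range(num_classes))
        for perm in permutations(range(num_classes))
    )
    return best / len(target)

# Toy example: clusters 0 and 1 are swapped relative to the ground truth.
pred = [0, 0, 1, 1, 2, 2]
target = [1, 1, 0, 0, 2, 0]
acc = pixel_accuracy(pred, target, num_classes=3)  # 5 of 6 pixels match
```

The brute-force search is fine for three classes but infeasible for fifteen. A Hungarian-algorithm solver such as `scipy.optimize.linear_sum_assignment` applied to the negated confusion matrix would be the usual replacement.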

Related Papers

SeC: Advancing Complex Video Object Segmentation via Progressive Concept Construction (2025-07-21)
Touch in the Wild: Learning Fine-Grained Manipulation with a Portable Visuo-Tactile Gripper (2025-07-20)
Spectral Bellman Method: Unifying Representation and Exploration in RL (2025-07-17)
Boosting Team Modeling through Tempo-Relational Representation Learning (2025-07-17)
DiffOSeg: Omni Medical Image Segmentation via Multi-Expert Collaboration Diffusion Model (2025-07-17)
SCORE: Scene Context Matters in Open-Vocabulary Remote Sensing Instance Segmentation (2025-07-17)
Unified Medical Image Segmentation with State Space Modeling Snake (2025-07-17)
A Privacy-Preserving Semantic-Segmentation Method Using Domain-Adaptation Technique (2025-07-17)