Region-based semantic segmentation with end-to-end training

Holger Caesar, Jasper Uijlings, Vittorio Ferrari

2016-07-26 | Segmentation, Semantic Segmentation

Abstract

We propose a novel method for semantic segmentation, the task of labeling each pixel in an image with a semantic class. Our method combines the advantages of the two main competing paradigms. Methods based on region classification offer proper spatial support for appearance measurements, but typically operate in two separate stages, neither of which targets pixel labeling performance at the end of the pipeline. More recent fully convolutional methods are capable of end-to-end training for the final pixel labeling, but resort to fixed patches as spatial support. We show how to modify modern region-based approaches to enable end-to-end training for semantic segmentation. This is achieved via a differentiable region-to-pixel layer and a differentiable free-form Region-of-Interest pooling layer. Our method improves the state of the art in terms of class-average accuracy, reaching 64.0% on SIFT Flow and 49.9% on PASCAL Context, and is particularly accurate at object boundaries.
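To make the idea of a region-to-pixel layer concrete, the following is a minimal NumPy sketch of one way region-level class scores could be mapped to pixel-level scores. The abstract only states that the layer is differentiable; the max-over-covering-regions aggregation used here, the function name `region_to_pixel`, and the toy scores and masks are all assumptions made for illustration, not the authors' implementation.

```python
import numpy as np

def region_to_pixel(region_scores, region_masks):
    """Map region-level class scores to pixel-level class scores.

    region_scores: float array of shape (R, C), one score per region and class.
    region_masks:  bool array of shape (R, H, W), True where a region covers a pixel.
    Returns a float array of shape (C, H, W). For each class and pixel, the value
    is the maximum score over all regions covering that pixel (an assumed
    aggregation rule). Pixels covered by no region receive -inf.
    """
    # Broadcast to (R, C, H, W): keep a region's score where it covers the pixel,
    # and use -inf elsewhere so uncovered regions never win the max.
    masked = np.where(
        region_masks[:, None, :, :],
        region_scores[:, :, None, None],
        -np.inf,
    )
    return masked.max(axis=0)

# Hypothetical toy input: 2 regions, 2 classes, a 2x2 image.
scores = np.array([[0.2, 0.9],
                   [0.7, 0.1]])
masks = np.zeros((2, 2, 2), dtype=bool)
masks[0, :, 0] = True  # region 0 covers the left column
masks[1, 0, :] = True  # region 1 covers the top row

pixel_scores = region_to_pixel(scores, masks)
labels = pixel_scores.argmax(axis=0)  # per-pixel predicted class
```

Because a max is piecewise linear, a gradient defined on the pixel scores flows back to whichever region supplied the maximum at each pixel, which is the kind of property that lets a pixel-level loss train a region-level classifier.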

Results

Task	Dataset	Metric	Value (%)	Model
Semantic Segmentation	SIFT Flow	Mean Accuracy	64.0	RBE2E
Semantic Segmentation	SIFT Flow	Pixel Accuracy	84.3	RBE2E
Semantic Segmentation	PASCAL Context	Mean Accuracy	49.9	RBE2E
Semantic Segmentation	PASCAL Context	Pixel Accuracy	62.4	RBE2E
Semantic Segmentation	PASCAL Context	mIoU	32.5	RBE2E
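
The three metrics in the table can all be derived from a class confusion matrix. The sketch below uses the standard definitions of pixel accuracy, mean (class-average) accuracy, and mean intersection-over-union, on the assumption that these match what the table reports; the function name `segmentation_metrics` and the toy confusion matrix are illustrative only and are not taken from the paper.

```python
import numpy as np

def segmentation_metrics(conf):
    """Compute pixel accuracy, mean class accuracy, and mIoU.

    conf[i, j] counts pixels whose ground-truth class is i and predicted class is j.
    Classes that never occur (0/0) are ignored in the class averages.
    """
    conf = np.asarray(conf, dtype=float)
    tp = np.diag(conf)        # correctly labeled pixels per class
    gt = conf.sum(axis=1)     # ground-truth pixels per class
    pred = conf.sum(axis=0)   # predicted pixels per class

    pixel_acc = tp.sum() / conf.sum()
    with np.errstate(divide="ignore", invalid="ignore"):
        per_class_acc = tp / gt
        iou = tp / (gt + pred - tp)
    return pixel_acc, np.nanmean(per_class_acc), np.nanmean(iou)

# Made-up two-class example: a frequent class and a rare class.
toy_conf = [[90, 10],
            [5, 5]]
pixel_acc, mean_acc, miou = segmentation_metrics(toy_conf)
```

In this toy case the frequent class dominates pixel accuracy, while mean accuracy and mIoU weight both classes equally. That is why a method can rank differently under class-average accuracy than under pixel accuracy.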
