Semantic Video Segmentation by Gated Recurrent Flow Propagation

David Nilsson, Cristian Sminchisescu

2016-12-28 | CVPR 2018
Tasks: Optical Flow Estimation, Segmentation, Semantic Segmentation, Video Segmentation, Video Semantic Segmentation

Abstract

Semantic video segmentation is challenging due to the sheer amount of data that needs to be processed and labeled in order to construct accurate models. In this paper we present a deep, end-to-end trainable approach to video segmentation that can leverage information present in unlabeled data to improve semantic estimates. Our model combines a convolutional architecture with a spatio-temporal transformer recurrent layer that temporally propagates labeling information by means of optical flow, adaptively gated based on its locally estimated uncertainty. The flow, the recognition and the gated temporal propagation modules can be trained jointly, end-to-end. The temporal, gated recurrent flow propagation component of our model can be plugged into any static semantic segmentation architecture, turning it into a weakly supervised video processing one. Our extensive experiments on the challenging Cityscapes and CamVid datasets, based on multiple deep architectures, indicate that the resulting model can leverage unlabeled temporal frames, next to a labeled one, to improve both the video segmentation accuracy and the consistency of its temporal labeling, at no additional annotation cost and with little extra computation.
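
The idea of propagating per-pixel class scores along optical flow, with a gate that trusts the propagated scores less where the flow looks unreliable, can be illustrated with a small sketch. This is not the authors' implementation: the function names, the nearest-neighbour warp, and the hand-set exponential gate on the photometric warp error are all assumptions made for illustration, whereas the model described in the abstract learns its gating and is trained end-to-end.

```python
import math

def warp(prev, flow):
    """Backward-warp a 2D grid: out[y][x] = prev[y + dy][x + dx], border-clamped.

    flow[y][x] = (dy, dx) points from a pixel in the current frame to its
    source location in the previous frame (nearest-neighbour lookup).
    """
    h, w = len(prev), len(prev[0])
    out = []
    for y in range(h):
        row = []
        for x in range(w):
            dy, dx = flow[y][x]
            sy = min(max(y + int(round(dy)), 0), h - 1)
            sx = min(max(x + int(round(dx)), 0), w - 1)
            row.append(prev[sy][sx])
        out.append(row)
    return out

def gate_from_warp_error(cur_img, prev_img, flow, sharpness=10.0):
    """Hypothetical gate in (0, 1]: close to 1 where the warped previous
    image matches the current image, close to 0 where it does not."""
    warped = warp(prev_img, flow)
    return [[math.exp(-sharpness * abs(c - p)) for c, p in zip(crow, prow)]
            for crow, prow in zip(cur_img, warped)]

def propagate(cur_scores, prev_scores, flow, gate):
    """Blend warped previous class scores with the current frame's scores.

    Where the gate is high the propagated scores dominate; where it is low
    the static per-frame prediction is kept.
    """
    warped = warp(prev_scores, flow)
    return [[[g * p + (1.0 - g) * c for c, p in zip(cs, ps)]
             for cs, ps, g in zip(crow, prow, grow)]
            for crow, prow, grow in zip(cur_scores, warped, gate)]

# Toy 1x3 example: a bright pixel moves one step to the right.
prev_img = [[0.0, 1.0, 0.0]]
cur_img = [[0.5, 0.0, 1.0]]            # pixel 0 disagrees with the warp
flow = [[(0, -1), (0, -1), (0, -1)]]   # each pixel came from its left neighbour

prev_scores = [[[0.9, 0.1], [0.1, 0.9], [0.9, 0.1]]]  # two classes per pixel
cur_scores = [[[0.8, 0.2], [0.7, 0.3], [0.4, 0.6]]]   # noisier static estimate

gate = gate_from_warp_error(cur_img, prev_img, flow)
fused = propagate(cur_scores, prev_scores, flow, gate)
print(gate)
print(fused)
```

In this toy case the gate is 1 at the two pixels where the warped previous image matches the current one, so the confident previous scores are carried over, and it is near 0 at the pixel with a large warp error, so the current frame's own scores are kept there.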

Results

Task	Dataset	Metric	Value	Model
Scene Parsing	Cityscapes val	mIoU	73.6	GRFP [15]
Scene Parsing	CamVid	Mean IoU	67.1	GRFP
Video Semantic Segmentation	Cityscapes val	mIoU	73.6	GRFP [15]
Video Semantic Segmentation	CamVid	Mean IoU	67.1	GRFP
Scene Understanding	Cityscapes val	mIoU	73.6	GRFP [15]
Scene Understanding	CamVid	Mean IoU	67.1	GRFP
2D Semantic Segmentation	Cityscapes val	mIoU	73.6	GRFP [15]
2D Semantic Segmentation	CamVid	Mean IoU	67.1	GRFP
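
For readers unfamiliar with the metric in the table, mean intersection-over-union (mIoU) averages the per-class ratio of correctly labeled pixels to the union of predicted and ground-truth pixels for that class. The sketch below is a minimal illustration over flat label lists; the function name and the choice to skip classes absent from both prediction and ground truth are assumptions. Benchmark evaluations typically accumulate counts over the whole dataset and handle ignored labels, which this sketch does not.

```python
def mean_iou(pred, target, num_classes):
    """Mean IoU over classes that appear in the prediction or the ground truth."""
    ious = []
    for c in range(num_classes):
        inter = sum(1 for p, t in zip(pred, target) if p == c and t == c)
        union = sum(1 for p, t in zip(pred, target) if p == c or t == c)
        if union > 0:  # skip classes absent from both (assumed convention)
            ious.append(inter / union)
    if not ious:
        raise ValueError("no class present in prediction or target")
    return sum(ious) / len(ious)

# Six pixels, three classes: per-class IoU is 1/3, 2/3 and 1/2.
pred = [0, 0, 1, 1, 2, 2]
target = [0, 1, 1, 1, 2, 0]
print(mean_iou(pred, target, num_classes=3))
```

The table reports this quantity as a percentage, so a value of 73.6 corresponds to a mean IoU of 0.736.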
