Learning Video Object Segmentation from Static Images

Anna Khoreva, Federico Perazzi, Rodrigo Benenson, Bernt Schiele, Alexander Sorkine-Hornung

Date: 2016-12-08
Venue: CVPR 2017
Tasks: Visual Object Tracking, Semi-Supervised Video Object Segmentation, Segmentation, Semantic Segmentation, Video Object Segmentation, Object Tracking, Instance Segmentation, Video Semantic Segmentation


Abstract

Inspired by recent advances of deep learning in instance segmentation and object tracking, we introduce the video object segmentation problem as a concept of guided instance segmentation. Our model proceeds on a per-frame basis, guided by the output of the previous frame towards the object of interest in the next frame. We demonstrate that highly accurate object segmentation in videos can be enabled by using a convnet trained with static images only. The key ingredient of our approach is a combination of offline and online learning strategies, where the former serves to produce a refined mask from the previous frame estimate and the latter captures the appearance of the specific object instance. Our method can handle different types of input annotations (bounding boxes and segments) and can incorporate multiple annotated frames, making the system suitable for diverse applications. We obtain competitive results on three different datasets, independently of the type of input annotation.
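The per-frame, mask-guided procedure described in the abstract can be illustrated with a minimal sketch. This is not the authors' implementation: `toy_refine` is a hypothetical stand-in for the trained convnet (it simply keeps bright pixels that lie near the previous mask), and `dilate` and `propagate` are illustrative names. The sketch leaves out both offline training and online fine-tuning and only shows how each frame's estimate is guided by the previous frame's output.

```python
import numpy as np

def dilate(mask, radius=1):
    """Grow a boolean mask by `radius` pixels using a 4-neighbourhood."""
    out = mask.copy()
    for _ in range(radius):
        padded = np.pad(out, 1, mode="constant", constant_values=False)
        out = (padded[1:-1, 1:-1]      # the pixel itself
               | padded[:-2, 1:-1]     # neighbour above
               | padded[2:, 1:-1]      # neighbour below
               | padded[1:-1, :-2]     # neighbour to the left
               | padded[1:-1, 2:])     # neighbour to the right
    return out

def toy_refine(frame, prev_mask):
    """Hypothetical stand-in for the convnet (assumption, not the paper's model):
    keep bright pixels that fall inside the coarsened previous mask."""
    return (frame > 0.5) & dilate(prev_mask, radius=1)

def propagate(frames, first_mask, refine=toy_refine):
    """Segment a video frame by frame, feeding each mask estimate to the next frame."""
    masks = [first_mask]
    for frame in frames[1:]:
        masks.append(refine(frame, masks[-1]))
    return masks
```

In this sketch the `refine` argument is where a learned model would be plugged in; the loop itself only encodes the idea that the estimate for frame t-1 guides the segmentation of frame t.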

Results

Task	Dataset	Metric	Value	Model
Video	DAVIS 2016	F-measure (Decay)	9	MSK
Video	DAVIS 2016	F-measure (Mean)	75.4	MSK
Video	DAVIS 2016	F-measure (Recall)	87.1	MSK
Video	DAVIS 2016	J&F	77.55	MSK
Video	DAVIS 2016	Jaccard (Decay)	8.9	MSK
Video	DAVIS 2016	Jaccard (Mean)	79.7	MSK
Video	DAVIS 2016	Jaccard (Recall)	93.1	MSK
Video	YouTube	mIoU	0.726	MaskTrack
Video Object Segmentation	DAVIS 2016	F-measure (Decay)	9	MSK
Video Object Segmentation	DAVIS 2016	F-measure (Mean)	75.4	MSK
Video Object Segmentation	DAVIS 2016	F-measure (Recall)	87.1	MSK
Video Object Segmentation	DAVIS 2016	J&F	77.55	MSK
Video Object Segmentation	DAVIS 2016	Jaccard (Decay)	8.9	MSK
Video Object Segmentation	DAVIS 2016	Jaccard (Mean)	79.7	MSK
Video Object Segmentation	DAVIS 2016	Jaccard (Recall)	93.1	MSK
Video Object Segmentation	YouTube	mIoU	0.726	MaskTrack
Semi-Supervised Video Object Segmentation	DAVIS 2016	F-measure (Decay)	9	MSK
Semi-Supervised Video Object Segmentation	DAVIS 2016	F-measure (Mean)	75.4	MSK
Semi-Supervised Video Object Segmentation	DAVIS 2016	F-measure (Recall)	87.1	MSK
Semi-Supervised Video Object Segmentation	DAVIS 2016	J&F	77.55	MSK
Semi-Supervised Video Object Segmentation	DAVIS 2016	Jaccard (Decay)	8.9	MSK
Semi-Supervised Video Object Segmentation	DAVIS 2016	Jaccard (Mean)	79.7	MSK
Semi-Supervised Video Object Segmentation	DAVIS 2016	Jaccard (Recall)	93.1	MSK
Semi-Supervised Video Object Segmentation	YouTube	mIoU	0.726	MaskTrack
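
The reported J&F value is consistent with being the arithmetic mean of Jaccard (Mean) and F-measure (Mean): (79.7 + 75.4) / 2 = 77.55. The sketch below shows that calculation together with a per-frame Jaccard index (intersection over union) for binary masks. The function names are illustrative, and returning 1.0 when both masks are empty is an assumed convention rather than something stated on this page.

```python
import numpy as np

def jaccard(pred, gt):
    """Intersection over union of two binary masks."""
    pred = np.asarray(pred, dtype=bool)
    gt = np.asarray(gt, dtype=bool)
    union = np.logical_or(pred, gt).sum()
    if union == 0:
        # Assumed convention: two empty masks count as a perfect match.
        return 1.0
    return float(np.logical_and(pred, gt).sum() / union)

def j_and_f(j_mean, f_mean):
    """Average of the Jaccard mean and the F-measure mean."""
    return (j_mean + f_mean) / 2.0

# Values taken from the DAVIS 2016 rows of the table above.
print(round(j_and_f(79.7, 75.4), 2))
```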
