Lucid Data Dreaming for Video Object Segmentation

Anna Khoreva, Rodrigo Benenson, Eddy Ilg, Thomas Brox, Bernt Schiele

2017-03-28

Semi-Supervised Video Object Segmentation, Segmentation, Semantic Segmentation, Video Object Segmentation, Object Tracking, Video Semantic Segmentation, Multiple Object Tracking

Abstract

Convolutional networks reach top quality in pixel-level video object segmentation but require a large amount of training data (1k~100k) to deliver such results. We propose a new training strategy which achieves state-of-the-art results across three evaluation datasets while using 20x~1000x less annotated data than competing methods. Our approach is suitable for both single and multiple object segmentation. Instead of using large training sets hoping to generalize across domains, we generate in-domain training data using the provided annotation on the first frame of each video to synthesize ("lucid dream") plausible future video frames. In-domain per-video training data allows us to train high-quality appearance- and motion-based models, as well as tune the post-processing stage. This approach allows us to reach competitive results even when training from only a single annotated frame, without ImageNet pre-training. Our results indicate that using a larger training set is not automatically better, and that for the video object segmentation task a smaller training set that is closer to the target domain is more effective. This changes the mindset regarding how many training samples and general "objectness" knowledge are required for the video object segmentation task.
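To make the idea of synthesizing in-domain training data from a single annotated frame more concrete, here is a minimal sketch in plain Python. It is not the authors' pipeline: the function name `synthesize_pair`, its parameters, the mean-intensity background fill, and the choice of a random translation plus a global brightness change are all illustrative assumptions. It only shows the general pattern of cutting the annotated object out of the first frame, moving it, and compositing it back to obtain a new image with a matching mask.

```python
import random

def synthesize_pair(frame, mask, max_shift=2, gain_range=(0.9, 1.1), seed=None):
    """Create one synthetic (image, mask) pair from a single annotated frame.

    frame: 2D list of grayscale intensities in [0, 255]
    mask:  2D list of 0/1 labels (1 = object), same shape as frame
    All parameters are hypothetical and chosen for illustration only.
    """
    rng = random.Random(seed)
    h, w = len(frame), len(frame[0])
    dy = rng.randint(-max_shift, max_shift)   # vertical object shift
    dx = rng.randint(-max_shift, max_shift)   # horizontal object shift
    gain = rng.uniform(*gain_range)           # global brightness factor

    # Crude background fill (assumption): replace object pixels with the
    # mean background intensity so the object can be moved elsewhere.
    bg = [frame[y][x] for y in range(h) for x in range(w) if not mask[y][x]]
    bg_mean = sum(bg) / len(bg) if bg else 0.0
    new_frame = [[bg_mean if mask[y][x] else frame[y][x] for x in range(w)]
                 for y in range(h)]
    new_mask = [[0] * w for _ in range(h)]

    # Paste the object at its shifted position; pixels that leave the
    # image are dropped.
    for y in range(h):
        for x in range(w):
            if mask[y][x]:
                ny, nx = y + dy, x + dx
                if 0 <= ny < h and 0 <= nx < w:
                    new_frame[ny][nx] = frame[y][x]
                    new_mask[ny][nx] = 1

    # Apply the brightness change and clip to the valid intensity range.
    new_frame = [[min(255.0, max(0.0, v * gain)) for v in row]
                 for row in new_frame]
    return new_frame, new_mask
```

Calling this function repeatedly with different seeds would yield a small per-video training set derived from the first-frame annotation alone, which is the pattern the abstract describes at a high level.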

Results

Task	Dataset	Metric	Value	Model
Semi-Supervised Video Object Segmentation	DAVIS 2016	F-measure (Decay)	9.7	Lucid
Semi-Supervised Video Object Segmentation	DAVIS 2016	F-measure (Mean)	82	Lucid
Semi-Supervised Video Object Segmentation	DAVIS 2016	F-measure (Recall)	88.1	Lucid
Semi-Supervised Video Object Segmentation	DAVIS 2016	J&F	82.95	Lucid
Semi-Supervised Video Object Segmentation	DAVIS 2016	Jaccard (Decay)	9.1	Lucid
Semi-Supervised Video Object Segmentation	DAVIS 2016	Jaccard (Mean)	83.9	Lucid
Semi-Supervised Video Object Segmentation	DAVIS 2016	Jaccard (Recall)	95	Lucid
Semi-Supervised Video Object Segmentation	DAVIS 2017 (test-dev)	F-measure (Decay)	19.5	Lucid
Semi-Supervised Video Object Segmentation	DAVIS 2017 (test-dev)	F-measure (Mean)	69.9	Lucid
Semi-Supervised Video Object Segmentation	DAVIS 2017 (test-dev)	F-measure (Recall)	80.1	Lucid
Semi-Supervised Video Object Segmentation	DAVIS 2017 (test-dev)	J&F	66.6	Lucid
Semi-Supervised Video Object Segmentation	DAVIS 2017 (test-dev)	Jaccard (Decay)	19.5	Lucid
Semi-Supervised Video Object Segmentation	DAVIS 2017 (test-dev)	Jaccard (Mean)	63.4	Lucid
Semi-Supervised Video Object Segmentation	DAVIS 2017 (test-dev)	Jaccard (Recall)	74	Lucid
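For readers unfamiliar with the metrics in the table, the sketch below shows how the Jaccard index (region intersection-over-union) can be computed for a pair of binary masks, and how the combined J&F score relates to the two means. The function names are illustrative, the convention of returning 1.0 when both masks are empty is an assumption, and the contour F-measure, recall, and decay statistics are not implemented here.

```python
def jaccard(pred, gt):
    """Intersection-over-union of two binary masks given as 2D lists."""
    inter = 0
    union = 0
    for pred_row, gt_row in zip(pred, gt):
        for p, g in zip(pred_row, gt_row):
            if p and g:
                inter += 1
            if p or g:
                union += 1
    # Assumed convention: two empty masks count as a perfect match.
    return 1.0 if union == 0 else inter / union

def j_and_f(j_mean, f_mean):
    """Combined score: the average of Jaccard (Mean) and F-measure (Mean)."""
    return (j_mean + f_mean) / 2

# Toy masks: the prediction covers two pixels, one of which is correct.
print(jaccard([[1, 1], [0, 0]], [[1, 0], [0, 0]]))

# DAVIS 2016 row from the table: Jaccard (Mean) 83.9, F-measure (Mean) 82.
print(round(j_and_f(83.9, 82), 2))
```

With the DAVIS 2016 values this average reproduces the reported J&F of 82.95. For DAVIS 2017 (test-dev), the rounded means 63.4 and 69.9 average to 66.65 while the table lists 66.6; the small gap is presumably due to the listed score being computed from unrounded values.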
