Pixel-Level Matching for Video Object Segmentation using Convolutional Neural Networks

Jae Shin Yoon, Francois Rameau, Junsik Kim, Seokju Lee, Seunghak Shin, In So Kweon

2017-08-17
ICCV 2017 10
Tasks: Visual Object Tracking, Semi-Supervised Video Object Segmentation, Semantic Segmentation, Video Object Segmentation, Video Semantic Segmentation

Abstract

We propose a novel video object segmentation algorithm based on pixel-level matching using Convolutional Neural Networks (CNNs). Our network distinguishes the target area from the background on the basis of the pixel-level similarity between two object units. The proposed network represents a target object using features from layers at different depths, in order to exploit both spatial details and category-level semantic information. Furthermore, we propose a feature compression technique that drastically reduces memory requirements while preserving the representational capacity of the features. Two-stage training (pre-training and fine-tuning) allows our network to handle any target object regardless of its category (even if the object's type does not appear in the pre-training data) and regardless of variations in its appearance through a video sequence. Experiments on large datasets demonstrate the effectiveness of our model compared with related methods in terms of accuracy, speed, and stability. Finally, we demonstrate the transferability of our network to different domains, such as the infrared data domain.
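
To make the idea of pixel-level similarity concrete, here is a minimal sketch that labels each pixel of a query feature map by comparing it with feature vectors taken from the target object. It is not the network described above. Cosine similarity stands in for the learned matching, and the names `cosine`, `match_pixels`, and the `threshold` value are hypothetical, chosen only for illustration.

```python
import math


def cosine(u, v):
    """Cosine similarity between two feature vectors (0.0 if either is all zeros)."""
    dot = sum(a * b for a, b in zip(u, v))
    norm_u = math.sqrt(sum(a * a for a in u))
    norm_v = math.sqrt(sum(b * b for b in v))
    if norm_u == 0.0 or norm_v == 0.0:
        return 0.0
    return dot / (norm_u * norm_v)


def match_pixels(query_feats, target_feats, threshold=0.5):
    """Label each query pixel 1 (target) or 0 (background).

    query_feats:  H x W grid (nested lists) of per-pixel feature vectors.
    target_feats: list of feature vectors sampled from the target object.
    threshold:    assumed cut-off on the best similarity; a learned model
                  would replace this hand-set value.
    """
    return [
        [
            1 if max(cosine(f, t) for t in target_feats) > threshold else 0
            for f in row
        ]
        for row in query_feats
    ]


# Toy 2 x 2 feature map with 2-dimensional features (made-up values).
target = [[1.0, 0.0]]
query = [
    [[0.9, 0.1], [0.0, 1.0]],
    [[1.0, 0.2], [0.1, 0.9]],
]
mask = match_pixels(query, target)
```

In this toy input the left column points in nearly the same direction as the target feature, so those two pixels are labelled as target and the right column as background.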

Results

Task	Dataset	Metric	Value	Model
Video	DAVIS 2016	F-measure (Decay)	14.7	PLM
Video	DAVIS 2016	F-measure (Mean)	62.5	PLM
Video	DAVIS 2016	F-measure (Recall)	73.2	PLM
Video	DAVIS 2016	J&F	66.35	PLM
Video	DAVIS 2016	Jaccard (Decay)	11.2	PLM
Video	DAVIS 2016	Jaccard (Mean)	70.2	PLM
Video	DAVIS 2016	Jaccard (Recall)	86.3	PLM
Video Object Segmentation	DAVIS 2016	F-measure (Decay)	14.7	PLM
Video Object Segmentation	DAVIS 2016	F-measure (Mean)	62.5	PLM
Video Object Segmentation	DAVIS 2016	F-measure (Recall)	73.2	PLM
Video Object Segmentation	DAVIS 2016	J&F	66.35	PLM
Video Object Segmentation	DAVIS 2016	Jaccard (Decay)	11.2	PLM
Video Object Segmentation	DAVIS 2016	Jaccard (Mean)	70.2	PLM
Video Object Segmentation	DAVIS 2016	Jaccard (Recall)	86.3	PLM
Semi-Supervised Video Object Segmentation	DAVIS 2016	F-measure (Decay)	14.7	PLM
Semi-Supervised Video Object Segmentation	DAVIS 2016	F-measure (Mean)	62.5	PLM
Semi-Supervised Video Object Segmentation	DAVIS 2016	F-measure (Recall)	73.2	PLM
Semi-Supervised Video Object Segmentation	DAVIS 2016	J&F	66.35	PLM
Semi-Supervised Video Object Segmentation	DAVIS 2016	Jaccard (Decay)	11.2	PLM
Semi-Supervised Video Object Segmentation	DAVIS 2016	Jaccard (Mean)	70.2	PLM
Semi-Supervised Video Object Segmentation	DAVIS 2016	Jaccard (Recall)	86.3	PLM
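
The J&F value in the table is consistent with the average of Jaccard (Mean) and F-measure (Mean): (70.2 + 62.5) / 2 = 66.35. The sketch below shows that arithmetic together with a plain intersection-over-union computation on binary masks. The function names are hypothetical, and treating two empty masks as a perfect match is an assumption of this sketch, not something stated on this page.

```python
def jaccard(pred, gt):
    """Intersection over union of two binary masks given as nested lists of 0/1."""
    inter = 0
    union = 0
    for pred_row, gt_row in zip(pred, gt):
        for p, g in zip(pred_row, gt_row):
            inter += 1 if (p and g) else 0
            union += 1 if (p or g) else 0
    # Assumption: two empty masks count as a perfect match.
    return 1.0 if union == 0 else inter / union


def j_and_f(j_mean, f_mean):
    """Average of the region score (J mean) and the contour score (F mean)."""
    return (j_mean + f_mean) / 2


# Toy masks: one pixel overlaps, two pixels are covered in total.
pred = [[1, 1], [0, 0]]
gt = [[1, 0], [0, 0]]
iou = jaccard(pred, gt)

# Values taken from the table above (PLM on DAVIS 2016).
combined = j_and_f(70.2, 62.5)
```

Here `iou` is 0.5 for the toy masks, and `combined` reproduces the J&F row of the table up to floating-point rounding.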
