Xiankai Lu, Wenguan Wang, Chao Ma, Jianbing Shen, Ling Shao, Fatih Porikli
We introduce a novel network, called CO-attention Siamese Network (COSNet), to address the unsupervised video object segmentation task from a holistic view. We emphasize the importance of the inherent correlation among video frames and incorporate a global co-attention mechanism to further improve the state-of-the-art deep learning based solutions, which primarily focus on learning discriminative foreground representations over appearance and motion in short-term temporal segments. The co-attention layers in our network provide efficient and competent stages for capturing global correlations and scene context by jointly computing and appending co-attention responses into a joint feature space. We train COSNet with pairs of video frames, which naturally augments the training data and allows increased learning capacity. During the segmentation stage, the co-attention model encodes useful information by processing multiple reference frames together, which is leveraged to better infer the frequently reappearing and salient foreground objects. We propose a unified and end-to-end trainable framework in which different co-attention variants can be derived for mining the rich context within videos. Extensive experiments over three large benchmarks show that COSNet outperforms the current alternatives by a large margin.
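The core idea of the co-attention mechanism described above can be sketched with a minimal, framework-free example. This is an illustrative NumPy sketch of a vanilla co-attention step between two flattened frame feature maps, not the authors' implementation: the affinity matrix, its bidirectional softmax normalization, and the concatenation into a joint feature space are the only parts taken from the abstract; all names and shapes are assumptions.

```python
import numpy as np

def softmax(x, axis):
    # Numerically stable softmax along the given axis.
    e = np.exp(x - x.max(axis=axis, keepdims=True))
    return e / e.sum(axis=axis, keepdims=True)

def co_attention(Fa, Fb, W):
    """Vanilla co-attention between two frame feature maps (illustrative sketch).

    Fa, Fb: (C, N) features of two frames, flattened over N spatial positions.
    W:      (C, C) learnable weight defining the affinity S = Fa^T W Fb.
    Returns attention-enhanced features for each frame.
    """
    S = Fa.T @ W @ Fb                  # (N, N) affinity between all position pairs
    Za = Fb @ softmax(S, axis=1).T     # Fb summarized for each position of Fa
    Zb = Fa @ softmax(S, axis=0)       # Fa summarized for each position of Fb
    # Append the co-attention responses to the original features,
    # forming a joint feature space for each frame.
    Xa = np.concatenate([Za, Fa], axis=0)   # (2C, N)
    Xb = np.concatenate([Zb, Fb], axis=0)   # (2C, N)
    return Xa, Xb
```

In this sketch, training on frame pairs corresponds to sampling two frames of the same video as `Fa` and `Fb`; at inference, one frame can be co-attended with several reference frames and the responses aggregated.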
| Task | Dataset | Metric | Value | Model |
|---|---|---|---|---|
| Medical Image Segmentation | SUN-SEG-Easy (Unseen) | Dice | 0.596 | COSNet |
| Medical Image Segmentation | SUN-SEG-Easy (Unseen) | S-measure | 0.654 | COSNet |
| Medical Image Segmentation | SUN-SEG-Easy (Unseen) | Sensitivity | 0.359 | COSNet |
| Medical Image Segmentation | SUN-SEG-Easy (Unseen) | mean E-measure | 0.6 | COSNet |
| Medical Image Segmentation | SUN-SEG-Easy (Unseen) | mean F-measure | 0.496 | COSNet |
| Medical Image Segmentation | SUN-SEG-Easy (Unseen) | weighted F-measure | 0.431 | COSNet |
| Medical Image Segmentation | SUN-SEG-Hard (Unseen) | Dice | 0.606 | COSNet |
| Medical Image Segmentation | SUN-SEG-Hard (Unseen) | S-measure | 0.67 | COSNet |
| Medical Image Segmentation | SUN-SEG-Hard (Unseen) | Sensitivity | 0.38 | COSNet |
| Medical Image Segmentation | SUN-SEG-Hard (Unseen) | mean E-measure | 0.627 | COSNet |
| Medical Image Segmentation | SUN-SEG-Hard (Unseen) | mean F-measure | 0.506 | COSNet |
| Medical Image Segmentation | SUN-SEG-Hard (Unseen) | weighted F-measure | 0.443 | COSNet |
| Video Object Segmentation | DAVIS 2016 val | F | 79.4 | COSNet |
| Video Object Segmentation | DAVIS 2016 val | G | 80 | COSNet |
| Video Object Segmentation | DAVIS 2016 val | J | 80.5 | COSNet |
| Video Object Segmentation | YouTube-Objects | J | 70.5 | COSNet |
| Video Object Segmentation | FBMS test | J | 75.6 | COSNet |