
Treating Motion as Option with Output Selection for Unsupervised Video Object Segmentation

Suhwan Cho, Minhyeok Lee, Jungho Lee, MyeongAh Cho, Sangyoun Lee

Published: 2023-09-26
Tasks: Unsupervised Video Object Segmentation · Optical Flow Estimation · Semantic Segmentation · Video Object Segmentation · Video Semantic Segmentation
Links: Paper · PDF · Code (official)

Abstract

Unsupervised video object segmentation (VOS) aims to detect the most salient object in a video without external guidance about the object. To leverage the property that salient objects usually exhibit distinctive movements compared to the background, recent methods combine motion cues extracted from optical flow maps with appearance cues extracted from RGB images. However, because optical flow maps are usually highly correlated with segmentation masks, the network easily becomes overly dependent on the motion cues during training. As a result, such two-stream approaches are vulnerable to confusing motion cues, which makes their predictions unstable. To mitigate this issue, we design a novel motion-as-option network that treats motion cues as optional. During training, RGB images are randomly provided to the motion encoder instead of optical flow maps, implicitly reducing the network's dependency on motion. Because the learned motion encoder can handle both RGB images and optical flow maps, two different predictions can be generated depending on which source is used as the motion input. To fully exploit this property, we also propose an adaptive output selection algorithm that adopts the optimal prediction at test time. Our approach achieves state-of-the-art performance on all public benchmark datasets while maintaining real-time inference speed.
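
The sketch below illustrates the two ideas described in the abstract: randomly feeding RGB frames to the motion encoder during training so the network does not over-rely on optical flow, and producing two predictions at test time so the better one can be selected. This is a minimal, hypothetical reading of the abstract, not the authors' released code; all names (MotionAsOptionNet, p_rgb, score_fn, the encoder/decoder modules) are placeholders, and the real output selection criterion in the paper is more involved than the simple scoring heuristic used here.

import random
import torch
import torch.nn as nn

class MotionAsOptionNet(nn.Module):
    """Two-stream VOS network where the motion stream treats flow as optional (sketch)."""
    def __init__(self, appearance_encoder: nn.Module, motion_encoder: nn.Module,
                 decoder: nn.Module, p_rgb: float = 0.5):
        super().__init__()
        self.appearance_encoder = appearance_encoder  # encodes RGB frames
        self.motion_encoder = motion_encoder          # encodes flow maps *or* RGB frames
        self.decoder = decoder                        # fuses both streams into a mask
        self.p_rgb = p_rgb                            # hypothetical probability of swapping flow for RGB

    def forward(self, rgb: torch.Tensor, flow: torch.Tensor) -> torch.Tensor:
        appearance_feats = self.appearance_encoder(rgb)
        # Motion-as-option: during training, randomly replace the flow map with the
        # RGB frame so the motion encoder cannot rely on flow always being present.
        if self.training and random.random() < self.p_rgb:
            motion_input = rgb
        else:
            motion_input = flow
        motion_feats = self.motion_encoder(motion_input)
        return self.decoder(appearance_feats, motion_feats)

@torch.no_grad()
def predict_with_output_selection(model: MotionAsOptionNet, rgb, flow, score_fn):
    """Test-time sketch: the motion encoder accepts either source, so two masks can be
    produced; score_fn (a hypothetical confidence heuristic) picks the better one."""
    model.eval()
    mask_from_flow = model(rgb, flow)  # prediction using optical flow as motion input
    mask_from_rgb = model(rgb, rgb)    # prediction using the RGB frame as motion input
    return mask_from_flow if score_fn(mask_from_flow) >= score_fn(mask_from_rgb) else mask_from_rgb

The design point this tries to capture is that a single motion encoder is trained to accept both input types, which is what makes the dual-prediction selection possible at inference time.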

Results

Task | Dataset | Metric | Value | Model
Video | YouTube-Objects | J | 73.5 | TMO++ (MiT-b1, MS)
Video | YouTube-Objects | J | 73.1 | TMO++ (RN-101)
Video | YouTube-Objects | J | 73.0 | TMO++ (MiT-b1)
Video | FBMS test | J | 83.2 | TMO++ (MiT-b1)
Video | FBMS test | J | 81.2 | TMO++ (RN-101)
Video Object Segmentation | YouTube-Objects | J | 73.5 | TMO++ (MiT-b1, MS)
Video Object Segmentation | YouTube-Objects | J | 73.1 | TMO++ (RN-101)
Video Object Segmentation | YouTube-Objects | J | 73.0 | TMO++ (MiT-b1)
Video Object Segmentation | FBMS test | J | 83.2 | TMO++ (MiT-b1)
Video Object Segmentation | FBMS test | J | 81.2 | TMO++ (RN-101)

Related Papers

SeC: Advancing Complex Video Object Segmentation via Progressive Concept Construction (2025-07-21)
Channel-wise Motion Features for Efficient Motion Segmentation (2025-07-17)
DiffOSeg: Omni Medical Image Segmentation via Multi-Expert Collaboration Diffusion Model (2025-07-17)
SCORE: Scene Context Matters in Open-Vocabulary Remote Sensing Instance Segmentation (2025-07-17)
Unified Medical Image Segmentation with State Space Modeling Snake (2025-07-17)
A Privacy-Preserving Semantic-Segmentation Method Using Domain-Adaptation Technique (2025-07-17)
SAMST: A Transformer framework based on SAM pseudo label filtering for remote sensing semi-supervised semantic segmentation (2025-07-16)
Tomato Multi-Angle Multi-Pose Dataset for Fine-Grained Phenotyping (2025-07-15)