TokenCut: Segmenting Objects in Images and Videos with Self-supervised Transformer and Normalized Cut

Yangtao Wang, Xi Shen, Yuan Yuan, Yuming Du, Maomao Li, Shell Xu Hu, James L Crowley, Dominique Vaufreydaz

2022-09-01

Tasks: Unsupervised Video Object Segmentation, Unsupervised Saliency Detection, Segmentation, Semantic Segmentation, Object Discovery, Video Object Segmentation, Video Semantic Segmentation, Unsupervised Object Segmentation, Unsupervised Instance Segmentation, Saliency Detection

Abstract

In this paper, we describe a graph-based algorithm that uses the features obtained by a self-supervised transformer to detect and segment salient objects in images and videos. With this approach, the image patches that compose an image or video are organised into a fully connected graph, where the edge between each pair of patches is labeled with a similarity score computed from features learned by the transformer. Detection and segmentation of salient objects is then formulated as a graph-cut problem and solved using the classical Normalized Cut algorithm. Despite the simplicity of this approach, it achieves state-of-the-art results on several common image and video detection and segmentation tasks. For unsupervised object discovery, this approach outperforms the competing approaches by a margin of 6.1%, 5.7%, and 2.6%, respectively, when tested with the VOC07, VOC12, and COCO20K datasets. For the unsupervised saliency detection task in images, this method improves the Intersection over Union (IoU) score by 4.4%, 5.6%, and 5.2% on the ECSSD, DUTS, and DUT-OMRON datasets, respectively, compared to current state-of-the-art techniques. This method also achieves competitive results for unsupervised video object segmentation tasks with the DAVIS, SegTV2, and FBMS datasets.
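
The graph construction and cut described in the abstract can be illustrated with a minimal NumPy sketch. It is not the authors' implementation. It assumes the patch features are already available as an N x d array, uses cosine similarity as the similarity score, and binarises the edges with a threshold. The function name ncut_bipartition, the parameters tau and eps and their default values, and the rule of splitting the second eigenvector at its mean are all illustrative assumptions.

```python
import numpy as np


def ncut_bipartition(features, tau=0.2, eps=1e-5):
    """Split N patch features (N x d array) into two groups.

    Sketch of a spectral relaxation of Normalized Cut.
    tau and eps are illustrative assumptions, not values from the abstract.
    """
    features = np.asarray(features, dtype=float)

    # Cosine similarity between every pair of patches (fully connected graph).
    norms = np.maximum(np.linalg.norm(features, axis=1, keepdims=True), 1e-12)
    unit = features / norms
    sim = unit @ unit.T

    # Assumed edge rule: strong edge if similar, tiny weight otherwise,
    # so that the graph stays fully connected.
    w = np.where(sim > tau, 1.0, eps)

    # Degree of each node and the symmetric normalized Laplacian
    # L_sym = I - D^{-1/2} W D^{-1/2}.
    d = w.sum(axis=1)
    d_inv_sqrt = 1.0 / np.sqrt(d)
    lap_sym = np.eye(len(d)) - d_inv_sqrt[:, None] * w * d_inv_sqrt[None, :]

    # eigh returns eigenvalues in ascending order; column 1 is the
    # eigenvector of the second-smallest eigenvalue.
    _, eigvecs = np.linalg.eigh(lap_sym)

    # Map back to the generalized problem (D - W) x = lambda * D * x.
    second = d_inv_sqrt * eigvecs[:, 1]

    # Assumed split rule: threshold the eigenvector at its mean.
    return second > second.mean()


if __name__ == "__main__":
    # Toy input: three patches pointing one way, five pointing another.
    group_a = [[1.0, 0.0], [0.95, 0.05], [1.0, 0.02]]
    group_b = [[0.0, 1.0], [0.05, 0.95], [0.02, 1.0], [0.0, 0.8], [0.03, 1.0]]
    mask = ncut_bipartition(np.array(group_a + group_b))
    print(mask)
```

The returned boolean mask only separates the two groups. Because the sign of an eigenvector is arbitrary, deciding which side is the salient object needs an additional rule that this sketch does not include.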

Results

Task	Dataset	Metric	Value	Model
Instance Segmentation	SegTrack-v2	mIoU	59.6	TokenCut
Instance Segmentation	FBMS-59	mIoU	60.2	TokenCut
Unsupervised Object Segmentation	SegTrack-v2	mIoU	59.6	TokenCut
Unsupervised Object Segmentation	FBMS-59	mIoU	60.2	TokenCut
Unsupervised Instance Segmentation	COCO val2017	AP	2.4	TokenCut
Unsupervised Instance Segmentation	COCO val2017	AP50	4.8	TokenCut
Unsupervised Instance Segmentation	COCO val2017	AP75	1.9	TokenCut
