Papers With Code 2

A community resource for machine learning research: papers, code, benchmarks, and state-of-the-art results.


Data sourced from the PWC Archive (CC-BY-SA 4.0). Built by the community, for the community.


Self-Supervised Learning for Semi-Supervised Temporal Action Proposal

Xiang Wang, Shiwei Zhang, Zhiwu Qing, Yuanjie Shao, Changxin Gao, Nong Sang

2021-04-07 · CVPR 2021
Tasks: Self-Supervised Learning · Semi-Supervised Action Detection · Temporal Action Localization
Links: Paper · PDF · Code (official)

Abstract

Self-supervised learning has shown remarkable performance in utilizing unlabeled data for various video tasks. In this paper, we focus on applying the power of self-supervised methods to improve semi-supervised action proposal generation. In particular, we design an effective Self-supervised Semi-supervised Temporal Action Proposal (SSTAP) framework. SSTAP contains two crucial branches, i.e., a temporal-aware semi-supervised branch and a relation-aware self-supervised branch. The semi-supervised branch improves the proposal model by introducing two temporal perturbations, i.e., temporal feature shift and temporal feature flip, in the mean teacher framework. The self-supervised branch defines two pretext tasks, namely masked feature reconstruction and clip-order prediction, to learn the relation of temporal clues. In this way, SSTAP can better explore unlabeled videos and improve the discriminative ability of the learned action features. We extensively evaluate the proposed SSTAP on the THUMOS14 and ActivityNet v1.3 datasets. The experimental results demonstrate that SSTAP significantly outperforms state-of-the-art semi-supervised methods and even matches fully-supervised methods. Code is available at https://github.com/wangxiang1230/SSTAP.
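The two temporal perturbations named in the abstract operate on a clip-level feature sequence. A minimal sketch, assuming features of shape (T, C) and a simple roll-based shift; the function names and shift amount are illustrative assumptions, not the authors' exact implementation:

```python
import numpy as np

def temporal_feature_shift(feats: np.ndarray, shift: int = 1) -> np.ndarray:
    """Roll the feature sequence along the temporal axis by `shift` steps.

    Hypothetical sketch of the 'temporal feature shift' perturbation.
    """
    return np.roll(feats, shift, axis=0)

def temporal_feature_flip(feats: np.ndarray) -> np.ndarray:
    """Reverse the temporal order of the feature sequence.

    Hypothetical sketch of the 'temporal feature flip' perturbation.
    """
    return feats[::-1].copy()

# Toy feature sequence: T = 8 temporal steps, C = 4 channels.
T, C = 8, 4
feats = np.arange(T * C, dtype=np.float32).reshape(T, C)

shifted = temporal_feature_shift(feats, shift=2)
flipped = temporal_feature_flip(feats)

# Both perturbations preserve the (T, C) shape, so the student and
# teacher networks in a mean-teacher setup can consume them unchanged.
assert shifted.shape == feats.shape and flipped.shape == feats.shape
```

In a mean-teacher setup, such perturbed views would be fed to the student while the teacher sees the clean sequence, with a consistency loss between the two; that wiring is omitted here.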

Results

Task: Temporal Action Localization
Dataset: ActivityNet-1.3
Model: SSTAP@100%

Metric          Value
mAP             34.48
mAP IOU@0.5     50.72
mAP IOU@0.75    35.28
mAP IOU@0.95    7.87
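The mAP IOU@τ metrics above count a predicted segment as correct when its temporal IoU with a ground-truth segment is at least τ. A minimal sketch of temporal IoU between two (start, end) segments, for illustration only:

```python
def temporal_iou(a: tuple[float, float], b: tuple[float, float]) -> float:
    """Temporal IoU of two segments given as (start, end) in seconds."""
    inter = max(0.0, min(a[1], b[1]) - max(a[0], b[0]))
    union = max(a[1], b[1]) - min(a[0], b[0])
    return inter / union if union > 0 else 0.0

# Segments overlapping for 5 s out of a 15 s union.
print(temporal_iou((0.0, 10.0), (5.0, 15.0)))  # 5 / 15
```

On ActivityNet-1.3 the headline mAP is conventionally averaged over multiple IoU thresholds, which is why the single-threshold columns (e.g. IOU@0.5 at 50.72) differ from the 34.48 average.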

Related Papers

A Semi-Supervised Learning Method for the Identification of Bad Exposures in Large Imaging Surveys (2025-07-17)
DVFL-Net: A Lightweight Distilled Video Focal Modulation Network for Spatio-Temporal Action Recognition (2025-07-16)
Self-supervised Learning on Camera Trap Footage Yields a Strong Universal Face Embedder (2025-07-14)
Speech Quality Assessment Model Based on Mixture of Experts: System-Level Performance Enhancement and Utterance-Level Challenge Analysis (2025-07-08)
World4Drive: End-to-End Autonomous Driving via Intention-aware Physical Latent World Model (2025-07-01)
ShapeEmbed: a self-supervised learning framework for 2D contour quantification (2025-07-01)
RetFiner: A Vision-Language Refinement Scheme for Retinal Foundation Models (2025-06-27)
Boosting Generative Adversarial Transferability with Self-supervised Vision Transformer Features (2025-06-26)