Papers With Code 2

A community resource for machine learning research: papers, code, benchmarks, and state-of-the-art results.


Data sourced from the PWC Archive (CC-BY-SA 4.0). Built by the community, for the community.


Temporal Relational Modeling with Self-Supervision for Action Segmentation

Dong Wang, Di Hu, Xingjian Li, Dejing Dou

2020-12-14 · Action Segmentation · Action Recognition · Action Understanding

Paper · PDF · Code (official)

Abstract

Temporal relational modeling in video is essential for human action understanding tasks such as action recognition and action segmentation. Although Graph Convolution Networks (GCNs) have shown promising advantages in relation reasoning on many tasks, effectively applying them to long video sequences remains a challenge. The main reason is that the large number of nodes (i.e., video frames) makes it hard for GCNs to capture and model temporal relations in videos. To tackle this problem, in this paper we introduce an effective GCN module, the Dilated Temporal Graph Reasoning Module (DTGRM), designed to model temporal relations and dependencies between video frames at various time spans. In particular, we capture and model temporal relations by constructing multi-level dilated temporal graphs in which the nodes represent frames from different moments in the video. Moreover, to enhance the temporal reasoning ability of the proposed model, an auxiliary self-supervised task is introduced that encourages the dilated temporal graph reasoning module to find and correct wrong temporal relations in videos. Our DTGRM model outperforms state-of-the-art action segmentation models on three challenging datasets: 50Salads, Georgia Tech Egocentric Activities (GTEA), and the Breakfast dataset. The code is available at https://github.com/redwang/DTGRM.
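The multi-level dilated temporal graphs described above can be illustrated with a minimal sketch: one adjacency matrix per dilation rate, where frame t is connected to frames t-d and t+d. This is an assumption-laden simplification for intuition, not the paper's actual implementation (which builds learned graphs on frame features); the function name and dilation rates are hypothetical.

```python
import numpy as np

def dilated_temporal_adjacency(num_frames, dilations=(1, 2, 4)):
    """Build one adjacency matrix per dilation rate: frame t is linked to
    frames t - d and t + d, so larger d captures longer-range temporal
    relations. Illustrative only; the dilation rates are hypothetical."""
    adjs = []
    for d in dilations:
        A = np.zeros((num_frames, num_frames), dtype=np.float32)
        for t in range(num_frames):
            A[t, t] = 1.0  # self-loop, standard practice in GCNs
            if t - d >= 0:
                A[t, t - d] = 1.0  # edge to the frame d steps earlier
            if t + d < num_frames:
                A[t, t + d] = 1.0  # edge to the frame d steps later
        adjs.append(A)
    return adjs
```

Stacking GCN layers over these matrices lets each frame aggregate features from progressively wider temporal neighborhoods without the dense all-pairs graph that makes plain GCNs impractical on long videos.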

Results

Task                | Dataset   | Metric     | Value | Model
--------------------|-----------|------------|-------|------
Action Localization | 50 Salads | Acc        | 80    | DTGRM
Action Localization | 50 Salads | Edit       | 72    | DTGRM
Action Localization | 50 Salads | F1@10%     | 79.1  | DTGRM
Action Localization | 50 Salads | F1@25%     | 75.9  | DTGRM
Action Localization | 50 Salads | F1@50%     | 66.1  | DTGRM
Action Localization | Breakfast | Acc        | 68.3  | DTGRM
Action Localization | Breakfast | Average F1 | 59.1  | DTGRM
Action Localization | Breakfast | Edit       | 68.9  | DTGRM
Action Localization | Breakfast | F1@10%     | 68.7  | DTGRM
Action Localization | Breakfast | F1@25%     | 61.9  | DTGRM
Action Localization | Breakfast | F1@50%     | 46.6  | DTGRM
Action Segmentation | 50 Salads | Acc        | 80    | DTGRM
Action Segmentation | 50 Salads | Edit       | 72    | DTGRM
Action Segmentation | 50 Salads | F1@10%     | 79.1  | DTGRM
Action Segmentation | 50 Salads | F1@25%     | 75.9  | DTGRM
Action Segmentation | 50 Salads | F1@50%     | 66.1  | DTGRM
Action Segmentation | Breakfast | Acc        | 68.3  | DTGRM
Action Segmentation | Breakfast | Average F1 | 59.1  | DTGRM
Action Segmentation | Breakfast | Edit       | 68.9  | DTGRM
Action Segmentation | Breakfast | F1@10%     | 68.7  | DTGRM
Action Segmentation | Breakfast | F1@25%     | 61.9  | DTGRM
Action Segmentation | Breakfast | F1@50%     | 46.6  | DTGRM
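The F1@k metrics in the table are the segmental F1 scores commonly used for action segmentation: a predicted segment counts as a true positive if its IoU with an unmatched ground-truth segment of the same label exceeds the threshold k (e.g. 10%). The sketch below follows that standard definition and is an assumption about the evaluation protocol, not code taken from the paper; function names are hypothetical.

```python
def segments_from_labels(labels):
    """Collapse a per-frame label sequence into (label, start, end) segments,
    with end exclusive."""
    segs, start = [], 0
    for i in range(1, len(labels) + 1):
        if i == len(labels) or labels[i] != labels[start]:
            segs.append((labels[start], start, i))
            start = i
    return segs

def f1_at_k(pred, gt, tau=0.1):
    """Segmental F1@tau over per-frame label sequences: a predicted segment
    is a true positive if its IoU with an unmatched same-label ground-truth
    segment exceeds tau (tau=0.1 corresponds to F1@10%)."""
    p_segs, g_segs = segments_from_labels(pred), segments_from_labels(gt)
    matched = [False] * len(g_segs)
    tp = 0
    for pl, ps, pe in p_segs:
        best_iou, best_j = 0.0, -1
        for j, (gl, gs, ge) in enumerate(g_segs):
            if gl != pl or matched[j]:
                continue
            inter = max(0, min(pe, ge) - max(ps, gs))
            union = max(pe, ge) - min(ps, gs)
            iou = inter / union
            if iou > best_iou:
                best_iou, best_j = iou, j
        if best_iou > tau:  # greedy one-to-one matching
            tp += 1
            matched[best_j] = True
    fp = len(p_segs) - tp
    fn = len(g_segs) - tp
    if tp == 0:
        return 0.0
    prec, rec = tp / (tp + fp), tp / (tp + fn)
    return 2 * prec * rec / (prec + rec)
```

Because matching is done at the segment level, F1@k penalizes over-segmentation (many short spurious segments) in a way that plain frame-wise accuracy (Acc) does not, which is why both are reported.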

Related Papers

A Real-Time System for Egocentric Hand-Object Interaction Detection in Industrial Domains (2025-07-17)
Self-supervised pretraining of vision transformers for animal behavioral analysis and neural encoding (2025-07-13)
Zero-shot Skeleton-based Action Recognition with Prototype-guided Feature Alignment (2025-07-01)
EgoAdapt: Adaptive Multisensory Distillation and Policy Learning for Efficient Egocentric Perception (2025-06-26)
LLaVA-Pose: Enhancing Human Pose and Action Understanding via Keypoint-Integrated Instruction Tuning (2025-06-26)
Feature Hallucination for Self-supervised Action Recognition (2025-06-25)
CARMA: Context-Aware Situational Grounding of Human-Robot Group Interactions by Combining Vision-Language Models with Object and Action Recognition (2025-06-25)
Including Semantic Information via Word Embeddings for Skeleton-based Action Recognition (2025-06-23)