Papers With Code 2

A community resource for machine learning research: papers, code, benchmarks, and state-of-the-art results.


Data sourced from the PWC Archive (CC-BY-SA 4.0). Built by the community, for the community.


Semantics-Guided Neural Networks for Efficient Skeleton-Based Human Action Recognition

Pengfei Zhang, Cuiling Lan, Wen-Jun Zeng, Junliang Xing, Jianru Xue, Nanning Zheng

Published: 2019-04-02 · CVPR 2020
Tasks: Skeleton Based Action Recognition · Action Recognition · Temporal Action Localization
Links: Paper · PDF · Code (official) · Code

Abstract

Skeleton-based human action recognition has attracted great interest thanks to the easy accessibility of human skeleton data. Recently, there has been a trend of using very deep feedforward neural networks to model the 3D coordinates of joints without considering the computational efficiency. In this paper, we propose a simple yet effective semantics-guided neural network (SGN) for skeleton-based action recognition. We explicitly introduce the high-level semantics of joints (joint type and frame index) into the network to enhance the feature representation capability. In addition, we exploit the relationship of joints hierarchically through two modules, i.e., a joint-level module for modeling the correlations of joints in the same frame and a frame-level module for modeling the dependencies of frames by taking the joints in the same frame as a whole. A strong baseline is proposed to facilitate the study of this field. With an order of magnitude smaller model size than most previous works, SGN achieves state-of-the-art performance on the NTU60, NTU120, and SYSU datasets. The source code is available at https://github.com/microsoft/SGN.
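The abstract's two key ideas, injecting joint-type and frame-index semantics into each joint's features and then modeling joints within a frame before modeling frames as wholes, can be illustrated with a minimal numpy sketch. This is a hypothetical toy illustration of the general idea, not the official SGN implementation (which uses learned embeddings and graph-convolution layers); all names and sizes here are assumptions.

```python
import numpy as np

# Toy sizes: T frames, J joints per frame (illustrative only, not from the paper).
T, J = 4, 5
coords = np.random.rand(T, J, 3)   # 3D joint coordinates, shape (T, J, 3)

# Semantics: one-hot joint-type and frame-index codes (the paper learns
# embeddings for these; one-hot vectors stand in for them here).
joint_type = np.eye(J)             # (J, J)
frame_index = np.eye(T)            # (T, T)

# Broadcast the semantics to every (frame, joint) position and concatenate
# them onto the coordinates, as the abstract describes.
jt = np.broadcast_to(joint_type, (T, J, J))
fi = np.broadcast_to(frame_index[:, None, :], (T, J, T))
feats = np.concatenate([coords, jt, fi], axis=-1)   # (T, J, 3 + J + T)

def joint_level(x):
    # Joint-level module: softmax affinity between joints of the same frame,
    # used to mix joint features (a stand-in for the paper's GCN layers).
    aff = x @ x.transpose(0, 2, 1)                  # (T, J, J) similarities
    aff = np.exp(aff - aff.max(-1, keepdims=True))
    aff /= aff.sum(-1, keepdims=True)
    return aff @ x

# Frame-level module: treat each frame's joints as a whole by pooling over
# joints, then pool over time for a clip-level feature.
frame_feats = joint_level(feats).max(axis=1)        # (T, 3 + J + T)
clip_feat = frame_feats.max(axis=0)                 # (3 + J + T,) = (12,)
print(clip_feat.shape)
```

The point of the sketch is the feature layout: each joint carries its coordinates plus explicit "which joint" and "which frame" codes, which is what lets a compact network exploit skeleton structure without great depth.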

Results

Task                          | Dataset   | Metric        | Value | Model
Video                         | NTU RGB+D | Accuracy (CS) | 89    | SGN
Video                         | NTU RGB+D | Accuracy (CV) | 94.5  | SGN
Temporal Action Localization  | NTU RGB+D | Accuracy (CS) | 89    | SGN
Temporal Action Localization  | NTU RGB+D | Accuracy (CV) | 94.5  | SGN
Zero-Shot Learning            | NTU RGB+D | Accuracy (CS) | 89    | SGN
Zero-Shot Learning            | NTU RGB+D | Accuracy (CV) | 94.5  | SGN
Activity Recognition          | NTU RGB+D | Accuracy (CS) | 89    | SGN
Activity Recognition          | NTU RGB+D | Accuracy (CV) | 94.5  | SGN
Action Localization           | NTU RGB+D | Accuracy (CS) | 89    | SGN
Action Localization           | NTU RGB+D | Accuracy (CV) | 94.5  | SGN
Action Detection              | NTU RGB+D | Accuracy (CS) | 89    | SGN
Action Detection              | NTU RGB+D | Accuracy (CV) | 94.5  | SGN
3D Action Recognition         | NTU RGB+D | Accuracy (CS) | 89    | SGN
3D Action Recognition         | NTU RGB+D | Accuracy (CV) | 94.5  | SGN
Action Recognition            | NTU RGB+D | Accuracy (CS) | 89    | SGN
Action Recognition            | NTU RGB+D | Accuracy (CV) | 94.5  | SGN

Related Papers

A Real-Time System for Egocentric Hand-Object Interaction Detection in Industrial Domains (2025-07-17)
DVFL-Net: A Lightweight Distilled Video Focal Modulation Network for Spatio-Temporal Action Recognition (2025-07-16)
Zero-shot Skeleton-based Action Recognition with Prototype-guided Feature Alignment (2025-07-01)
EgoAdapt: Adaptive Multisensory Distillation and Policy Learning for Efficient Egocentric Perception (2025-06-26)
Feature Hallucination for Self-supervised Action Recognition (2025-06-25)
CARMA: Context-Aware Situational Grounding of Human-Robot Group Interactions by Combining Vision-Language Models with Object and Action Recognition (2025-06-25)
Including Semantic Information via Word Embeddings for Skeleton-based Action Recognition (2025-06-23)
Adapting Vision-Language Models for Evaluating World Models (2025-06-22)