Papers With Code 2


Data sourced from the PWC Archive (CC-BY-SA 4.0). Built by the community, for the community.


View Adaptive Recurrent Neural Networks for High Performance Human Action Recognition from Skeleton Data

Pengfei Zhang, Cuiling Lan, Junliang Xing, Wenjun Zeng, Jianru Xue, Nanning Zheng

2017-03-24 · ICCV 2017
Tasks: Skeleton Based Action Recognition, Action Recognition, Temporal Action Localization
Links: Paper · PDF · Code (official)

Abstract

Skeleton-based human action recognition has recently attracted increasing attention due to the popularity of 3D skeleton data. One main challenge lies in the large view variations in captured human actions. We propose a novel view adaptation scheme to automatically regulate observation viewpoints during the occurrence of an action. Rather than re-positioning the skeletons based on a human defined prior criterion, we design a view adaptive recurrent neural network (RNN) with LSTM architecture, which enables the network itself to adapt to the most suitable observation viewpoints from end to end. Extensive experiment analyses show that the proposed view adaptive RNN model strives to (1) transform the skeletons of various views to much more consistent viewpoints and (2) maintain the continuity of the action rather than transforming every frame to the same position with the same body orientation. Our model achieves significant improvement over the state-of-the-art approaches on three benchmark datasets.
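The core geometric operation the abstract describes is re-observing each skeleton frame from a learned viewpoint: a rotation and a translation, regressed per frame by a subnetwork, are applied to the joint coordinates before classification. A minimal numpy sketch of just that re-observation step, assuming Euler-angle rotation parameters (the function names and the plain-input angles are illustrative, not the authors' code; in the paper these parameters come from an LSTM branch trained end to end):

```python
import numpy as np

def rotation_matrix(alpha, beta, gamma):
    """Compose rotations about the x, y, and z axes (standard Euler angles)."""
    ca, sa = np.cos(alpha), np.sin(alpha)
    cb, sb = np.cos(beta), np.sin(beta)
    cg, sg = np.cos(gamma), np.sin(gamma)
    Rx = np.array([[1, 0, 0], [0, ca, -sa], [0, sa, ca]])
    Ry = np.array([[cb, 0, sb], [0, 1, 0], [-sb, 0, cb]])
    Rz = np.array([[cg, -sg, 0], [sg, cg, 0], [0, 0, 1]])
    return Rz @ Ry @ Rx

def view_adapt(skeleton, angles, translation):
    """Re-observe one skeleton frame from a new viewpoint: V' = R @ (V - d).

    skeleton: (J, 3) array of joint coordinates for J joints.
    angles: (alpha, beta, gamma) rotation parameters; per-frame in the paper,
        plain inputs here.
    translation: (3,) offset d of the new observation origin.
    """
    R = rotation_matrix(*angles)
    # Shift the origin, then rotate each joint into the adapted view.
    return (skeleton - translation) @ R.T
```

Because the transform is a rigid motion, pairwise joint distances are preserved, so the adapted skeleton describes the same pose seen from a different viewpoint, which is what lets the network normalize view variation without distorting the action.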

Results

Task | Dataset | Metric | Value | Model
Video | NTU RGB+D | Accuracy (CS) | 79.2 | VA-LSTM
Video | NTU RGB+D | Accuracy (CV) | 87.6 | VA-LSTM
Temporal Action Localization | NTU RGB+D | Accuracy (CS) | 79.2 | VA-LSTM
Temporal Action Localization | NTU RGB+D | Accuracy (CV) | 87.6 | VA-LSTM
Zero-Shot Learning | NTU RGB+D | Accuracy (CS) | 79.2 | VA-LSTM
Zero-Shot Learning | NTU RGB+D | Accuracy (CV) | 87.6 | VA-LSTM
Activity Recognition | NTU RGB+D | Accuracy (CS) | 79.2 | VA-LSTM
Activity Recognition | NTU RGB+D | Accuracy (CV) | 87.6 | VA-LSTM
Action Localization | NTU RGB+D | Accuracy (CS) | 79.2 | VA-LSTM
Action Localization | NTU RGB+D | Accuracy (CV) | 87.6 | VA-LSTM
Action Detection | NTU RGB+D | Accuracy (CS) | 79.2 | VA-LSTM
Action Detection | NTU RGB+D | Accuracy (CV) | 87.6 | VA-LSTM
3D Action Recognition | NTU RGB+D | Accuracy (CS) | 79.2 | VA-LSTM
3D Action Recognition | NTU RGB+D | Accuracy (CV) | 87.6 | VA-LSTM
Action Recognition | NTU RGB+D | Accuracy (CS) | 79.2 | VA-LSTM
Action Recognition | NTU RGB+D | Accuracy (CV) | 87.6 | VA-LSTM

Related Papers

A Real-Time System for Egocentric Hand-Object Interaction Detection in Industrial Domains (2025-07-17)
DVFL-Net: A Lightweight Distilled Video Focal Modulation Network for Spatio-Temporal Action Recognition (2025-07-16)
Zero-shot Skeleton-based Action Recognition with Prototype-guided Feature Alignment (2025-07-01)
EgoAdapt: Adaptive Multisensory Distillation and Policy Learning for Efficient Egocentric Perception (2025-06-26)
Feature Hallucination for Self-supervised Action Recognition (2025-06-25)
CARMA: Context-Aware Situational Grounding of Human-Robot Group Interactions by Combining Vision-Language Models with Object and Action Recognition (2025-06-25)
Including Semantic Information via Word Embeddings for Skeleton-based Action Recognition (2025-06-23)
Adapting Vision-Language Models for Evaluating World Models (2025-06-22)