Structural-RNN: Deep Learning on Spatio-Temporal Graphs

Ashesh Jain, Amir R. Zamir, Silvio Savarese, Ashutosh Saxena

2015-11-17CVPR 2016 6Human Pose Forecasting Skeleton Based Action Recognition Deep Learning

Abstract

Deep Recurrent Neural Network architectures, though remarkably capable at modeling sequences, lack an intuitive high-level spatio-temporal structure. That is while many problems in computer vision inherently have an underlying high-level structure and can benefit from it. Spatio-temporal graphs are a popular tool for imposing such high-level intuitions in the formulation of real world problems. In this paper, we propose an approach for combining the power of high-level spatio-temporal graphs and sequence learning success of Recurrent Neural Networks~(RNNs). We develop a scalable method for casting an arbitrary spatio-temporal graph as a rich RNN mixture that is feedforward, fully differentiable, and jointly trainable. The proposed method is generic and principled as it can be used for transforming any spatio-temporal graph through employing a certain set of well defined steps. The evaluations of the proposed approach on a diverse set of problems, ranging from modeling human motion to object interactions, shows improvement over the state-of-the-art with a large margin. We expect this method to empower new approaches to problem formulation through high-level spatio-temporal graphs and Recurrent Neural Networks.

Results

Task	Dataset	Metric	Value	Model
Pose Estimation	Human3.6M	MAR, walking, 1,000ms	2.13	SRNN
Pose Estimation	Human3.6M	MAR, walking, 400ms	1.3	SRNN
3D	Human3.6M	MAR, walking, 1,000ms	2.13	SRNN
3D	Human3.6M	MAR, walking, 400ms	1.3	SRNN
1 Image, 2*2 Stitchi	Human3.6M	MAR, walking, 1,000ms	2.13	SRNN
1 Image, 2*2 Stitchi	Human3.6M	MAR, walking, 400ms	1.3	SRNN

Structural-RNN: Deep Learning on Spatio-Temporal Graphs

Abstract

Results

Related Papers

Structural-RNN: Deep Learning on Spatio-Temporal Graphs

Abstract

Results

Related Papers