
Data sourced from the PWC Archive (CC-BY-SA 4.0). Built by the community, for the community.

Structural-RNN: Deep Learning on Spatio-Temporal Graphs

Ashesh Jain, Amir R. Zamir, Silvio Savarese, Ashutosh Saxena

2015-11-17 | CVPR 2016 6 | Human Pose Forecasting | Skeleton Based Action Recognition | Deep Learning

Abstract

Deep Recurrent Neural Network architectures, though remarkably capable at modeling sequences, lack an intuitive high-level spatio-temporal structure. Yet many problems in computer vision inherently have an underlying high-level structure and can benefit from it. Spatio-temporal graphs are a popular tool for imposing such high-level intuitions in the formulation of real-world problems. In this paper, we propose an approach for combining the power of high-level spatio-temporal graphs and the sequence learning success of Recurrent Neural Networks (RNNs). We develop a scalable method for casting an arbitrary spatio-temporal graph as a rich RNN mixture that is feedforward, fully differentiable, and jointly trainable. The proposed method is generic and principled, as it can transform any spatio-temporal graph by applying a fixed set of well-defined steps. The evaluations of the proposed approach on a diverse set of problems, ranging from modeling human motion to object interactions, show improvement over the state of the art by a large margin. We expect this method to empower new approaches to problem formulation through high-level spatio-temporal graphs and Recurrent Neural Networks.
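
The abstract does not spell out the "well-defined steps", so the following is only an illustrative sketch of what casting a spatio-temporal graph as an RNN mixture might look like at the bookkeeping level. It assumes one shared recurrent unit per semantic node type and one per type of edge, with each node unit reading from the edge units that touch it. The function name `factorize_st_graph`, the body-part labels, and the unit naming scheme are all hypothetical and are not taken from the paper or its code.

```python
def factorize_st_graph(node_types, spatial_edges):
    """Group a spatio-temporal graph into shared recurrent units (sketch).

    node_types: dict mapping node name -> semantic type (hypothetical labels).
    spatial_edges: iterable of (node_a, node_b) pairs within one time step.
    Returns (node_units, edge_units, wiring), where wiring maps each node
    unit to the edge units whose outputs it would consume.
    """
    # Assumption: nodes with the same semantic type share one node unit.
    node_units = sorted(set(node_types.values()))
    wiring = {t: set() for t in node_units}

    # Assumption: edges joining the same pair of types share one edge unit.
    for a, b in spatial_edges:
        ta, tb = sorted((node_types[a], node_types[b]))
        unit = f"{ta}-{tb}:spatial"
        wiring[ta].add(unit)
        wiring[tb].add(unit)

    # Each node is also linked to itself across consecutive time steps.
    for t in node_units:
        wiring[t].add(f"{t}-{t}:temporal")

    edge_units = sorted(set().union(*wiring.values()))
    return node_units, edge_units, {t: sorted(u) for t, u in wiring.items()}


# Hypothetical skeleton graph: five body parts, each limb attached to the spine.
node_types = {
    "left_arm": "arm", "right_arm": "arm",
    "left_leg": "leg", "right_leg": "leg",
    "spine": "spine",
}
spatial_edges = [
    ("left_arm", "spine"), ("right_arm", "spine"),
    ("left_leg", "spine"), ("right_leg", "spine"),
]
node_units, edge_units, wiring = factorize_st_graph(node_types, spatial_edges)
print(node_units)
print(edge_units)
print(wiring["spine"])
```

In this toy graph, five nodes and four spatial edges collapse into three node units and five edge units, which is one way a mixture could stay small as the graph grows. An actual model would attach a recurrent network to each unit and train them jointly; that part is omitted here.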

Results

Task                  Dataset    Metric                 Value  Model
Pose Estimation       Human3.6M  MAR, walking, 1,000ms  2.13   SRNN
Pose Estimation       Human3.6M  MAR, walking, 400ms    1.3    SRNN
3D                    Human3.6M  MAR, walking, 1,000ms  2.13   SRNN
3D                    Human3.6M  MAR, walking, 400ms    1.3    SRNN
1 Image, 2*2 Stitchi  Human3.6M  MAR, walking, 1,000ms  2.13   SRNN
1 Image, 2*2 Stitchi  Human3.6M  MAR, walking, 400ms    1.3    SRNN

Related Papers

Automatic Classification and Segmentation of Tunnel Cracks Based on Deep Learning and Visual Explanations (2025-07-18)
A Survey of Deep Learning for Geometry Problem Solving (2025-07-16)
Uncertainty Quantification for Motor Imagery BCI -- Machine Learning vs. Deep Learning (2025-07-10)
Chat-Ghosting: A Comparative Study of Methods for Auto-Completion in Dialog Systems (2025-07-08)
Deep Learning Optimization of Two-State Pinching Antennas Systems (2025-07-08)
AXLearn: Modular Large Model Training on Heterogeneous Infrastructure (2025-07-07)
Determination Of Structural Cracks Using Deep Learning Frameworks (2025-07-03)
Generalized Adaptive Transfer Network: Enhancing Transfer Learning in Reinforcement Learning Across Domains (2025-07-02)