Papers With Code

A community resource for machine learning research: papers, code, benchmarks, and state-of-the-art results.


Data sourced from the PWC Archive (CC-BY-SA 4.0). Built by the community, for the community.


An Attention Enhanced Graph Convolutional LSTM Network for Skeleton-Based Action Recognition

Chenyang Si, Wentao Chen, Wei Wang, Liang Wang, Tieniu Tan

2019-02-25 · CVPR 2019
Tasks: Skeleton Based Action Recognition, Action Recognition, Temporal Action Localization

Abstract

Skeleton-based action recognition is an important task that requires an adequate understanding of the movement characteristics of a human action from the given skeleton sequence. Recent studies have shown that exploring spatial and temporal features of the skeleton sequence is vital for this task. Nevertheless, how to effectively extract discriminative spatial and temporal features is still a challenging problem. In this paper, we propose a novel Attention Enhanced Graph Convolutional LSTM Network (AGC-LSTM) for human action recognition from skeleton data. The proposed AGC-LSTM can not only capture discriminative features in spatial configuration and temporal dynamics but also explore the co-occurrence relationship between spatial and temporal domains. We also present a temporal hierarchical architecture to increase the temporal receptive field of the top AGC-LSTM layer, which boosts the ability to learn high-level semantic representations and significantly reduces the computation cost. Furthermore, to select discriminative spatial information, an attention mechanism is employed to enhance the information of key joints in each AGC-LSTM layer. Experimental results on two datasets are provided: the NTU RGB+D dataset and the Northwestern-UCLA dataset. The comparison results demonstrate the effectiveness of our approach and show that it outperforms the state-of-the-art methods on both datasets.
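The abstract describes two core ideas: replacing the dense transforms inside each LSTM gate with a graph convolution over the skeleton's joint graph, and applying a spatial attention mechanism that emphasizes key joints in each layer's hidden state. The sketch below illustrates one recurrent step of such a cell in plain numpy. It is an illustrative approximation, not the paper's implementation: the weight names (`Wi`, `Wf`, `Wo`, `Wg`, `wa`), the single-hop symmetric normalization, and the simple dot-product attention scoring are all assumptions made for the example.

```python
import numpy as np

def normalize_adjacency(A):
    # Symmetrically normalize adjacency with self-loops: D^-1/2 (A + I) D^-1/2,
    # the standard propagation rule for graph convolutions.
    A_hat = A + np.eye(A.shape[0])
    d_inv_sqrt = 1.0 / np.sqrt(A_hat.sum(axis=1))
    return A_hat * d_inv_sqrt[:, None] * d_inv_sqrt[None, :]

def sigmoid(x):
    return 1.0 / (1.0 + np.exp(-x))

def agc_lstm_step(x, h, c, A_norm, params):
    """One recurrent step over a single frame of skeleton data.

    x: (N, F_in)  per-joint input features for this frame
    h, c: (N, F_hid) per-joint hidden and cell states
    A_norm: (N, N) normalized joint adjacency
    params: dict of weight matrices (hypothetical names)
    """
    # Graph convolution replaces the dense layer inside every gate:
    # aggregate each joint's neighborhood before the linear transform.
    z = np.concatenate([x, h], axis=1)        # (N, F_in + F_hid)
    gc = A_norm @ z                           # neighborhood aggregation
    i = sigmoid(gc @ params["Wi"])            # input gate
    f = sigmoid(gc @ params["Wf"])            # forget gate
    o = sigmoid(gc @ params["Wo"])            # output gate
    g = np.tanh(gc @ params["Wg"])            # candidate cell state
    c_new = f * c + i * g
    h_new = o * np.tanh(c_new)

    # Spatial attention: score each joint, softmax over joints, and
    # re-weight the hidden state so key joints are enhanced.
    scores = (h_new @ params["wa"]).ravel()   # (N,)
    alpha = np.exp(scores - scores.max())
    alpha /= alpha.sum()
    h_att = h_new * (1.0 + alpha[:, None])    # residual-style enhancement
    return h_att, c_new

# Minimal usage: 5 joints, a toy chain skeleton, random weights.
rng = np.random.default_rng(0)
N, F_in, F_hid = 5, 3, 8
A = np.zeros((N, N))
for a, b in [(0, 1), (1, 2), (2, 3), (3, 4)]:  # chain of joints
    A[a, b] = A[b, a] = 1.0
A_norm = normalize_adjacency(A)
params = {k: rng.standard_normal((F_in + F_hid, F_hid)) * 0.1
          for k in ("Wi", "Wf", "Wo", "Wg")}
params["wa"] = rng.standard_normal((F_hid, 1)) * 0.1
h = np.zeros((N, F_hid))
c = np.zeros((N, F_hid))
for _ in range(4):  # iterate over a few frames
    x = rng.standard_normal((N, F_in))
    h, c = agc_lstm_step(x, h, c, A_norm, params)
```

The temporal hierarchy from the abstract would sit on top of this: stacking such layers while pooling over time between them, so the top layer sees a larger temporal receptive field at lower cost.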

Results

Task | Dataset | Metric | Value | Model
Video | NTU RGB+D | Accuracy (CS) | 89.2 | AGC-LSTM (Joint&Part)
Video | NTU RGB+D | Accuracy (CV) | 95 | AGC-LSTM (Joint&Part)
Temporal Action Localization | NTU RGB+D | Accuracy (CS) | 89.2 | AGC-LSTM (Joint&Part)
Temporal Action Localization | NTU RGB+D | Accuracy (CV) | 95 | AGC-LSTM (Joint&Part)
Zero-Shot Learning | NTU RGB+D | Accuracy (CS) | 89.2 | AGC-LSTM (Joint&Part)
Zero-Shot Learning | NTU RGB+D | Accuracy (CV) | 95 | AGC-LSTM (Joint&Part)
Activity Recognition | NTU RGB+D | Accuracy (CS) | 89.2 | AGC-LSTM (Joint&Part)
Activity Recognition | NTU RGB+D | Accuracy (CV) | 95 | AGC-LSTM (Joint&Part)
Action Localization | NTU RGB+D | Accuracy (CS) | 89.2 | AGC-LSTM (Joint&Part)
Action Localization | NTU RGB+D | Accuracy (CV) | 95 | AGC-LSTM (Joint&Part)
Action Detection | NTU RGB+D | Accuracy (CS) | 89.2 | AGC-LSTM (Joint&Part)
Action Detection | NTU RGB+D | Accuracy (CV) | 95 | AGC-LSTM (Joint&Part)
3D Action Recognition | NTU RGB+D | Accuracy (CS) | 89.2 | AGC-LSTM (Joint&Part)
3D Action Recognition | NTU RGB+D | Accuracy (CV) | 95 | AGC-LSTM (Joint&Part)
Action Recognition | NTU RGB+D | Accuracy (CS) | 89.2 | AGC-LSTM (Joint&Part)
Action Recognition | NTU RGB+D | Accuracy (CV) | 95 | AGC-LSTM (Joint&Part)

Related Papers

A Real-Time System for Egocentric Hand-Object Interaction Detection in Industrial Domains (2025-07-17)
DVFL-Net: A Lightweight Distilled Video Focal Modulation Network for Spatio-Temporal Action Recognition (2025-07-16)
Zero-shot Skeleton-based Action Recognition with Prototype-guided Feature Alignment (2025-07-01)
EgoAdapt: Adaptive Multisensory Distillation and Policy Learning for Efficient Egocentric Perception (2025-06-26)
Feature Hallucination for Self-supervised Action Recognition (2025-06-25)
CARMA: Context-Aware Situational Grounding of Human-Robot Group Interactions by Combining Vision-Language Models with Object and Action Recognition (2025-06-25)
Including Semantic Information via Word Embeddings for Skeleton-based Action Recognition (2025-06-23)
Adapting Vision-Language Models for Evaluating World Models (2025-06-22)