Data sourced from the PWC Archive (CC-BY-SA 4.0). Built by the community, for the community.

Actional-Structural Graph Convolutional Networks for Skeleton-based Action Recognition

Maosen Li, Siheng Chen, Xu Chen, Ya Zhang, Yan-Feng Wang, Qi Tian

Published: 2019-04-26
Venue: CVPR 2019 (June)
Tasks: Skeleton Based Action Recognition, Pose Prediction, Action Recognition, Temporal Action Localization

Abstract

Action recognition with skeleton data has recently attracted much attention in computer vision. Previous studies are mostly based on fixed skeleton graphs, which capture only local physical dependencies among joints and may miss implicit joint correlations. To capture richer dependencies, we introduce an encoder-decoder structure, called the A-link inference module, to capture action-specific latent dependencies, i.e., actional links, directly from actions. We also extend the existing skeleton graphs to represent higher-order dependencies, i.e., structural links. Combining the two types of links into a generalized skeleton graph, we further propose the actional-structural graph convolution network (AS-GCN), which stacks actional-structural graph convolution and temporal convolution as a basic building block, to learn both spatial and temporal features for action recognition. A future pose prediction head is added in parallel to the recognition head to help capture more detailed action patterns through self-supervision. We validate AS-GCN in action recognition using two skeleton data sets, NTU-RGB+D and Kinetics. The proposed AS-GCN achieves consistently large improvements compared to the state-of-the-art methods. As a side product, AS-GCN also shows promising results for future pose prediction.
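
To make the idea of "structural links" more concrete, here is a minimal sketch in Python. It assumes that higher-order dependencies can be modeled as successive powers of a row-normalized skeleton adjacency matrix; the abstract does not spell out the formulation, so treat this as one plausible reading rather than the authors' implementation. The function name `structural_links` and the 5-joint chain skeleton are hypothetical and chosen only for illustration.

```python
import numpy as np


def structural_links(adjacency: np.ndarray, max_order: int) -> list:
    """Return [A_hat^1, ..., A_hat^max_order] for a skeleton graph.

    A_hat is the row-normalized adjacency matrix. Using its powers as
    higher-order links is an assumption made for this sketch.
    """
    degree = adjacency.sum(axis=1, keepdims=True)
    # Guard against division by zero for isolated joints.
    safe_degree = np.where(degree > 0, degree, 1.0)
    a_hat = adjacency / safe_degree

    powers = []
    current = np.eye(adjacency.shape[0])
    for _ in range(max_order):
        current = current @ a_hat  # one more hop along the skeleton
        powers.append(current)
    return powers


# Hypothetical toy skeleton: 5 joints connected in a chain 0-1-2-3-4.
num_joints = 5
A = np.zeros((num_joints, num_joints))
for i in range(num_joints - 1):
    A[i, i + 1] = A[i + 1, i] = 1.0

links = structural_links(A, max_order=3)
```

In this toy example, the first-order matrix connects only physically adjacent joints, while the second-order matrix assigns nonzero weight to joints two hops apart (for example, joints 0 and 2), which is the kind of longer-range dependency a fixed skeleton graph would miss.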

Results

| Task | Dataset | Metric | Value (%) | Model |
|---|---|---|---|---|
| Video | Kinetics-Skeleton dataset | Accuracy | 34.8 | AS-GCN |
| Video | NTU RGB+D | Accuracy (CS) | 86.8 | AS-GCN |
| Video | NTU RGB+D | Accuracy (CV) | 94.2 | AS-GCN |
| Temporal Action Localization | Kinetics-Skeleton dataset | Accuracy | 34.8 | AS-GCN |
| Temporal Action Localization | NTU RGB+D | Accuracy (CS) | 86.8 | AS-GCN |
| Temporal Action Localization | NTU RGB+D | Accuracy (CV) | 94.2 | AS-GCN |
| Zero-Shot Learning | Kinetics-Skeleton dataset | Accuracy | 34.8 | AS-GCN |
| Zero-Shot Learning | NTU RGB+D | Accuracy (CS) | 86.8 | AS-GCN |
| Zero-Shot Learning | NTU RGB+D | Accuracy (CV) | 94.2 | AS-GCN |
| Activity Recognition | Kinetics-Skeleton dataset | Accuracy | 34.8 | AS-GCN |
| Activity Recognition | NTU RGB+D | Accuracy (CS) | 86.8 | AS-GCN |
| Activity Recognition | NTU RGB+D | Accuracy (CV) | 94.2 | AS-GCN |
| Action Localization | Kinetics-Skeleton dataset | Accuracy | 34.8 | AS-GCN |
| Action Localization | NTU RGB+D | Accuracy (CS) | 86.8 | AS-GCN |
| Action Localization | NTU RGB+D | Accuracy (CV) | 94.2 | AS-GCN |
| Action Detection | Kinetics-Skeleton dataset | Accuracy | 34.8 | AS-GCN |
| Action Detection | NTU RGB+D | Accuracy (CS) | 86.8 | AS-GCN |
| Action Detection | NTU RGB+D | Accuracy (CV) | 94.2 | AS-GCN |
| 3D Action Recognition | Kinetics-Skeleton dataset | Accuracy | 34.8 | AS-GCN |
| 3D Action Recognition | NTU RGB+D | Accuracy (CS) | 86.8 | AS-GCN |
| 3D Action Recognition | NTU RGB+D | Accuracy (CV) | 94.2 | AS-GCN |
| Action Recognition | Kinetics-Skeleton dataset | Accuracy | 34.8 | AS-GCN |
| Action Recognition | NTU RGB+D | Accuracy (CS) | 86.8 | AS-GCN |
| Action Recognition | NTU RGB+D | Accuracy (CV) | 94.2 | AS-GCN |

Related Papers

A Real-Time System for Egocentric Hand-Object Interaction Detection in Industrial Domains (2025-07-17)
DVFL-Net: A Lightweight Distilled Video Focal Modulation Network for Spatio-Temporal Action Recognition (2025-07-16)
Zero-shot Skeleton-based Action Recognition with Prototype-guided Feature Alignment (2025-07-01)
EgoAdapt: Adaptive Multisensory Distillation and Policy Learning for Efficient Egocentric Perception (2025-06-26)
Feature Hallucination for Self-supervised Action Recognition (2025-06-25)
CARMA: Context-Aware Situational Grounding of Human-Robot Group Interactions by Combining Vision-Language Models with Object and Action Recognition (2025-06-25)
Including Semantic Information via Word Embeddings for Skeleton-based Action Recognition (2025-06-23)
Adapting Vision-Language Models for Evaluating World Models (2025-06-22)