Papers With Code 2

A community resource for machine learning research: papers, code, benchmarks, and state-of-the-art results.


Data sourced from the PWC Archive (CC-BY-SA 4.0). Built by the community, for the community.


Symbiotic Graph Neural Networks for 3D Skeleton-based Human Action Recognition and Motion Prediction

Maosen Li, Siheng Chen, Xu Chen, Ya Zhang, Yan-Feng Wang, Qi Tian

Published: 2019-10-05
Tasks: Skeleton Based Action Recognition · Motion Prediction · Action Recognition · Temporal Action Localization
Links: Paper · PDF

Abstract

3D skeleton-based action recognition and motion prediction are two essential problems of human activity understanding. Many previous works 1) studied the two tasks separately, neglecting their internal correlations, and 2) did not capture sufficient relations inside the body. To address these issues, we propose a symbiotic model that handles the two tasks jointly, and we propose two scales of graphs to explicitly capture relations among body-joints and body-parts. Together, we propose symbiotic graph neural networks, which contain a backbone, an action-recognition head, and a motion-prediction head. The two heads are trained jointly and enhance each other. For the backbone, we propose multi-branch multi-scale graph convolution networks to extract spatial and temporal features. The multi-scale graph convolution networks are based on joint-scale and part-scale graphs. The joint-scale graphs contain actional graphs, capturing action-based relations, and structural graphs, capturing physical constraints. The part-scale graphs integrate body-joints into specific parts, representing high-level relations. Moreover, dual bone-based graphs and networks are proposed to learn complementary features. We conduct extensive experiments on skeleton-based action recognition and motion prediction with four datasets: NTU-RGB+D, Kinetics, Human3.6M, and CMU Mocap. Experiments show that our symbiotic graph neural networks achieve better performance on both tasks than state-of-the-art methods.
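To make the architecture concrete, here is a minimal numpy sketch of the core idea the abstract describes: a joint-scale graph convolution over a skeleton's structural (bone) graph serving as a shared backbone, feeding two task heads (action recognition and motion prediction) that would be trained with a joint objective. All names, sizes, and weight shapes here are illustrative assumptions, not the authors' implementation; the paper's actual model uses multi-branch multi-scale graph convolutions with learned actional graphs and part-scale graphs.

```python
import numpy as np

rng = np.random.default_rng(0)

# Toy skeleton: 5 joints connected in a chain (hypothetical; NTU RGB+D has 25 joints).
num_joints, feat_in, feat_hid = 5, 3, 8
A = np.zeros((num_joints, num_joints))
for i in range(num_joints - 1):
    A[i, i + 1] = A[i + 1, i] = 1.0          # structural (bone) edges

# Symmetrically normalized adjacency with self-loops: D^{-1/2} (A + I) D^{-1/2}
A_hat = A + np.eye(num_joints)
d = A_hat.sum(axis=1)
A_norm = A_hat / np.sqrt(np.outer(d, d))

def graph_conv(X, A_norm, W):
    """One joint-scale graph convolution: aggregate neighbor features, then project."""
    return np.maximum(A_norm @ X @ W, 0.0)   # ReLU nonlinearity

# Shared backbone and two task heads (randomly initialized, illustrative only).
W_backbone = rng.standard_normal((feat_in, feat_hid)) * 0.1
W_recog = rng.standard_normal((feat_hid, 4)) * 0.1       # 4 hypothetical action classes
W_pred = rng.standard_normal((feat_hid, feat_in)) * 0.1  # next-frame joint features

X = rng.standard_normal((num_joints, feat_in))           # one frame of joint features
H = graph_conv(X, A_norm, W_backbone)                    # shared representation

logits = H.mean(axis=0) @ W_recog                        # recognition head: pool over joints
X_next = X + graph_conv(H, A_norm, W_pred)               # prediction head: residual motion

# Symbiotic training would optimize a joint objective so both heads
# back-propagate into the shared backbone, e.g.:
#   loss = cross_entropy(logits, action_label) + lam * mse(X_next, X_target)
```

The "symbiotic" aspect is in the shared backbone: gradients from both heads shape the same graph-convolutional features, which is how the two tasks can enhance each other during joint training.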

Results

Task | Dataset | Metric | Value | Model
--- | --- | --- | --- | ---
Video | NTU RGB+D | Accuracy (CS) | 90.1 | Sym-GNN
Video | NTU RGB+D | Accuracy (CV) | 96.4 | Sym-GNN
Temporal Action Localization | NTU RGB+D | Accuracy (CS) | 90.1 | Sym-GNN
Temporal Action Localization | NTU RGB+D | Accuracy (CV) | 96.4 | Sym-GNN
Zero-Shot Learning | NTU RGB+D | Accuracy (CS) | 90.1 | Sym-GNN
Zero-Shot Learning | NTU RGB+D | Accuracy (CV) | 96.4 | Sym-GNN
Activity Recognition | NTU RGB+D | Accuracy (CS) | 90.1 | Sym-GNN
Activity Recognition | NTU RGB+D | Accuracy (CV) | 96.4 | Sym-GNN
Action Localization | NTU RGB+D | Accuracy (CS) | 90.1 | Sym-GNN
Action Localization | NTU RGB+D | Accuracy (CV) | 96.4 | Sym-GNN
Action Detection | NTU RGB+D | Accuracy (CS) | 90.1 | Sym-GNN
Action Detection | NTU RGB+D | Accuracy (CV) | 96.4 | Sym-GNN
3D Action Recognition | NTU RGB+D | Accuracy (CS) | 90.1 | Sym-GNN
3D Action Recognition | NTU RGB+D | Accuracy (CV) | 96.4 | Sym-GNN
Action Recognition | NTU RGB+D | Accuracy (CS) | 90.1 | Sym-GNN
Action Recognition | NTU RGB+D | Accuracy (CV) | 96.4 | Sym-GNN

Related Papers

A Real-Time System for Egocentric Hand-Object Interaction Detection in Industrial Domains (2025-07-17)
DVFL-Net: A Lightweight Distilled Video Focal Modulation Network for Spatio-Temporal Action Recognition (2025-07-16)
Stochastic Human Motion Prediction with Memory of Action Transition and Action Characteristic (2025-07-05)
Temporal Continual Learning with Prior Compensation for Human Motion Prediction (2025-07-05)
Zero-shot Skeleton-based Action Recognition with Prototype-guided Feature Alignment (2025-07-01)
EgoAdapt: Adaptive Multisensory Distillation and Policy Learning for Efficient Egocentric Perception (2025-06-26)
Feature Hallucination for Self-supervised Action Recognition (2025-06-25)
CARMA: Context-Aware Situational Grounding of Human-Robot Group Interactions by Combining Vision-Language Models with Object and Action Recognition (2025-06-25)