Papers With Code 2

A community resource for machine learning research: papers, code, benchmarks, and state-of-the-art results.


Data sourced from the PWC Archive (CC-BY-SA 4.0). Built by the community, for the community.

Fusing Higher-order Features in Graph Neural Networks for Skeleton-based Action Recognition

Zhenyue Qin, Yang Liu, Pan Ji, Dongwoo Kim, Lei Wang, Bob McKay, Saeed Anwar, Tom Gedeon

Date: 2021-05-04
Tasks: Skeleton Based Action Recognition, Action Recognition

Abstract

Skeleton sequences are lightweight and compact, and thus are ideal candidates for action recognition on edge devices. Recent skeleton-based action recognition methods extract features from 3D joint coordinates as spatial-temporal cues, using these representations in a graph neural network for feature fusion to boost recognition performance. The use of first- and second-order features, i.e., joint and bone representations, has led to high accuracy. Nonetheless, many models are still confused by actions that have similar motion trajectories. To address these issues, we propose fusing higher-order features in the form of angular encoding into modern architectures to robustly capture the relationships between joints and body parts. This simple fusion with popular spatial-temporal graph neural networks achieves new state-of-the-art accuracy in two large benchmarks, including NTU60 and NTU120, while employing fewer parameters and reduced run time. Our source code is publicly available at: https://github.com/ZhenyueQin/Angular-Skeleton-Encoding.
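To make the feature orders named in the abstract concrete, here is a minimal NumPy sketch. Joints are the raw 3D coordinates (first order), bones are differences between connected joints (second order), and an angular feature is taken here as the cosine of the angle formed at one joint by two others. The four-joint chain, the `PARENTS` list, the function names, and the choice of joint triples are all hypothetical illustrations; the paper's actual angular encodings are defined in the paper and its repository, not on this page.

```python
import numpy as np

# Hypothetical 4-joint chain (shoulder -> elbow -> wrist -> hand).
# This is an illustrative layout, not the NTU skeleton definition.
PARENTS = [0, 0, 1, 2]  # PARENTS[j] is the joint that joint j attaches to


def bone_features(joints, parents=PARENTS):
    """Second-order features: vector from each joint's parent to the joint.

    joints has shape (..., num_joints, 3). The root is its own parent,
    so its bone vector is zero.
    """
    return joints - joints[..., parents, :]


def angle_cosines(joints, triples, eps=1e-8):
    """Angular features (assumed form): for each (center, a, b) triple,
    the cosine of the angle at `center` between the vectors to `a` and `b`.
    """
    center, a, b = (np.asarray(idx) for idx in zip(*triples))
    u = joints[..., a, :] - joints[..., center, :]
    v = joints[..., b, :] - joints[..., center, :]
    dot = np.sum(u * v, axis=-1)
    norm = np.linalg.norm(u, axis=-1) * np.linalg.norm(v, axis=-1)
    return dot / (norm + eps)  # eps guards against zero-length vectors


# One frame: the arm bends 90 degrees at the elbow and is straight at the wrist.
frame = np.array([[[0.0, 0.0, 0.0],
                   [1.0, 0.0, 0.0],
                   [1.0, 1.0, 0.0],
                   [1.0, 2.0, 0.0]]])
bones = bone_features(frame)
angles = angle_cosines(frame, [(1, 0, 2), (2, 1, 3)])
```

Unlike joint and bone vectors, an angle cosine does not change when the whole skeleton is translated or uniformly scaled, which is one plausible reason such features help separate actions with similar trajectories.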

Results

Task                         | Dataset       | Metric               | Value | Model
Video                        | NTU RGB+D 120 | Ensembled Modalities | 4     | AngNet-JA + BA + JBA + VJBA
Video                        | NTU RGB+D     | Accuracy (CS)        | 91.7  | AngNet-JA + BA + JBA + VJBA
Video                        | NTU RGB+D     | Accuracy (CV)        | 96.4  | AngNet-JA + BA + JBA + VJBA
Temporal Action Localization | NTU RGB+D 120 | Ensembled Modalities | 4     | AngNet-JA + BA + JBA + VJBA
Temporal Action Localization | NTU RGB+D     | Accuracy (CS)        | 91.7  | AngNet-JA + BA + JBA + VJBA
Temporal Action Localization | NTU RGB+D     | Accuracy (CV)        | 96.4  | AngNet-JA + BA + JBA + VJBA
Zero-Shot Learning           | NTU RGB+D 120 | Ensembled Modalities | 4     | AngNet-JA + BA + JBA + VJBA
Zero-Shot Learning           | NTU RGB+D     | Accuracy (CS)        | 91.7  | AngNet-JA + BA + JBA + VJBA
Zero-Shot Learning           | NTU RGB+D     | Accuracy (CV)        | 96.4  | AngNet-JA + BA + JBA + VJBA
Activity Recognition         | NTU RGB+D 120 | Ensembled Modalities | 4     | AngNet-JA + BA + JBA + VJBA
Activity Recognition         | NTU RGB+D     | Accuracy (CS)        | 91.7  | AngNet-JA + BA + JBA + VJBA
Activity Recognition         | NTU RGB+D     | Accuracy (CV)        | 96.4  | AngNet-JA + BA + JBA + VJBA
Action Localization          | NTU RGB+D 120 | Ensembled Modalities | 4     | AngNet-JA + BA + JBA + VJBA
Action Localization          | NTU RGB+D     | Accuracy (CS)        | 91.7  | AngNet-JA + BA + JBA + VJBA
Action Localization          | NTU RGB+D     | Accuracy (CV)        | 96.4  | AngNet-JA + BA + JBA + VJBA
Action Detection             | NTU RGB+D 120 | Ensembled Modalities | 4     | AngNet-JA + BA + JBA + VJBA
Action Detection             | NTU RGB+D     | Accuracy (CS)        | 91.7  | AngNet-JA + BA + JBA + VJBA
Action Detection             | NTU RGB+D     | Accuracy (CV)        | 96.4  | AngNet-JA + BA + JBA + VJBA
3D Action Recognition        | NTU RGB+D 120 | Ensembled Modalities | 4     | AngNet-JA + BA + JBA + VJBA
3D Action Recognition        | NTU RGB+D     | Accuracy (CS)        | 91.7  | AngNet-JA + BA + JBA + VJBA
3D Action Recognition        | NTU RGB+D     | Accuracy (CV)        | 96.4  | AngNet-JA + BA + JBA + VJBA
Action Recognition           | NTU RGB+D 120 | Ensembled Modalities | 4     | AngNet-JA + BA + JBA + VJBA
Action Recognition           | NTU RGB+D     | Accuracy (CS)        | 91.7  | AngNet-JA + BA + JBA + VJBA
Action Recognition           | NTU RGB+D     | Accuracy (CV)        | 96.4  | AngNet-JA + BA + JBA + VJBA
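
The "Ensembled Modalities" value of 4 presumably counts the four streams in the model name (JA, BA, JBA, VJBA). This page does not say how the streams are combined. As an illustration only, the sketch below assumes score-level fusion, where each stream's softmax scores are added (optionally weighted) before taking the arg max. The function names, the weights, and the toy logits are hypothetical and are not taken from the paper or its code.

```python
import numpy as np


def softmax(logits):
    """Numerically stable softmax over the last axis."""
    shifted = logits - logits.max(axis=-1, keepdims=True)
    exp = np.exp(shifted)
    return exp / exp.sum(axis=-1, keepdims=True)


def fuse_streams(stream_logits, weights=None):
    """Assumed score-level fusion: weighted sum of per-stream softmax scores.

    stream_logits maps a stream name to an array of shape
    (num_samples, num_classes). Returns the predicted class per sample.
    """
    names = sorted(stream_logits)
    if weights is None:
        weights = {name: 1.0 for name in names}  # equal weighting by default
    fused = sum(weights[name] * softmax(stream_logits[name]) for name in names)
    return fused.argmax(axis=-1)


# Toy logits for two samples and two classes from two hypothetical streams.
toy_logits = {
    "JA": np.array([[2.0, 0.0], [0.0, 1.0]]),
    "BA": np.array([[0.0, 1.0], [0.0, 3.0]]),
}
predictions = fuse_streams(toy_logits)
```

In the first toy sample the two streams disagree, and the fused prediction follows the stream with the larger score margin.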

Related Papers

A Real-Time System for Egocentric Hand-Object Interaction Detection in Industrial Domains (2025-07-17)
Zero-shot Skeleton-based Action Recognition with Prototype-guided Feature Alignment (2025-07-01)
EgoAdapt: Adaptive Multisensory Distillation and Policy Learning for Efficient Egocentric Perception (2025-06-26)
Feature Hallucination for Self-supervised Action Recognition (2025-06-25)
CARMA: Context-Aware Situational Grounding of Human-Robot Group Interactions by Combining Vision-Language Models with Object and Action Recognition (2025-06-25)
Including Semantic Information via Word Embeddings for Skeleton-based Action Recognition (2025-06-23)
Adapting Vision-Language Models for Evaluating World Models (2025-06-22)
Active Multimodal Distillation for Few-shot Action Recognition (2025-06-16)