Papers With Code 2

A community resource for machine learning research: papers, code, benchmarks, and state-of-the-art results.


Data sourced from the PWC Archive (CC-BY-SA 4.0). Built by the community, for the community.


Semantics-Guided Neural Networks for Efficient Skeleton-Based Human Action Recognition

Pengfei Zhang, Cuiling Lan, Wen-Jun Zeng, Junliang Xing, Jianru Xue, Nanning Zheng

Published: 2019-04-02 · CVPR 2020
Tasks: Skeleton Based Action Recognition · Action Recognition · Temporal Action Localization
Links: Paper · PDF · Code (official) · Code

Abstract

Skeleton-based human action recognition has attracted great interest thanks to the easy accessibility of human skeleton data. Recently, there has been a trend of using very deep feedforward neural networks to model the 3D coordinates of joints without considering the computational efficiency. In this paper, we propose a simple yet effective semantics-guided neural network (SGN) for skeleton-based action recognition. We explicitly introduce the high-level semantics of joints (joint type and frame index) into the network to enhance the feature representation capability. In addition, we exploit the relationship of joints hierarchically through two modules, i.e., a joint-level module for modeling the correlations of joints in the same frame and a frame-level module for modeling the dependencies of frames by taking the joints in the same frame as a whole. A strong baseline is proposed to facilitate the study of this field. With an order of magnitude smaller model size than most previous works, SGN achieves state-of-the-art performance on the NTU60, NTU120, and SYSU datasets. The source code is available at https://github.com/microsoft/SGN.
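The abstract's two key ideas, injecting joint-type and frame-index semantics into each joint's features and then modeling joints within a frame before modeling frames as wholes, can be illustrated with a minimal numpy sketch. This is a hypothetical toy illustration of the general idea, not the official SGN implementation (which uses learned embeddings and graph-convolution layers); all names and sizes here are assumptions.

```python
import numpy as np

# Toy sizes: T frames, J joints per frame (illustrative only, not from the paper).
T, J = 4, 5
coords = np.random.rand(T, J, 3)   # 3D joint coordinates, shape (T, J, 3)

# Semantics: one-hot joint-type and frame-index codes (the paper learns
# embeddings for these; one-hot vectors stand in for them here).
joint_type = np.eye(J)             # (J, J)
frame_index = np.eye(T)            # (T, T)

# Broadcast the semantics to every (frame, joint) position and concatenate
# them onto the coordinates, as the abstract describes.
jt = np.broadcast_to(joint_type, (T, J, J))
fi = np.broadcast_to(frame_index[:, None, :], (T, J, T))
feats = np.concatenate([coords, jt, fi], axis=-1)   # (T, J, 3 + J + T)

def joint_level(x):
    # Joint-level module: softmax affinity between joints of the same frame,
    # used to mix joint features (a stand-in for the paper's GCN layers).
    aff = x @ x.transpose(0, 2, 1)                  # (T, J, J) similarities
    aff = np.exp(aff - aff.max(-1, keepdims=True))
    aff /= aff.sum(-1, keepdims=True)
    return aff @ x

# Frame-level module: treat each frame's joints as a whole by pooling over
# joints, then pool over time for a clip-level feature.
frame_feats = joint_level(feats).max(axis=1)        # (T, 3 + J + T)
clip_feat = frame_feats.max(axis=0)                 # (3 + J + T,) = (12,)
print(clip_feat.shape)
```

The point of the sketch is the feature layout: each joint carries its coordinates plus explicit "which joint" and "which frame" codes, which is what lets a compact network exploit skeleton structure without great depth.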

Results

Task                          | Dataset   | Metric        | Value | Model
Video                         | NTU RGB+D | Accuracy (CS) | 89    | SGN
Video                         | NTU RGB+D | Accuracy (CV) | 94.5  | SGN
Temporal Action Localization  | NTU RGB+D | Accuracy (CS) | 89    | SGN
Temporal Action Localization  | NTU RGB+D | Accuracy (CV) | 94.5  | SGN
Zero-Shot Learning            | NTU RGB+D | Accuracy (CS) | 89    | SGN
Zero-Shot Learning            | NTU RGB+D | Accuracy (CV) | 94.5  | SGN
Activity Recognition          | NTU RGB+D | Accuracy (CS) | 89    | SGN
Activity Recognition          | NTU RGB+D | Accuracy (CV) | 94.5  | SGN
Action Localization           | NTU RGB+D | Accuracy (CS) | 89    | SGN
Action Localization           | NTU RGB+D | Accuracy (CV) | 94.5  | SGN
Action Detection              | NTU RGB+D | Accuracy (CS) | 89    | SGN
Action Detection              | NTU RGB+D | Accuracy (CV) | 94.5  | SGN
3D Action Recognition         | NTU RGB+D | Accuracy (CS) | 89    | SGN
3D Action Recognition         | NTU RGB+D | Accuracy (CV) | 94.5  | SGN
Action Recognition            | NTU RGB+D | Accuracy (CS) | 89    | SGN
Action Recognition            | NTU RGB+D | Accuracy (CV) | 94.5  | SGN

Related Papers

A Real-Time System for Egocentric Hand-Object Interaction Detection in Industrial Domains (2025-07-17)
DVFL-Net: A Lightweight Distilled Video Focal Modulation Network for Spatio-Temporal Action Recognition (2025-07-16)
Zero-shot Skeleton-based Action Recognition with Prototype-guided Feature Alignment (2025-07-01)
EgoAdapt: Adaptive Multisensory Distillation and Policy Learning for Efficient Egocentric Perception (2025-06-26)
Feature Hallucination for Self-supervised Action Recognition (2025-06-25)
CARMA: Context-Aware Situational Grounding of Human-Robot Group Interactions by Combining Vision-Language Models with Object and Action Recognition (2025-06-25)
Including Semantic Information via Word Embeddings for Skeleton-based Action Recognition (2025-06-23)
Adapting Vision-Language Models for Evaluating World Models (2025-06-22)