Data sourced from the PWC Archive (CC-BY-SA 4.0). Built by the community, for the community.

SkeleMotion: A New Representation of Skeleton Joint Sequences Based on Motion Information for 3D Action Recognition

Carlos Caetano, Jessica Sena, François Brémond, Jefersson A. dos Santos, William Robson Schwartz

2019-07-30 | 3D Action Recognition, Skeleton Based Action Recognition, Action Recognition, Temporal Action Localization

Abstract

Due to the availability of large-scale skeleton datasets, 3D human action recognition has recently attracted the attention of the computer vision community. Many works have focused on encoding skeleton data as skeleton image representations based on the spatial structure of the skeleton joints, in which the temporal dynamics of the sequence are encoded as variations in columns and the spatial structure of each frame is represented as rows of a matrix. To further improve such representations, we introduce a novel skeleton image representation to be used as input to Convolutional Neural Networks (CNNs), named SkeleMotion. The proposed approach encodes the temporal dynamics by explicitly computing the magnitude and orientation values of the skeleton joints. Different temporal scales are employed to compute motion values, aggregating more temporal dynamics into the representation and making it able to capture long-range joint interactions involved in actions as well as to filter noisy motion values. Experimental results demonstrate the effectiveness of the proposed representation on 3D action recognition, outperforming the state of the art on the NTU RGB+D 120 dataset.
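
The abstract describes computing magnitude and orientation values of joint motion at different temporal scales. The following is a minimal sketch of that idea, not the authors' implementation: it assumes motion is the displacement of each joint between frame t and frame t + scale, and it assumes orientation is measured as three angles on the xy, yz, and zx planes. The function names `motion_features` and `multi_scale_features` and the input layout (a list of frames, each a list of (x, y, z) joints) are hypothetical.

```python
import math


def motion_features(frames, scale):
    """Return (magnitude, orientation) for joint displacements at one temporal scale.

    frames: list of T frames, each a list of J (x, y, z) joint coordinates.
    scale:  frame distance used to compute the displacement (assumed definition).
    Both outputs are indexed [t][j] and have T - scale rows.
    """
    if scale < 1 or scale >= len(frames):
        raise ValueError("scale must be between 1 and T - 1")

    magnitude, orientation = [], []
    for t in range(len(frames) - scale):
        mag_row, ori_row = [], []
        for (x0, y0, z0), (x1, y1, z1) in zip(frames[t], frames[t + scale]):
            dx, dy, dz = x1 - x0, y1 - y0, z1 - z0
            # Magnitude: Euclidean length of the displacement vector.
            mag_row.append(math.sqrt(dx * dx + dy * dy + dz * dz))
            # Orientation: one angle per coordinate plane (an assumption here).
            ori_row.append(
                (math.atan2(dy, dx), math.atan2(dz, dy), math.atan2(dx, dz))
            )
        magnitude.append(mag_row)
        orientation.append(ori_row)
    return magnitude, orientation


def multi_scale_features(frames, scales):
    """Compute the features above for several temporal scales, keyed by scale."""
    return {d: motion_features(frames, d) for d in scales}


# Toy sequence: joint 0 moves by (3, 4, 0) per frame, joint 1 stays still.
frames = [
    [(0.0, 0.0, 0.0), (1.0, 1.0, 1.0)],
    [(3.0, 4.0, 0.0), (1.0, 1.0, 1.0)],
    [(6.0, 8.0, 0.0), (1.0, 1.0, 1.0)],
]
features = multi_scale_features(frames, scales=[1, 2])
```

In this toy sequence a larger scale yields a larger displacement for the moving joint, while the stationary joint keeps zero magnitude at every scale. Arranging the resulting matrices as CNN input images (for example, joints as rows and time as columns) is a separate layout step that this sketch leaves out.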

Results

| Task | Dataset | Metric | Value | Model |
|---|---|---|---|---|
| Video | NTU RGB+D | Accuracy (CS) | 76.5 | Skelemotion + Yang et al. |
| Video | NTU RGB+D | Accuracy (CV) | 84.7 | Skelemotion + Yang et al. |
| Temporal Action Localization | NTU RGB+D | Accuracy (CS) | 76.5 | Skelemotion + Yang et al. |
| Temporal Action Localization | NTU RGB+D | Accuracy (CV) | 84.7 | Skelemotion + Yang et al. |
| Zero-Shot Learning | NTU RGB+D | Accuracy (CS) | 76.5 | Skelemotion + Yang et al. |
| Zero-Shot Learning | NTU RGB+D | Accuracy (CV) | 84.7 | Skelemotion + Yang et al. |
| Activity Recognition | NTU RGB+D | Accuracy (CS) | 76.5 | Skelemotion + Yang et al. (Skeleton only) |
| Activity Recognition | NTU RGB+D | Accuracy (CV) | 84.7 | Skelemotion + Yang et al. (Skeleton only) |
| Activity Recognition | NTU RGB+D 120 | Accuracy (Cross-Setup) | 66.9 | Skelemotion + Yang et al. (skeleton only) |
| Activity Recognition | NTU RGB+D 120 | Accuracy (Cross-Subject) | 67.7 | Skelemotion + Yang et al. (skeleton only) |
| Activity Recognition | NTU RGB+D | Accuracy (CS) | 76.5 | Skelemotion + Yang et al. |
| Activity Recognition | NTU RGB+D | Accuracy (CV) | 84.7 | Skelemotion + Yang et al. |
| Action Localization | NTU RGB+D | Accuracy (CS) | 76.5 | Skelemotion + Yang et al. |
| Action Localization | NTU RGB+D | Accuracy (CV) | 84.7 | Skelemotion + Yang et al. |
| Action Detection | NTU RGB+D | Accuracy (CS) | 76.5 | Skelemotion + Yang et al. |
| Action Detection | NTU RGB+D | Accuracy (CV) | 84.7 | Skelemotion + Yang et al. |
| 3D Action Recognition | NTU RGB+D | Accuracy (CS) | 76.5 | Skelemotion + Yang et al. |
| 3D Action Recognition | NTU RGB+D | Accuracy (CV) | 84.7 | Skelemotion + Yang et al. |
| Action Recognition | NTU RGB+D | Accuracy (CS) | 76.5 | Skelemotion + Yang et al. (Skeleton only) |
| Action Recognition | NTU RGB+D | Accuracy (CV) | 84.7 | Skelemotion + Yang et al. (Skeleton only) |
| Action Recognition | NTU RGB+D 120 | Accuracy (Cross-Setup) | 66.9 | Skelemotion + Yang et al. (skeleton only) |
| Action Recognition | NTU RGB+D 120 | Accuracy (Cross-Subject) | 67.7 | Skelemotion + Yang et al. (skeleton only) |
| Action Recognition | NTU RGB+D | Accuracy (CS) | 76.5 | Skelemotion + Yang et al. |
| Action Recognition | NTU RGB+D | Accuracy (CV) | 84.7 | Skelemotion + Yang et al. |

Related Papers

A Real-Time System for Egocentric Hand-Object Interaction Detection in Industrial Domains (2025-07-17)
DVFL-Net: A Lightweight Distilled Video Focal Modulation Network for Spatio-Temporal Action Recognition (2025-07-16)
Zero-shot Skeleton-based Action Recognition with Prototype-guided Feature Alignment (2025-07-01)
EgoAdapt: Adaptive Multisensory Distillation and Policy Learning for Efficient Egocentric Perception (2025-06-26)
Feature Hallucination for Self-supervised Action Recognition (2025-06-25)
CARMA: Context-Aware Situational Grounding of Human-Robot Group Interactions by Combining Vision-Language Models with Object and Action Recognition (2025-06-25)
Including Semantic Information via Word Embeddings for Skeleton-based Action Recognition (2025-06-23)
Adapting Vision-Language Models for Evaluating World Models (2025-06-22)