Papers With Code 2

A community resource for machine learning research: papers, code, benchmarks, and state-of-the-art results.


Data sourced from the PWC Archive (CC-BY-SA 4.0). Built by the community, for the community.


A New Representation of Skeleton Sequences for 3D Action Recognition

Qiuhong Ke, Mohammed Bennamoun, Senjian An, Ferdous Sohel, Farid Boussaid

Published: 2017-03-09 · CVPR 2017

Tasks: 3D Action Recognition, Skeleton Based Action Recognition, Multi-Task Learning, Action Recognition, Temporal Action Localization

Abstract

This paper presents a new method for 3D action recognition with skeleton sequences (i.e., 3D trajectories of human skeleton joints). The proposed method first transforms each skeleton sequence into three clips each consisting of several frames for spatial temporal feature learning using deep neural networks. Each clip is generated from one channel of the cylindrical coordinates of the skeleton sequence. Each frame of the generated clips represents the temporal information of the entire skeleton sequence, and incorporates one particular spatial relationship between the joints. The entire clips include multiple frames with different spatial relationships, which provide useful spatial structural information of the human skeleton. We propose to use deep convolutional neural networks to learn long-term temporal information of the skeleton sequence from the frames of the generated clips, and then use a Multi-Task Learning Network (MTLN) to jointly process all frames of the generated clips in parallel to incorporate spatial structural information for action recognition. Experimental results clearly show the effectiveness of the proposed new representation and feature learning method for 3D action recognition.
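The core of the representation is splitting each skeleton sequence into the three channels of its cylindrical coordinates, one clip per channel. A minimal sketch of that decomposition is shown below, assuming a sequence stored as a `(T, J, 3)` array of joint positions; the function name `cylindrical_channels` and the toy input are illustrative, and the paper's full pipeline additionally arranges each channel into clip frames that encode pairwise spatial relations between joints before the CNN and MTLN stages.

```python
import numpy as np

def cylindrical_channels(skeleton):
    """Split a skeleton sequence into its three cylindrical-coordinate channels.

    skeleton: array of shape (T, J, 3) holding (x, y, z) joint positions
              for T frames and J joints.
    Returns three (T, J) arrays: radius rho, azimuth phi, and height z.
    Simplified sketch only: the paper further builds clip frames from each
    channel to capture joint-pair spatial structure.
    """
    x, y, z = skeleton[..., 0], skeleton[..., 1], skeleton[..., 2]
    rho = np.sqrt(x ** 2 + y ** 2)   # radial distance from the vertical axis
    phi = np.arctan2(y, x)           # azimuthal angle in radians
    return rho, phi, z

# Toy sequence: 4 frames, 5 joints, random positions
seq = np.random.rand(4, 5, 3)
rho, phi, z = cylindrical_channels(seq)
print(rho.shape, phi.shape, z.shape)  # each channel is (4, 5)
```

Each of the three returned channels would then be turned into one clip, so the downstream CNN sees the whole sequence's temporal information in every frame of every clip.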

Results

Task | Dataset | Metric | Value | Model
---- | ------- | ------ | ----- | -----
Video | NTU RGB+D | Accuracy (CS) | 79.6 | Clips+CNN+MTLN
Video | NTU RGB+D | Accuracy (CV) | 84.8 | Clips+CNN+MTLN
Temporal Action Localization | NTU RGB+D | Accuracy (CS) | 79.6 | Clips+CNN+MTLN
Temporal Action Localization | NTU RGB+D | Accuracy (CV) | 84.8 | Clips+CNN+MTLN
Zero-Shot Learning | NTU RGB+D | Accuracy (CS) | 79.6 | Clips+CNN+MTLN
Zero-Shot Learning | NTU RGB+D | Accuracy (CV) | 84.8 | Clips+CNN+MTLN
Activity Recognition | NTU RGB+D | Accuracy (CS) | 79.6 | Clips+CNN+MTLN
Activity Recognition | NTU RGB+D | Accuracy (CV) | 84.8 | Clips+CNN+MTLN
Action Localization | NTU RGB+D | Accuracy (CS) | 79.6 | Clips+CNN+MTLN
Action Localization | NTU RGB+D | Accuracy (CV) | 84.8 | Clips+CNN+MTLN
Action Detection | NTU RGB+D | Accuracy (CS) | 79.6 | Clips+CNN+MTLN
Action Detection | NTU RGB+D | Accuracy (CV) | 84.8 | Clips+CNN+MTLN
3D Action Recognition | NTU RGB+D | Accuracy (CS) | 79.6 | Clips+CNN+MTLN
3D Action Recognition | NTU RGB+D | Accuracy (CV) | 84.8 | Clips+CNN+MTLN
Action Recognition | NTU RGB+D | Accuracy (CS) | 79.6 | Clips+CNN+MTLN
Action Recognition | NTU RGB+D | Accuracy (CV) | 84.8 | Clips+CNN+MTLN

Related Papers

- SGCL: Unifying Self-Supervised and Supervised Learning for Graph Recommendation (2025-07-17)
- A Real-Time System for Egocentric Hand-Object Interaction Detection in Industrial Domains (2025-07-17)
- DVFL-Net: A Lightweight Distilled Video Focal Modulation Network for Spatio-Temporal Action Recognition (2025-07-16)
- Robust-Multi-Task Gradient Boosting (2025-07-15)
- SAMO: A Lightweight Sharpness-Aware Approach for Multi-Task Optimization with Joint Global-Local Perturbation (2025-07-10)
- Zero-shot Skeleton-based Action Recognition with Prototype-guided Feature Alignment (2025-07-01)
- EgoAdapt: Adaptive Multisensory Distillation and Policy Learning for Efficient Egocentric Perception (2025-06-26)
- Opportunistic Osteoporosis Diagnosis via Texture-Preserving Self-Supervision, Mixture of Experts and Multi-Task Integration (2025-06-25)