Papers With Code 2

A community resource for machine learning research: papers, code, benchmarks, and state-of-the-art results.


Data sourced from the PWC Archive (CC-BY-SA 4.0). Built by the community, for the community.


Temporal Extension Module for Skeleton-Based Action Recognition

Yuya Obinata, Takuma Yamamoto

2020-03-19 · arXiv 2020
Tasks: Skeleton-Based Action Recognition · Action Recognition

Abstract

We present a module that extends the temporal graph of a graph convolutional network (GCN) for action recognition from a sequence of skeletons. Existing methods attempt to represent a more appropriate spatial graph within each frame (intra-frame) but disregard optimization of the temporal graph across frames (inter-frame). Concretely, these methods connect only vertices corresponding to the same joint across adjacent frames. In this work, we focus on adding inter-frame connections to multiple neighboring vertices and extracting additional features based on the extended temporal graph. Our module is a simple yet effective method for extracting correlated features of multiple joints in human movement. Moreover, our module yields further performance improvements when combined with other GCN methods that optimize only the spatial graph. We conduct extensive experiments on two large datasets, NTU RGB+D and Kinetics-Skeleton, and demonstrate that our module is effective for several existing models and that our final model achieves state-of-the-art performance.
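The core idea in the abstract — keeping the usual intra-frame spatial edges, but extending the inter-frame temporal edges from "same joint only" to "the joint plus its spatial neighbors in the adjacent frame" — can be sketched as an adjacency-matrix construction. This is an illustrative sketch, not the authors' implementation: the function name, the block layout over (frame, joint) vertices, and the use of the spatial adjacency to define the temporal neighborhood are all assumptions made for clarity.

```python
import numpy as np

def extended_temporal_adjacency(spatial_adj: np.ndarray, num_frames: int) -> np.ndarray:
    """Build a block adjacency over (frame, joint) vertices (illustrative sketch).

    Standard ST-GCN-style temporal edges link each joint only to the same
    joint in the adjacent frame; the extension described in the abstract
    also links it to that joint's spatial neighbors in the adjacent frame.
    """
    V = spatial_adj.shape[0]          # number of joints per skeleton
    N = num_frames * V                # total (frame, joint) vertices
    A = np.zeros((N, N), dtype=np.float32)

    # Inter-frame neighborhood: the joint itself plus its spatial
    # neighbors (identity + spatial adjacency).
    neighborhood = spatial_adj + np.eye(V, dtype=np.float32)

    for t in range(num_frames):
        s, e = t * V, (t + 1) * V
        # Intra-frame (spatial) edges, as in existing methods.
        A[s:e, s:e] = spatial_adj
        if t + 1 < num_frames:
            s2, e2 = (t + 1) * V, (t + 2) * V
            # Extended inter-frame (temporal) edges, both directions.
            A[s:e, s2:e2] = neighborhood
            A[s2:e2, s:e] = neighborhood
    return A
```

For a 3-joint chain skeleton (0–1–2), joint 0 in frame 0 ends up connected not only to joint 0 in frame 1 but also to its spatial neighbor, joint 1, in frame 1 — the extra correlated-joint edges the module exploits.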

Results

| Task | Dataset | Metric | Value | Model |
| --- | --- | --- | --- | --- |
| Skeleton-Based Action Recognition | Kinetics-Skeleton | Accuracy | 38.6 | 2s-AGCN+TEM |
| Skeleton-Based Action Recognition | NTU RGB+D | Accuracy (CS) | 91 | MS-AAGCN+TEM |
| Skeleton-Based Action Recognition | NTU RGB+D | Accuracy (CV) | 96.5 | MS-AAGCN+TEM |

Related Papers

- A Real-Time System for Egocentric Hand-Object Interaction Detection in Industrial Domains (2025-07-17)
- Zero-shot Skeleton-based Action Recognition with Prototype-guided Feature Alignment (2025-07-01)
- EgoAdapt: Adaptive Multisensory Distillation and Policy Learning for Efficient Egocentric Perception (2025-06-26)
- Feature Hallucination for Self-supervised Action Recognition (2025-06-25)
- CARMA: Context-Aware Situational Grounding of Human-Robot Group Interactions by Combining Vision-Language Models with Object and Action Recognition (2025-06-25)
- Including Semantic Information via Word Embeddings for Skeleton-based Action Recognition (2025-06-23)
- Adapting Vision-Language Models for Evaluating World Models (2025-06-22)
- Active Multimodal Distillation for Few-shot Action Recognition (2025-06-16)