Data sourced from the PWC Archive (CC-BY-SA 4.0). Built by the community, for the community.

TSGCNeXt: Dynamic-Static Multi-Graph Convolution for Efficient Skeleton-Based Action Recognition with Long-term Learning Potential

Dongjingdin Liu, Pengpeng Chen, Miao Yao, Yijing Lu, Zijie Cai, Yuxin Tian

2023-04-23 | Skeleton Based Action Recognition, Graph Learning, Action Recognition, Time Series, Temporal Action Localization

Abstract

Skeleton-based action recognition has achieved remarkable results in human action recognition with the development of graph convolutional networks (GCNs). However, recent works tend to construct complex learning mechanisms with redundant training, and they hit a bottleneck on long time series. To solve these problems, we propose the Temporal-Spatio Graph ConvNeXt (TSGCNeXt) to explore an efficient learning mechanism for long temporal skeleton sequences. First, a new graph learning mechanism with a simple structure, Dynamic-Static Separate Multi-graph Convolution (DS-SMG), is proposed to aggregate features of multiple independent topological graphs and to keep node information from being ignored during dynamic convolution. Next, we construct a graph convolution training acceleration mechanism that optimizes the back-propagation computation of dynamic graph learning, with a 55.08% speed-up. Finally, TSGCNeXt restructures the overall structure of the GCN with three spatio-temporal learning modules, efficiently modeling long temporal features. Compared with previous methods on the large-scale datasets NTU RGB+D 60 and 120, TSGCNeXt outperforms them among single-stream networks. In addition, with the EMA model introduced into the multi-stream fusion, TSGCNeXt achieves SOTA levels. On the cross-subject and cross-setup benchmarks of NTU RGB+D 120, accuracies reach 90.22% and 91.74%.
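
The abstract mentions an "ema model" and multi-stream fusion without giving details. The sketch below illustrates the two generic ideas only, assuming "ema" means an exponential moving average of model weights and that fusion is a weighted sum of per-class scores. The function names, the decay value, and the fusion weights are hypothetical and are not taken from the paper or its official code.

```python
from typing import Dict, List, Sequence


def ema_update(shadow: Dict[str, float], params: Dict[str, float],
               decay: float = 0.999) -> None:
    """Update the shadow weights in place.

    shadow <- decay * shadow + (1 - decay) * params
    The decay value is an assumed default, not one reported by the authors.
    """
    for name, value in params.items():
        shadow[name] = decay * shadow[name] + (1.0 - decay) * value


def fuse_scores(stream_scores: Sequence[Sequence[float]],
                weights: Sequence[float]) -> List[float]:
    """Weighted sum of per-class scores, one score vector per stream."""
    if len(stream_scores) != len(weights):
        raise ValueError("one weight per stream is required")
    num_classes = len(stream_scores[0])
    fused = [0.0] * num_classes
    for scores, weight in zip(stream_scores, weights):
        for c in range(num_classes):
            fused[c] += weight * scores[c]
    return fused


def predict(stream_scores: Sequence[Sequence[float]],
            weights: Sequence[float]) -> int:
    """Return the index of the class with the highest fused score."""
    fused = fuse_scores(stream_scores, weights)
    return max(range(len(fused)), key=fused.__getitem__)


# Toy usage: four streams (matching "Ensembled Modalities: 4"), two classes.
# The scores and the equal weights are made-up illustration values.
toy_scores = [[0.1, 0.9], [0.8, 0.2], [0.6, 0.4], [0.7, 0.3]]
toy_weights = [1.0, 1.0, 1.0, 1.0]
predicted_class = predict(toy_scores, toy_weights)
```

In a setup like this, the EMA copy of each stream's weights would typically be the one used to produce the scores that are fused at test time; whether TSGCNeXt does exactly this is not stated in the abstract.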

Results

Task | Dataset | Metric | Value | Model
Video | NTU RGB+D 120 | Accuracy (Cross-Setup) | 91.7 | TSGCNeXt
Video | NTU RGB+D 120 | Accuracy (Cross-Subject) | 90.2 | TSGCNeXt
Video | NTU RGB+D 120 | Ensembled Modalities | 4 | TSGCNeXt
Video | NTU RGB+D 120 | Accuracy (Cross-Setup) | 90.3 | TSGCNeXT
Video | NTU RGB+D 120 | Accuracy (Cross-Subject) | 89.1 | TSGCNeXT
Video | NTU RGB+D 120 | Ensembled Modalities | 4 | TSGCNeXT
Temporal Action Localization | NTU RGB+D 120 | Accuracy (Cross-Setup) | 91.7 | TSGCNeXt
Temporal Action Localization | NTU RGB+D 120 | Accuracy (Cross-Subject) | 90.2 | TSGCNeXt
Temporal Action Localization | NTU RGB+D 120 | Ensembled Modalities | 4 | TSGCNeXt
Temporal Action Localization | NTU RGB+D 120 | Accuracy (Cross-Setup) | 90.3 | TSGCNeXT
Temporal Action Localization | NTU RGB+D 120 | Accuracy (Cross-Subject) | 89.1 | TSGCNeXT
Temporal Action Localization | NTU RGB+D 120 | Ensembled Modalities | 4 | TSGCNeXT
Zero-Shot Learning | NTU RGB+D 120 | Accuracy (Cross-Setup) | 91.7 | TSGCNeXt
Zero-Shot Learning | NTU RGB+D 120 | Accuracy (Cross-Subject) | 90.2 | TSGCNeXt
Zero-Shot Learning | NTU RGB+D 120 | Ensembled Modalities | 4 | TSGCNeXt
Zero-Shot Learning | NTU RGB+D 120 | Accuracy (Cross-Setup) | 90.3 | TSGCNeXT
Zero-Shot Learning | NTU RGB+D 120 | Accuracy (Cross-Subject) | 89.1 | TSGCNeXT
Zero-Shot Learning | NTU RGB+D 120 | Ensembled Modalities | 4 | TSGCNeXT
Activity Recognition | NTU RGB+D 120 | Accuracy (Cross-Setup) | 91.7 | TSGCNeXt
Activity Recognition | NTU RGB+D 120 | Accuracy (Cross-Subject) | 90.2 | TSGCNeXt
Activity Recognition | NTU RGB+D 120 | Ensembled Modalities | 4 | TSGCNeXt
Activity Recognition | NTU RGB+D 120 | Accuracy (Cross-Setup) | 90.3 | TSGCNeXT
Activity Recognition | NTU RGB+D 120 | Accuracy (Cross-Subject) | 89.1 | TSGCNeXT
Activity Recognition | NTU RGB+D 120 | Ensembled Modalities | 4 | TSGCNeXT
Action Localization | NTU RGB+D 120 | Accuracy (Cross-Setup) | 91.7 | TSGCNeXt
Action Localization | NTU RGB+D 120 | Accuracy (Cross-Subject) | 90.2 | TSGCNeXt
Action Localization | NTU RGB+D 120 | Ensembled Modalities | 4 | TSGCNeXt
Action Localization | NTU RGB+D 120 | Accuracy (Cross-Setup) | 90.3 | TSGCNeXT
Action Localization | NTU RGB+D 120 | Accuracy (Cross-Subject) | 89.1 | TSGCNeXT
Action Localization | NTU RGB+D 120 | Ensembled Modalities | 4 | TSGCNeXT
Action Detection | NTU RGB+D 120 | Accuracy (Cross-Setup) | 91.7 | TSGCNeXt
Action Detection | NTU RGB+D 120 | Accuracy (Cross-Subject) | 90.2 | TSGCNeXt
Action Detection | NTU RGB+D 120 | Ensembled Modalities | 4 | TSGCNeXt
Action Detection | NTU RGB+D 120 | Accuracy (Cross-Setup) | 90.3 | TSGCNeXT
Action Detection | NTU RGB+D 120 | Accuracy (Cross-Subject) | 89.1 | TSGCNeXT
Action Detection | NTU RGB+D 120 | Ensembled Modalities | 4 | TSGCNeXT
3D Action Recognition | NTU RGB+D 120 | Accuracy (Cross-Setup) | 91.7 | TSGCNeXt
3D Action Recognition | NTU RGB+D 120 | Accuracy (Cross-Subject) | 90.2 | TSGCNeXt
3D Action Recognition | NTU RGB+D 120 | Ensembled Modalities | 4 | TSGCNeXt
3D Action Recognition | NTU RGB+D 120 | Accuracy (Cross-Setup) | 90.3 | TSGCNeXT
3D Action Recognition | NTU RGB+D 120 | Accuracy (Cross-Subject) | 89.1 | TSGCNeXT
3D Action Recognition | NTU RGB+D 120 | Ensembled Modalities | 4 | TSGCNeXT
Action Recognition | NTU RGB+D 120 | Accuracy (Cross-Setup) | 91.7 | TSGCNeXt
Action Recognition | NTU RGB+D 120 | Accuracy (Cross-Subject) | 90.2 | TSGCNeXt
Action Recognition | NTU RGB+D 120 | Ensembled Modalities | 4 | TSGCNeXt
Action Recognition | NTU RGB+D 120 | Accuracy (Cross-Setup) | 90.3 | TSGCNeXT
Action Recognition | NTU RGB+D 120 | Accuracy (Cross-Subject) | 89.1 | TSGCNeXT
Action Recognition | NTU RGB+D 120 | Ensembled Modalities | 4 | TSGCNeXT

Related Papers

SGCL: Unifying Self-Supervised and Supervised Learning for Graph Recommendation (2025-07-17)
A Real-Time System for Egocentric Hand-Object Interaction Detection in Industrial Domains (2025-07-17)
MoTM: Towards a Foundation Model for Time Series Imputation based on Continuous Modeling (2025-07-17)
The Power of Architecture: Deep Dive into Transformer Architectures for Long-Term Time Series Forecasting (2025-07-17)
DVFL-Net: A Lightweight Distilled Video Focal Modulation Network for Spatio-Temporal Action Recognition (2025-07-16)
A Graph-in-Graph Learning Framework for Drug-Target Interaction Prediction (2025-07-15)
Data Augmentation in Time Series Forecasting through Inverted Framework (2025-07-15)
D3FL: Data Distribution and Detrending for Robust Federated Learning in Non-linear Time-series Data (2025-07-15)