Papers With Code 2

A community resource for machine learning research: papers, code, benchmarks, and state-of-the-art results.


Data sourced from the PWC Archive (CC-BY-SA 4.0). Built by the community, for the community.


Learning Graph Convolutional Network for Skeleton-based Human Action Recognition by Neural Searching

Wei Peng, Xiaopeng Hong, Haoyu Chen, Guoying Zhao

2019-11-11 · Skeleton Based Action Recognition · Neural Architecture Search · Action Recognition

Abstract

Human action recognition from skeleton data, fueled by the Graph Convolutional Network (GCN), has attracted considerable attention due to the GCN's powerful capability to model non-Euclidean structured data. However, many existing GCN methods use a pre-defined graph that is kept fixed through the entire network, which can lose implicit joint correlations. Moreover, the mainstream spectral GCN is approximated at first-order hops, so higher-order connections are not well captured. Huge efforts are therefore required to explore a better GCN architecture. To address these problems, we turn to Neural Architecture Search (NAS) and propose the first automatically designed GCN for skeleton-based action recognition. Specifically, we enrich the search space with multiple dynamic graph modules after fully exploring the spatial-temporal correlations between nodes. We also introduce multiple-hop modules, aiming to break the limitation on representational capacity caused by the first-order approximation. Moreover, a sampling- and memory-efficient evolution strategy is proposed to search for an optimal architecture for this task. The resulting architecture demonstrates the effectiveness of the higher-order approximation and of the dynamic graph modeling mechanism with temporal interactions, which has barely been discussed before. To evaluate the performance of the searched model, we conduct extensive experiments on two very large-scale datasets; the results show that our model achieves state-of-the-art performance.
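The multiple-hop idea from the abstract can be illustrated with a minimal sketch: a graph convolution that aggregates features over powers of the normalized skeleton adjacency, so each hop order gets its own weight matrix. This is an illustrative toy in numpy, not the authors' GCN-NAS implementation; the function names, the toy 4-joint chain skeleton, and the choice of three hop orders are assumptions for demonstration.

```python
import numpy as np

def normalize_adjacency(A):
    # Symmetric normalization with self-loops: D^{-1/2} (A + I) D^{-1/2}
    A_hat = A + np.eye(A.shape[0])
    d = A_hat.sum(axis=1)
    D_inv_sqrt = np.diag(1.0 / np.sqrt(d))
    return D_inv_sqrt @ A_hat @ D_inv_sqrt

def multi_hop_gcn_layer(X, A, weights):
    """One multi-hop graph convolution: sum_k (A_norm^k) X W_k.

    X       : (num_joints, in_dim) per-joint features
    A       : (num_joints, num_joints) skeleton adjacency
    weights : list of (in_dim, out_dim) matrices, one per hop order;
              k = 0 is the self term, k = 1 the first-order GCN term, etc.
    """
    A_norm = normalize_adjacency(A)
    out = np.zeros((X.shape[0], weights[0].shape[1]))
    A_k = np.eye(A.shape[0])  # A_norm^0 = identity
    for W_k in weights:
        out += A_k @ X @ W_k
        A_k = A_k @ A_norm  # advance to the next hop order
    return np.maximum(out, 0.0)  # ReLU

# Toy example: a 4-joint chain skeleton with 3-d features, hops 0..2
A = np.array([[0, 1, 0, 0],
              [1, 0, 1, 0],
              [0, 1, 0, 1],
              [0, 0, 1, 0]], dtype=float)
rng = np.random.default_rng(0)
X = rng.normal(size=(4, 3))
weights = [rng.normal(size=(3, 8)) for _ in range(3)]
H = multi_hop_gcn_layer(X, A, weights)
print(H.shape)  # (4, 8)
```

With only the k = 1 term this reduces to a standard first-order GCN layer; adding k ≥ 2 terms lets a single layer mix information from joints more than one bone apart, which is the representational gap the paper's multiple-hop modules target.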

Results

Task | Dataset | Metric | Value | Model
Video | Kinetics-Skeleton | Accuracy | 37.1 | GCN-NAS
Video | NTU RGB+D | Accuracy (CS) | 89.4 | GCN-NAS
Video | NTU RGB+D | Accuracy (CV) | 95.7 | GCN-NAS
Temporal Action Localization | Kinetics-Skeleton | Accuracy | 37.1 | GCN-NAS
Temporal Action Localization | NTU RGB+D | Accuracy (CS) | 89.4 | GCN-NAS
Temporal Action Localization | NTU RGB+D | Accuracy (CV) | 95.7 | GCN-NAS
Zero-Shot Learning | Kinetics-Skeleton | Accuracy | 37.1 | GCN-NAS
Zero-Shot Learning | NTU RGB+D | Accuracy (CS) | 89.4 | GCN-NAS
Zero-Shot Learning | NTU RGB+D | Accuracy (CV) | 95.7 | GCN-NAS
Activity Recognition | Kinetics-Skeleton | Accuracy | 37.1 | GCN-NAS
Activity Recognition | NTU RGB+D | Accuracy (CS) | 89.4 | GCN-NAS
Activity Recognition | NTU RGB+D | Accuracy (CV) | 95.7 | GCN-NAS
Action Localization | Kinetics-Skeleton | Accuracy | 37.1 | GCN-NAS
Action Localization | NTU RGB+D | Accuracy (CS) | 89.4 | GCN-NAS
Action Localization | NTU RGB+D | Accuracy (CV) | 95.7 | GCN-NAS
Action Detection | Kinetics-Skeleton | Accuracy | 37.1 | GCN-NAS
Action Detection | NTU RGB+D | Accuracy (CS) | 89.4 | GCN-NAS
Action Detection | NTU RGB+D | Accuracy (CV) | 95.7 | GCN-NAS
3D Action Recognition | Kinetics-Skeleton | Accuracy | 37.1 | GCN-NAS
3D Action Recognition | NTU RGB+D | Accuracy (CS) | 89.4 | GCN-NAS
3D Action Recognition | NTU RGB+D | Accuracy (CV) | 95.7 | GCN-NAS
Action Recognition | Kinetics-Skeleton | Accuracy | 37.1 | GCN-NAS
Action Recognition | NTU RGB+D | Accuracy (CS) | 89.4 | GCN-NAS
Action Recognition | NTU RGB+D | Accuracy (CV) | 95.7 | GCN-NAS

Related Papers

DASViT: Differentiable Architecture Search for Vision Transformer (2025-07-17)
A Real-Time System for Egocentric Hand-Object Interaction Detection in Industrial Domains (2025-07-17)
Zero-shot Skeleton-based Action Recognition with Prototype-guided Feature Alignment (2025-07-01)
EgoAdapt: Adaptive Multisensory Distillation and Policy Learning for Efficient Egocentric Perception (2025-06-26)
Feature Hallucination for Self-supervised Action Recognition (2025-06-25)
CARMA: Context-Aware Situational Grounding of Human-Robot Group Interactions by Combining Vision-Language Models with Object and Action Recognition (2025-06-25)
Including Semantic Information via Word Embeddings for Skeleton-based Action Recognition (2025-06-23)
AnalogNAS-Bench: A NAS Benchmark for Analog In-Memory Computing (2025-06-23)