Papers With Code 2

A community resource for machine learning research: papers, code, benchmarks, and state-of-the-art results.

Data sourced from the PWC Archive (CC-BY-SA 4.0). Built by the community, for the community.

Feedback Graph Convolutional Network for Skeleton-based Action Recognition

Hao Yang, Dan Yan, Li Zhang, Dong Li, YunDa Sun, ShaoDi You, Stephen J. Maybank

2020-03-17 · Skeleton Based Action Recognition · Action Recognition
Paper · PDF

Abstract

Skeleton-based action recognition has attracted considerable attention in computer vision, since skeleton data is more robust to dynamic circumstances and complicated backgrounds than other modalities. Recently, many researchers have used the Graph Convolutional Network (GCN) to model the spatial-temporal features of skeleton sequences through end-to-end optimization. However, conventional GCNs are feedforward networks, in which low-level layers cannot access the semantic information held in high-level layers. In this paper, we propose a novel network, named Feedback Graph Convolutional Network (FGCN). This is the first work to introduce the feedback mechanism into GCNs and action recognition. Compared with conventional GCNs, FGCN has the following advantages: (1) a multi-stage temporal sampling strategy is designed to extract spatial-temporal features for action recognition in a coarse-to-fine progressive process; (2) a Feedback Graph Convolutional Block (FGCB) based on dense connections is proposed to introduce feedback connections into GCNs. It transmits high-level semantic features to the low-level layers and propagates temporal information stage by stage, progressively modeling global spatial-temporal features for action recognition; (3) the FGCN model provides early predictions. In the early stages, the model receives only partial information about an action, so its predictions are relatively coarse. These coarse predictions are treated as a prior to guide the feature learning of later stages toward an accurate final prediction. Extensive experiments on the NTU-RGB+D, NTU-RGB+D120 and Northwestern-UCLA datasets demonstrate that the proposed FGCN is effective for action recognition, achieving state-of-the-art performance on all three datasets.
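The feedback idea in the abstract (advantage (2)) can be illustrated with a toy sketch: process the skeleton sequence in temporal stages, and at each stage feed the high-level output of the previous stage back into the low-level input of the next. This is a hypothetical NumPy simplification, not the paper's actual FGCB architecture — the layer sizes, the single-layer block, and the concatenation-based feedback are assumptions for illustration only.

```python
import numpy as np

rng = np.random.default_rng(0)

def normalize_adjacency(A):
    # Symmetric normalization D^{-1/2} (A + I) D^{-1/2}, standard in GCNs.
    A_hat = A + np.eye(A.shape[0])
    d = A_hat.sum(axis=1)
    D_inv_sqrt = np.diag(1.0 / np.sqrt(d))
    return D_inv_sqrt @ A_hat @ D_inv_sqrt

def gcn_layer(A_norm, X, W):
    # One graph convolution over the skeleton joints, followed by ReLU.
    return np.maximum(A_norm @ X @ W, 0.0)

class FeedbackGCNBlock:
    """Toy feedback block (hypothetical simplification of the paper's FGCB):
    the high-level output of stage t-1 is concatenated with the low-level
    input of stage t, so later stages see earlier semantic features."""

    def __init__(self, n_joints, in_dim, hid_dim):
        # Single shared weight matrix; real FGCB stacks dense connections.
        self.W_in = rng.standard_normal((in_dim + hid_dim, hid_dim)) * 0.1
        self.hid_dim = hid_dim
        self.n_joints = n_joints

    def forward(self, A_norm, stages):
        # stages: list of (n_joints, in_dim) feature arrays, one per
        # temporal stage (coarse-to-fine sampling of the sequence).
        h = np.zeros((self.n_joints, self.hid_dim))  # feedback state
        outputs = []
        for X in stages:
            inp = np.concatenate([X, h], axis=1)  # feed high-level state back
            h = gcn_layer(A_norm, inp, self.W_in)
            outputs.append(h)  # each stage could yield an early prediction
        return outputs
```

Each per-stage output could be pooled and classified to obtain the "early predictions" the abstract describes, with the last stage giving the most refined one.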

Results

Task                          Dataset    Metric         Value  Model
Video                         NTU RGB+D  Accuracy (CS)  90.2   FGCN-spatial+FGCN-motion
Video                         NTU RGB+D  Accuracy (CV)  96.3   FGCN-spatial+FGCN-motion
Temporal Action Localization  NTU RGB+D  Accuracy (CS)  90.2   FGCN-spatial+FGCN-motion
Temporal Action Localization  NTU RGB+D  Accuracy (CV)  96.3   FGCN-spatial+FGCN-motion
Zero-Shot Learning            NTU RGB+D  Accuracy (CS)  90.2   FGCN-spatial+FGCN-motion
Zero-Shot Learning            NTU RGB+D  Accuracy (CV)  96.3   FGCN-spatial+FGCN-motion
Activity Recognition          NTU RGB+D  Accuracy (CS)  90.2   FGCN-spatial+FGCN-motion
Activity Recognition          NTU RGB+D  Accuracy (CV)  96.3   FGCN-spatial+FGCN-motion
Action Localization           NTU RGB+D  Accuracy (CS)  90.2   FGCN-spatial+FGCN-motion
Action Localization           NTU RGB+D  Accuracy (CV)  96.3   FGCN-spatial+FGCN-motion
Action Detection              NTU RGB+D  Accuracy (CS)  90.2   FGCN-spatial+FGCN-motion
Action Detection              NTU RGB+D  Accuracy (CV)  96.3   FGCN-spatial+FGCN-motion
3D Action Recognition         NTU RGB+D  Accuracy (CS)  90.2   FGCN-spatial+FGCN-motion
3D Action Recognition         NTU RGB+D  Accuracy (CV)  96.3   FGCN-spatial+FGCN-motion
Action Recognition            NTU RGB+D  Accuracy (CS)  90.2   FGCN-spatial+FGCN-motion
Action Recognition            NTU RGB+D  Accuracy (CV)  96.3   FGCN-spatial+FGCN-motion

Related Papers

A Real-Time System for Egocentric Hand-Object Interaction Detection in Industrial Domains (2025-07-17)
Zero-shot Skeleton-based Action Recognition with Prototype-guided Feature Alignment (2025-07-01)
EgoAdapt: Adaptive Multisensory Distillation and Policy Learning for Efficient Egocentric Perception (2025-06-26)
Feature Hallucination for Self-supervised Action Recognition (2025-06-25)
CARMA: Context-Aware Situational Grounding of Human-Robot Group Interactions by Combining Vision-Language Models with Object and Action Recognition (2025-06-25)
Including Semantic Information via Word Embeddings for Skeleton-based Action Recognition (2025-06-23)
Adapting Vision-Language Models for Evaluating World Models (2025-06-22)
Active Multimodal Distillation for Few-shot Action Recognition (2025-06-16)