Data sourced from the PWC Archive (CC-BY-SA 4.0). Built by the community, for the community.

Channel-wise Topology Refinement Graph Convolution for Skeleton-Based Action Recognition

Yuxin Chen, Ziqi Zhang, Chunfeng Yuan, Bing Li, Ying Deng, Weiming Hu

2021-07-26 | ICCV 2021 10 | Skeleton Based Action Recognition | Action Recognition

Abstract

Graph convolutional networks (GCNs) have been widely used and achieved remarkable results in skeleton-based action recognition. In GCNs, graph topology dominates feature aggregation and therefore is the key to extracting representative features. In this work, we propose a novel Channel-wise Topology Refinement Graph Convolution (CTR-GC) to dynamically learn different topologies and effectively aggregate joint features in different channels for skeleton-based action recognition. The proposed CTR-GC models channel-wise topologies through learning a shared topology as a generic prior for all channels and refining it with channel-specific correlations for each channel. Our refinement method introduces few extra parameters and significantly reduces the difficulty of modeling channel-wise topologies. Furthermore, via reformulating graph convolutions into a unified form, we find that CTR-GC relaxes strict constraints of graph convolutions, leading to stronger representation capability. Combining CTR-GC with temporal modeling modules, we develop a powerful graph convolutional network named CTR-GCN which notably outperforms state-of-the-art methods on the NTU RGB+D, NTU RGB+D 120, and NW-UCLA datasets.
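
The refinement idea in the abstract (one shared topology used as a prior, adjusted per channel by channel-specific correlations) can be illustrated with a small pure-Python sketch. This is not the authors' implementation: the function names, the use of tanh over pairwise feature differences as the correlation, and the scalar weight `alpha` are assumptions made for illustration, and the learned linear transforms and temporal modules of CTR-GCN are omitted.

```python
import math


def channel_correlations(feats):
    """feats[c][v] is the feature of joint v in channel c.

    Returns, for each channel, a V x V matrix of pairwise correlations.
    Assumption: correlation is tanh(f_i - f_j); the abstract does not
    specify the correlation function.
    """
    return [[[math.tanh(fi - fj) for fj in ch] for fi in ch] for ch in feats]


def refine_topology(shared, corr, alpha):
    """Build one V x V topology per channel: shared + alpha * corr[c].

    `shared` is the topology common to all channels; `alpha` (hypothetical)
    controls how strongly each channel departs from it.
    """
    n = len(shared)
    return [
        [[shared[i][j] + alpha * corr_c[i][j] for j in range(n)] for i in range(n)]
        for corr_c in corr
    ]


def aggregate(topologies, feats):
    """Aggregate joint features per channel: out[c][i] = sum_j T_c[i][j] * x_c[j]."""
    return [
        [sum(t * f for t, f in zip(row, ch)) for row in topo]
        for topo, ch in zip(topologies, feats)
    ]


# Toy example: 2 channels, 2 joints, identity matrix as the shared topology.
shared = [[1.0, 0.0], [0.0, 1.0]]
feats = [[1.0, 2.0], [0.5, 0.5]]
topologies = refine_topology(shared, channel_correlations(feats), alpha=0.5)
output = aggregate(topologies, feats)
```

In this toy setup, the second channel has identical joint features, so its correlations are zero and its topology stays equal to the shared one, while the first channel receives its own refined topology. Setting `alpha` to 0 reduces every channel to the shared topology.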

Results

Task | Dataset | Metric | Value | Model
Video | NTU RGB+D 120 | Accuracy (Cross-Setup) | 90.6 | CTR-GCN
Video | NTU RGB+D 120 | Accuracy (Cross-Subject) | 88.9 | CTR-GCN
Video | NTU RGB+D 120 | Ensembled Modalities | 4 | CTR-GCN
Video | N-UCLA | Accuracy | 96.5 | CTR-GCN
Video | NTU RGB+D | Accuracy (CS) | 92.4 | CTR-GCN
Video | NTU RGB+D | Accuracy (CV) | 96.8 | CTR-GCN
Video | NTU RGB+D | Ensembled Modalities | 4 | CTR-GCN
Temporal Action Localization | NTU RGB+D 120 | Accuracy (Cross-Setup) | 90.6 | CTR-GCN
Temporal Action Localization | NTU RGB+D 120 | Accuracy (Cross-Subject) | 88.9 | CTR-GCN
Temporal Action Localization | NTU RGB+D 120 | Ensembled Modalities | 4 | CTR-GCN
Temporal Action Localization | N-UCLA | Accuracy | 96.5 | CTR-GCN
Temporal Action Localization | NTU RGB+D | Accuracy (CS) | 92.4 | CTR-GCN
Temporal Action Localization | NTU RGB+D | Accuracy (CV) | 96.8 | CTR-GCN
Temporal Action Localization | NTU RGB+D | Ensembled Modalities | 4 | CTR-GCN
Zero-Shot Learning | NTU RGB+D 120 | Accuracy (Cross-Setup) | 90.6 | CTR-GCN
Zero-Shot Learning | NTU RGB+D 120 | Accuracy (Cross-Subject) | 88.9 | CTR-GCN
Zero-Shot Learning | NTU RGB+D 120 | Ensembled Modalities | 4 | CTR-GCN
Zero-Shot Learning | N-UCLA | Accuracy | 96.5 | CTR-GCN
Zero-Shot Learning | NTU RGB+D | Accuracy (CS) | 92.4 | CTR-GCN
Zero-Shot Learning | NTU RGB+D | Accuracy (CV) | 96.8 | CTR-GCN
Zero-Shot Learning | NTU RGB+D | Ensembled Modalities | 4 | CTR-GCN
Activity Recognition | NTU RGB+D 120 | Accuracy (Cross-Setup) | 90.6 | CTR-GCN
Activity Recognition | NTU RGB+D 120 | Accuracy (Cross-Subject) | 88.9 | CTR-GCN
Activity Recognition | NTU RGB+D 120 | Ensembled Modalities | 4 | CTR-GCN
Activity Recognition | N-UCLA | Accuracy | 96.5 | CTR-GCN
Activity Recognition | NTU RGB+D | Accuracy (CS) | 92.4 | CTR-GCN
Activity Recognition | NTU RGB+D | Accuracy (CV) | 96.8 | CTR-GCN
Activity Recognition | NTU RGB+D | Ensembled Modalities | 4 | CTR-GCN
Action Localization | NTU RGB+D 120 | Accuracy (Cross-Setup) | 90.6 | CTR-GCN
Action Localization | NTU RGB+D 120 | Accuracy (Cross-Subject) | 88.9 | CTR-GCN
Action Localization | NTU RGB+D 120 | Ensembled Modalities | 4 | CTR-GCN
Action Localization | N-UCLA | Accuracy | 96.5 | CTR-GCN
Action Localization | NTU RGB+D | Accuracy (CS) | 92.4 | CTR-GCN
Action Localization | NTU RGB+D | Accuracy (CV) | 96.8 | CTR-GCN
Action Localization | NTU RGB+D | Ensembled Modalities | 4 | CTR-GCN
Action Detection | NTU RGB+D 120 | Accuracy (Cross-Setup) | 90.6 | CTR-GCN
Action Detection | NTU RGB+D 120 | Accuracy (Cross-Subject) | 88.9 | CTR-GCN
Action Detection | NTU RGB+D 120 | Ensembled Modalities | 4 | CTR-GCN
Action Detection | N-UCLA | Accuracy | 96.5 | CTR-GCN
Action Detection | NTU RGB+D | Accuracy (CS) | 92.4 | CTR-GCN
Action Detection | NTU RGB+D | Accuracy (CV) | 96.8 | CTR-GCN
Action Detection | NTU RGB+D | Ensembled Modalities | 4 | CTR-GCN
3D Action Recognition | NTU RGB+D 120 | Accuracy (Cross-Setup) | 90.6 | CTR-GCN
3D Action Recognition | NTU RGB+D 120 | Accuracy (Cross-Subject) | 88.9 | CTR-GCN
3D Action Recognition | NTU RGB+D 120 | Ensembled Modalities | 4 | CTR-GCN
3D Action Recognition | N-UCLA | Accuracy | 96.5 | CTR-GCN
3D Action Recognition | NTU RGB+D | Accuracy (CS) | 92.4 | CTR-GCN
3D Action Recognition | NTU RGB+D | Accuracy (CV) | 96.8 | CTR-GCN
3D Action Recognition | NTU RGB+D | Ensembled Modalities | 4 | CTR-GCN
Action Recognition | NTU RGB+D 120 | Accuracy (Cross-Setup) | 90.6 | CTR-GCN
Action Recognition | NTU RGB+D 120 | Accuracy (Cross-Subject) | 88.9 | CTR-GCN
Action Recognition | NTU RGB+D 120 | Ensembled Modalities | 4 | CTR-GCN
Action Recognition | N-UCLA | Accuracy | 96.5 | CTR-GCN
Action Recognition | NTU RGB+D | Accuracy (CS) | 92.4 | CTR-GCN
Action Recognition | NTU RGB+D | Accuracy (CV) | 96.8 | CTR-GCN
Action Recognition | NTU RGB+D | Ensembled Modalities | 4 | CTR-GCN

Related Papers

A Real-Time System for Egocentric Hand-Object Interaction Detection in Industrial Domains (2025-07-17)
Zero-shot Skeleton-based Action Recognition with Prototype-guided Feature Alignment (2025-07-01)
EgoAdapt: Adaptive Multisensory Distillation and Policy Learning for Efficient Egocentric Perception (2025-06-26)
Feature Hallucination for Self-supervised Action Recognition (2025-06-25)
CARMA: Context-Aware Situational Grounding of Human-Robot Group Interactions by Combining Vision-Language Models with Object and Action Recognition (2025-06-25)
Including Semantic Information via Word Embeddings for Skeleton-based Action Recognition (2025-06-23)
Adapting Vision-Language Models for Evaluating World Models (2025-06-22)
Active Multimodal Distillation for Few-shot Action Recognition (2025-06-16)