Papers With Code 2

A community resource for machine learning research: papers, code, benchmarks, and state-of-the-art results.


Data sourced from the PWC Archive (CC-BY-SA 4.0). Built by the community, for the community.


HDBN: A Novel Hybrid Dual-branch Network for Robust Skeleton-based Action Recognition

Jinfu Liu, Baiqiao Yin, Jiaying Lin, Jiajun Wen, Yue Li, Mengyuan Liu

Published: 2024-04-24 · Tasks: Skeleton Based Action Recognition, Action Recognition
Links: Paper · PDF · Code (official)

Abstract

Skeleton-based action recognition has gained considerable traction thanks to its utilization of succinct and robust skeletal representations. Nonetheless, current methodologies often lean towards utilizing a solitary backbone to model skeleton modality, which can be limited by inherent flaws in the network backbone. To address this and fully leverage the complementary characteristics of various network architectures, we propose a novel Hybrid Dual-Branch Network (HDBN) for robust skeleton-based action recognition, which benefits from the graph convolutional network's proficiency in handling graph-structured data and the powerful modeling capabilities of Transformers for global information. In detail, our proposed HDBN is divided into two trunk branches: MixGCN and MixFormer. The two branches utilize GCNs and Transformers to model both 2D and 3D skeletal modalities respectively. Our proposed HDBN emerged as one of the top solutions in the Multi-Modal Video Reasoning and Analyzing Competition (MMVRAC) of 2024 ICME Grand Challenge, achieving accuracies of 47.95% and 75.36% on two benchmarks of the UAV-Human dataset by outperforming most existing methods. Our code will be publicly available at: https://github.com/liujf69/ICMEW2024-Track10.
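The abstract describes a two-branch design: a GCN branch (MixGCN) and a Transformer branch (MixFormer) each score the input skeleton sequence, and their outputs are combined. As a minimal sketch of how such score-level fusion of two branches can work (the exact fusion rule and weights here are assumptions, not the authors' published configuration), each branch's logits can be converted to class probabilities and averaged with a tunable weight:

```python
import numpy as np

def softmax(x):
    # Numerically stable softmax over the last axis.
    e = np.exp(x - x.max(axis=-1, keepdims=True))
    return e / e.sum(axis=-1, keepdims=True)

def fuse_branches(gcn_logits, transformer_logits, weight=0.5):
    # Weighted average of per-branch class probabilities (late fusion).
    # `weight` balances the GCN branch against the Transformer branch;
    # the value is a hypothetical hyperparameter, not from the paper.
    return weight * softmax(gcn_logits) + (1 - weight) * softmax(transformer_logits)

# Toy example with 3 classes: the two branches disagree,
# and fusion resolves the prediction.
gcn = np.array([2.0, 0.5, 0.1])
tr = np.array([0.2, 1.8, 0.1])
scores = fuse_branches(gcn, tr, weight=0.6)
pred = int(np.argmax(scores))  # fused class prediction
```

The fused scores remain a valid probability distribution, so the same rule extends to combining more than two streams (e.g. 2D and 3D skeletal modalities) by averaging additional softmaxed score vectors.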

Results

| Task | Dataset | Metric | Value | Model |
|---|---|---|---|---|
| Video | UAV-Human | CSv1 (%) | 47.96 | HDBN |
| Video | UAV-Human | CSv2 (%) | 75.36 | HDBN |
| Temporal Action Localization | UAV-Human | CSv1 (%) | 47.96 | HDBN |
| Temporal Action Localization | UAV-Human | CSv2 (%) | 75.36 | HDBN |
| Zero-Shot Learning | UAV-Human | CSv1 (%) | 47.96 | HDBN |
| Zero-Shot Learning | UAV-Human | CSv2 (%) | 75.36 | HDBN |
| Activity Recognition | UAV-Human | CSv1 (%) | 47.96 | HDBN |
| Activity Recognition | UAV-Human | CSv2 (%) | 75.36 | HDBN |
| Action Localization | UAV-Human | CSv1 (%) | 47.96 | HDBN |
| Action Localization | UAV-Human | CSv2 (%) | 75.36 | HDBN |
| Action Detection | UAV-Human | CSv1 (%) | 47.96 | HDBN |
| Action Detection | UAV-Human | CSv2 (%) | 75.36 | HDBN |
| 3D Action Recognition | UAV-Human | CSv1 (%) | 47.96 | HDBN |
| 3D Action Recognition | UAV-Human | CSv2 (%) | 75.36 | HDBN |
| Action Recognition | UAV-Human | CSv1 (%) | 47.96 | HDBN |
| Action Recognition | UAV-Human | CSv2 (%) | 75.36 | HDBN |

Related Papers

- A Real-Time System for Egocentric Hand-Object Interaction Detection in Industrial Domains (2025-07-17)
- Zero-shot Skeleton-based Action Recognition with Prototype-guided Feature Alignment (2025-07-01)
- EgoAdapt: Adaptive Multisensory Distillation and Policy Learning for Efficient Egocentric Perception (2025-06-26)
- Feature Hallucination for Self-supervised Action Recognition (2025-06-25)
- CARMA: Context-Aware Situational Grounding of Human-Robot Group Interactions by Combining Vision-Language Models with Object and Action Recognition (2025-06-25)
- Including Semantic Information via Word Embeddings for Skeleton-based Action Recognition (2025-06-23)
- Adapting Vision-Language Models for Evaluating World Models (2025-06-22)
- Active Multimodal Distillation for Few-shot Action Recognition (2025-06-16)