Papers With Code 2

A community resource for machine learning research: papers, code, benchmarks, and state-of-the-art results.

Data sourced from the PWC Archive (CC-BY-SA 4.0). Built by the community, for the community.


IGFormer: Interaction Graph Transformer for Skeleton-based Human Interaction Recognition

Yunsheng Pang, Qiuhong Ke, Hossein Rahmani, James Bailey, Jun Liu

2022-07-25 · Human Interaction Recognition
Paper · PDF

Abstract

Human interaction recognition plays an important role in many applications. One crucial cue for recognizing an interaction is the set of interactive body parts. In this work, we propose a novel Interaction Graph Transformer (IGFormer) network for skeleton-based interaction recognition that models the interactive body parts as graphs. More specifically, the proposed IGFormer constructs interaction graphs according to the semantic and distance correlations between the interactive body parts, and enhances the representation of each person by aggregating the information of the interactive body parts based on the learned graphs. Furthermore, we propose a Semantic Partition Module that transforms each human skeleton sequence into a Body-Part-Time sequence to better capture the spatial and temporal information of the skeleton sequence for learning the graphs. Extensive experiments on three benchmark datasets demonstrate that our model outperforms the state of the art by a significant margin.
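The core idea in the abstract — learn a graph over the body parts of two interacting people, then enhance each person's features by aggregating the partner's interactive parts — can be sketched roughly as follows. This is a minimal illustration, not the paper's implementation: the dot-product affinity, softmax normalization, and all names here are assumptions, and the actual IGFormer also incorporates semantic and distance correlations plus the Semantic Partition Module.

```python
import numpy as np

def softmax(x, axis=-1):
    """Numerically stable softmax along the given axis."""
    e = np.exp(x - x.max(axis=axis, keepdims=True))
    return e / e.sum(axis=axis, keepdims=True)

def interaction_aggregate(parts_a, parts_b):
    """Hypothetical sketch of graph-based cross-person aggregation.

    parts_a, parts_b: (P, C) arrays of per-body-part features for
    person A and person B (P parts, C channels).
    """
    # Affinity between every part of A and every part of B
    # (dot-product similarity stands in for the paper's learned
    # semantic/distance correlations).
    affinity = parts_a @ parts_b.T            # (P, P)
    # Row-normalize to obtain an interaction graph over B's parts.
    graph = softmax(affinity, axis=-1)        # rows sum to 1
    # Enhance A's features with information gathered from B's
    # interactive parts, via a residual connection.
    return parts_a + graph @ parts_b          # (P, C)

rng = np.random.default_rng(0)
P, C = 5, 16                                  # body parts, channels
a = rng.standard_normal((P, C))
b = rng.standard_normal((P, C))
out = interaction_aggregate(a, b)
print(out.shape)                              # (5, 16)
```

In practice each person would be enhanced symmetrically (A from B and B from A), and the aggregation would sit inside transformer layers operating on the Body-Part-Time sequence.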

Results

| Task | Dataset | Metric | Value | Model |
| --- | --- | --- | --- | --- |
| Human Interaction Recognition | NTU RGB+D | Accuracy (Cross-Subject) | 93.6 | IGFormer |
| Human Interaction Recognition | NTU RGB+D | Accuracy (Cross-View) | 96.5 | IGFormer |
| Human Interaction Recognition | SBU / SBU-Refine | Accuracy | 98.4 | IGFormer |
| Human Interaction Recognition | NTU RGB+D 120 | Accuracy (Cross-Setup) | 86.5 | IGFormer |
| Human Interaction Recognition | NTU RGB+D 120 | Accuracy (Cross-Subject) | 85.4 | IGFormer |

Related Papers

- Dynamic Scene Understanding from Vision-Language Representations (2025-01-20)
- OV-HHIR: Open Vocabulary Human Interaction Recognition Using Cross-modal Integration of Large Language Models (2024-12-31)
- CHASE: Learning Convex Hull Adaptive Shift for Skeleton-based Multi-Entity Action Recognition (2024-10-09)
- Empathic Grounding: Explorations using Multimodal Interaction and Large Language Models with Conversational Agents (2024-07-01)
- Exploring Vision Transformers for 3D Human Motion-Language Models with Motion Patches (2024-05-08)
- SkateFormer: Skeletal-Temporal Transformer for Human Action Recognition (2024-03-14)
- Learning Mutual Excitation for Hand-to-Hand and Human-to-Human Interaction Recognition (2024-02-04)
- A Two-stream Hybrid CNN-Transformer Network for Skeleton-based Human Interaction Recognition (2023-12-31)