Papers With Code 2

A community resource for machine learning research: papers, code, benchmarks, and state-of-the-art results.


Data sourced from the PWC Archive (CC-BY-SA 4.0). Built by the community, for the community.


Structure-Aware Human-Action Generation

Ping Yu, Yang Zhao, Chunyuan Li, Junsong Yuan, Changyou Chen

Published: 2020-07-04 · ECCV 2020
Tasks: Action Generation · Human action generation · graph construction · Video Generation
Links: Paper · PDF · Code (official)

Abstract

Generating long-range skeleton-based human actions has been a challenging problem, since small deviations in one frame can cause a malformed action sequence. Most existing methods borrow ideas from video generation and naively treat skeleton nodes/joints as pixels of images without considering the rich inter-frame and intra-frame structure information, leading to potentially distorted actions. Graph convolutional networks (GCNs) are a promising way to leverage structure information to learn structure representations. However, directly adopting GCNs for such continuous action sequences in both the spatial and temporal domains is challenging, as the action graph can be huge. To overcome this issue, we propose a variant of GCNs that leverages the powerful self-attention mechanism to adaptively sparsify a complete action graph in the temporal domain. Our method can dynamically attend to important past frames and construct a sparse graph to apply in the GCN framework, capturing the structure information in action sequences well. Extensive experimental results demonstrate the superiority of our method over existing methods on two standard human action datasets.
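The core idea the abstract describes is to attend each frame to its past frames and keep only the strongest attention edges, producing a sparse temporal graph that a GCN can then operate on. The sketch below illustrates that mechanism with plain scaled dot-product attention and a top-k cutoff; the function name, feature shapes, and `top_k` parameter are illustrative assumptions, not the paper's exact SA-GCN layer.

```python
import numpy as np

def sparsify_temporal_graph(frame_feats, top_k=4):
    """Attend each frame to itself and past frames, then keep only the
    top-k highest-attention edges, yielding a sparse temporal adjacency.

    frame_feats: (T, D) array of per-frame skeleton features.
    Illustrative sketch only; not the paper's exact formulation.
    """
    T, D = frame_feats.shape
    # Scaled dot-product attention scores between all frame pairs.
    scores = frame_feats @ frame_feats.T / np.sqrt(D)
    # Causal mask: a frame may only attend to itself and past frames.
    mask = np.tril(np.ones((T, T), dtype=bool))
    scores = np.where(mask, scores, -np.inf)
    # Softmax over the allowed (past) frames.
    attn = np.exp(scores - scores.max(axis=1, keepdims=True))
    attn /= attn.sum(axis=1, keepdims=True)
    # Keep only the strongest edges per frame -> sparse adjacency.
    adj = np.zeros((T, T))
    for t in range(T):
        k = min(top_k, t + 1)          # frame t has only t+1 candidates
        keep = np.argsort(attn[t])[-k:]
        adj[t, keep] = attn[t, keep]
    return adj
```

The resulting adjacency is lower-triangular (causal) with at most `top_k` nonzero entries per row, so a GCN applied to it scales with the number of kept edges rather than with the full T×T graph.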

Results

| Task                    | Dataset      | Metric    | Value | Model  |
|-------------------------|--------------|-----------|-------|--------|
| Activity Recognition    | NTU RGB+D 2D | MMDa (CS) | 0.285 | SA-GCN |
| Activity Recognition    | NTU RGB+D 2D | MMDa (CV) | 0.316 | SA-GCN |
| Activity Recognition    | NTU RGB+D 2D | MMDs (CS) | 0.299 | SA-GCN |
| Activity Recognition    | NTU RGB+D 2D | MMDs (CV) | 0.335 | SA-GCN |
| Activity Recognition    | Human3.6M    | MMDa      | 0.146 | SA-GCN |
| Activity Recognition    | Human3.6M    | MMDs      | 0.134 | SA-GCN |
| Human action generation | NTU RGB+D 2D | MMDa (CS) | 0.285 | SA-GCN |
| Human action generation | NTU RGB+D 2D | MMDa (CV) | 0.316 | SA-GCN |
| Human action generation | NTU RGB+D 2D | MMDs (CS) | 0.299 | SA-GCN |
| Human action generation | NTU RGB+D 2D | MMDs (CV) | 0.335 | SA-GCN |
| Human action generation | Human3.6M    | MMDa      | 0.146 | SA-GCN |
| Human action generation | Human3.6M    | MMDs      | 0.134 | SA-GCN |
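The MMD values reported above are maximum mean discrepancy scores between generated and real action sequences (lower is better). The sketch below shows the standard biased RBF-kernel MMD estimator on generic feature vectors; the function name, `sigma` bandwidth, and feature inputs are illustrative assumptions, and the paper's exact MMDa/MMDs feature extraction is not reproduced here.

```python
import numpy as np

def mmd_rbf(X, Y, sigma=1.0):
    """Biased maximum mean discrepancy estimate with an RBF kernel.

    X: (n, d) and Y: (m, d) feature sets, e.g. features of generated
    vs. real action sequences. Returns a non-negative scalar; 0 means
    the two samples are indistinguishable under this kernel.
    """
    def k(A, B):
        # Pairwise squared Euclidean distances, then RBF kernel values.
        d2 = ((A[:, None, :] - B[None, :, :]) ** 2).sum(-1)
        return np.exp(-d2 / (2 * sigma ** 2))
    return k(X, X).mean() + k(Y, Y).mean() - 2 * k(X, Y).mean()
```

Comparing a sample against itself yields 0, and the score grows as the two distributions diverge, which is why lower values in the table indicate generated actions closer to the real data.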

Related Papers

- VITA: Vision-to-Action Flow Matching Policy (2025-07-17)
- Efficiently Constructing Sparse Navigable Graphs (2025-07-17)
- NGTM: Substructure-based Neural Graph Topic Model for Interpretable Graph Generation (2025-07-17)
- World Model-Based End-to-End Scene Generation for Accident Anticipation in Autonomous Driving (2025-07-17)
- Leveraging Pre-Trained Visual Models for AI-Generated Video Detection (2025-07-17)
- Taming Diffusion Transformer for Real-Time Mobile Video Generation (2025-07-17)
- LoViC: Efficient Long Video Generation with Context Compression (2025-07-17)
- $I^{2}$-World: Intra-Inter Tokenization for Efficient Dynamic 4D Scene Forecasting (2025-07-12)