Data sourced from the PWC Archive (CC-BY-SA 4.0). Built by the community, for the community.

Fg-T2M: Fine-Grained Text-Driven Human Motion Generation via Diffusion Model

Yin Wang, Zhiying Leng, Frederick W. B. Li, Shun-Cheng Wu, Xiaohui Liang

Published: 2023-09-12 | Venue: ICCV 2023 | Tasks: Motion Generation, Motion Synthesis

Abstract

Text-driven human motion generation in computer vision is both significant and challenging. However, current methods are limited to producing either deterministic or imprecise motion sequences, failing to effectively control the temporal and spatial relationships required to conform to a given text description. In this work, we propose a fine-grained method for generating high-quality, conditional human motion sequences that follow precise text descriptions. Our approach consists of two key components: 1) a linguistics-structure assisted module that constructs accurate and complete language features to fully utilize text information; and 2) a context-aware progressive reasoning module that learns neighborhood and overall semantic linguistic features from shallow and deep graph neural networks to achieve multi-step inference. Experiments show that our approach outperforms text-driven motion generation methods on the HumanML3D and KIT test sets and generates motions that are visually better aligned with the text conditions.

Results

Task | Dataset | Metric | Value | Model
Pose Tracking | HumanML3D | Diversity | 9.278 | Fg-T2M
Pose Tracking | HumanML3D | FID | 0.243 | Fg-T2M
Pose Tracking | HumanML3D | Multimodality | 1.614 | Fg-T2M
Pose Tracking | HumanML3D | R Precision Top3 | 0.783 | Fg-T2M
Pose Tracking | KIT Motion-Language | Diversity | 10.93 | Fg-T2M
Pose Tracking | KIT Motion-Language | FID | 0.571 | Fg-T2M
Pose Tracking | KIT Motion-Language | Multimodality | 1.019 | Fg-T2M
Pose Tracking | KIT Motion-Language | R Precision Top3 | 0.745 | Fg-T2M
Motion Synthesis | HumanML3D | Diversity | 9.278 | Fg-T2M
Motion Synthesis | HumanML3D | FID | 0.243 | Fg-T2M
Motion Synthesis | HumanML3D | Multimodality | 1.614 | Fg-T2M
Motion Synthesis | HumanML3D | R Precision Top3 | 0.783 | Fg-T2M
Motion Synthesis | KIT Motion-Language | Diversity | 10.93 | Fg-T2M
Motion Synthesis | KIT Motion-Language | FID | 0.571 | Fg-T2M
Motion Synthesis | KIT Motion-Language | Multimodality | 1.019 | Fg-T2M
Motion Synthesis | KIT Motion-Language | R Precision Top3 | 0.745 | Fg-T2M
10-shot image generation | HumanML3D | Diversity | 9.278 | Fg-T2M
10-shot image generation | HumanML3D | FID | 0.243 | Fg-T2M
10-shot image generation | HumanML3D | Multimodality | 1.614 | Fg-T2M
10-shot image generation | HumanML3D | R Precision Top3 | 0.783 | Fg-T2M
10-shot image generation | KIT Motion-Language | Diversity | 10.93 | Fg-T2M
10-shot image generation | KIT Motion-Language | FID | 0.571 | Fg-T2M
10-shot image generation | KIT Motion-Language | Multimodality | 1.019 | Fg-T2M
10-shot image generation | KIT Motion-Language | R Precision Top3 | 0.745 | Fg-T2M
3D Human Pose Tracking | HumanML3D | Diversity | 9.278 | Fg-T2M
3D Human Pose Tracking | HumanML3D | FID | 0.243 | Fg-T2M
3D Human Pose Tracking | HumanML3D | Multimodality | 1.614 | Fg-T2M
3D Human Pose Tracking | HumanML3D | R Precision Top3 | 0.783 | Fg-T2M
3D Human Pose Tracking | KIT Motion-Language | Diversity | 10.93 | Fg-T2M
3D Human Pose Tracking | KIT Motion-Language | FID | 0.571 | Fg-T2M
3D Human Pose Tracking | KIT Motion-Language | Multimodality | 1.019 | Fg-T2M
3D Human Pose Tracking | KIT Motion-Language | R Precision Top3 | 0.745 | Fg-T2M
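
The page does not define the R Precision Top3 metric reported above. One common reading in text-to-motion evaluation is a retrieval score: embed each text and each motion, rank motions by distance to a text, and count a hit when the matched motion is among the three nearest. The sketch below illustrates that idea only. The function name `r_precision_top_k`, the use of Euclidean distance, and ranking against all n motions are assumptions made for illustration; published protocols typically rank within small candidate pools and rely on pretrained text and motion encoders, which are not reproduced here.

```python
import numpy as np


def r_precision_top_k(text_emb, motion_emb, k=3):
    """Fraction of texts whose matched motion is among the k nearest motions.

    text_emb, motion_emb: arrays of shape (n, d); row i of each is a matched pair.
    The embeddings are assumed to come from some external encoder (hypothetical here).
    """
    text_emb = np.asarray(text_emb, dtype=float)
    motion_emb = np.asarray(motion_emb, dtype=float)

    # Pairwise Euclidean distances between every text and every motion: shape (n, n).
    diff = text_emb[:, None, :] - motion_emb[None, :, :]
    dist = np.linalg.norm(diff, axis=-1)

    # For each text, the indices of its k closest motions.
    top_k = np.argsort(dist, axis=1)[:, :k]

    # A hit means the matched motion (same row index) appears among those k.
    hits = (top_k == np.arange(len(text_emb))[:, None]).any(axis=1)
    return float(hits.mean())
```

Under this reading, a score of 0.783 would mean that for roughly 78% of the test descriptions the matched motion ranks in the top three candidates; the exact candidate pool and encoders behind the reported numbers are not stated on this page.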

Related Papers

SnapMoGen: Human Motion Generation from Expressive Texts (2025-07-12)
Go to Zero: Towards Zero-shot Motion Generation with Million-scale Data (2025-07-09)
Motion Generation: A Survey of Generative Approaches and Benchmarks (2025-07-07)
DeepGesture: A conversational gesture synthesis system based on emotions and semantics (2025-07-03)
A Unified Transformer-Based Framework with Pretraining For Whole Body Grasping Motion Generation (2025-07-01)
VolumetricSMPL: A Neural Volumetric Body Model for Efficient Interactions, Contacts, and Collisions (2025-06-29)
DuetGen: Music Driven Two-Person Dance Generation via Hierarchical Masked Modeling (2025-06-23)
PlanMoGPT: Flow-Enhanced Progressive Planning for Text to Motion Synthesis (2025-06-22)