Papers With Code 2

A community resource for machine learning research: papers, code, benchmarks, and state-of-the-art results.

Data sourced from the PWC Archive (CC-BY-SA 4.0). Built by the community, for the community.

MCM: Multi-condition Motion Synthesis Framework

Zeyu Ling, Bo Han, Yongkang Wong, Han Lin, Mohan Kankanhalli, Weidong Geng

2024-04-19 · Motion Synthesis
Paper · PDF · Code

Abstract

Conditional human motion synthesis (HMS) aims to generate human motion sequences that conform to specific conditions. Text and audio are the two predominant modalities used as HMS control conditions. While existing research has focused primarily on single conditions, multi-condition human motion synthesis remains underexplored. In this study, we propose a multi-condition HMS framework, termed MCM, based on a dual-branch structure composed of a main branch and a control branch. This framework extends the applicability of a diffusion model initially conditioned solely on text to auditory conditions, covering both music-to-dance and co-speech HMS while preserving the motion quality and semantic-association capabilities of the original model. Furthermore, we implement a Transformer-based diffusion model, designated MWNet, as the main branch. Through its multi-wise self-attention modules, MWNet captures the spatial intricacies and inter-joint correlations inherent in motion sequences. Extensive experiments show that our method achieves competitive results in both single-condition and multi-condition HMS tasks.
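The dual-branch structure described in the abstract (a pretrained text-conditioned main branch plus a control branch that injects auditory conditions) is reminiscent of ControlNet-style conditioning. Below is a minimal NumPy sketch of that general idea, assuming a zero-initialized injection so the combined model initially behaves exactly like the main branch; the class, layer shapes, and names are illustrative assumptions, not the paper's actual architecture.

```python
import numpy as np

rng = np.random.default_rng(0)

def layer(dim):
    # small random weights, for illustration only
    return rng.normal(scale=0.1, size=(dim, dim))

class DualBranchSketch:
    # Hypothetical ControlNet-style dual branch: the main branch stands in
    # for the pretrained text-conditioned model; the control branch starts
    # as a copy of it and feeds its features back through zero-initialized
    # projections, so at initialization the combined model reproduces the
    # main branch exactly.
    def __init__(self, dim=16, n_layers=3):
        self.main = [layer(dim) for _ in range(n_layers)]
        self.ctrl = [w.copy() for w in self.main]             # copied from main
        self.zero = [np.zeros((dim, dim)) for _ in range(n_layers)]

    def forward(self, x, control):
        h, c = x, control
        for wm, wc, wz in zip(self.main, self.ctrl, self.zero):
            c = np.tanh(c @ wc)            # control-branch step
            h = np.tanh(h @ wm) + c @ wz   # zero projection: no effect at init
        return h

model = DualBranchSketch()
x = rng.normal(size=(4, 16))       # stand-in for noisy motion latents
audio = rng.normal(size=(4, 16))   # stand-in for audio condition features
out = model.forward(x, audio)
```

The zero-initialized projections are the key design choice: training can gradually blend in the audio condition without degrading the main branch's text-conditioned generation at the start.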

Results

Task             | Dataset   | Metric            | Value  | Model
Motion Synthesis | HumanML3D | Diversity         | 9.585  | MCM
Motion Synthesis | HumanML3D | FID               | 0.053  | MCM
Motion Synthesis | HumanML3D | Multimodality     | 0.8104 | MCM
Motion Synthesis | HumanML3D | R Precision Top 3 | 0.788  | MCM
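For context on the metrics above: Diversity and R Precision Top 3 are commonly computed from learned motion and text features. The NumPy sketch below follows the usual definitions on toy features; the feature extractors, pool size, and variable names are assumptions for illustration, not the paper's evaluation code.

```python
import numpy as np

rng = np.random.default_rng(1)

def diversity(feats, n_pairs=200):
    # Diversity: mean Euclidean distance between randomly paired
    # generated-motion features.
    i = rng.integers(0, len(feats), size=n_pairs)
    j = rng.integers(0, len(feats), size=n_pairs)
    return float(np.linalg.norm(feats[i] - feats[j], axis=1).mean())

def r_precision_top3(motion_feats, text_feats, pool=32):
    # R Precision Top 3: rank the ground-truth text feature against
    # pool-1 mismatched texts by distance to the motion feature; a sample
    # counts as a hit if the true text lands among the 3 closest.
    n = len(motion_feats)
    hits = 0
    for k in range(n):
        negs = rng.choice(np.delete(np.arange(n), k), size=pool - 1, replace=False)
        cand = np.concatenate(([k], negs))
        d = np.linalg.norm(text_feats[cand] - motion_feats[k], axis=1)
        hits += int(0 in np.argsort(d)[:3])  # candidate 0 is the true pair
    return hits / n

# toy features: each paired text feature sits near its motion feature
motion = rng.normal(size=(64, 8))
text = motion + 0.05 * rng.normal(size=(64, 8))
div = diversity(motion)
rp = r_precision_top3(motion, text)
```

With well-aligned features, R Precision approaches 1.0; FID and Multimodality follow similar feature-space definitions but are omitted here for brevity.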

Related Papers

DeepGesture: A conversational gesture synthesis system based on emotions and semantics (2025-07-03)
VolumetricSMPL: A Neural Volumetric Body Model for Efficient Interactions, Contacts, and Collisions (2025-06-29)
DuetGen: Music Driven Two-Person Dance Generation via Hierarchical Masked Modeling (2025-06-23)
PlanMoGPT: Flow-Enhanced Progressive Planning for Text to Motion Synthesis (2025-06-22)
Motion-R1: Chain-of-Thought Reasoning and Reinforcement Learning for Human Motion Generation (2025-06-12)
DanceChat: Large Language Model-Guided Music-to-Dance Generation (2025-06-12)
MotionRAG-Diff: A Retrieval-Augmented Diffusion Framework for Long-Term Music-to-Dance Generation (2025-06-03)
MotionPro: A Precise Motion Controller for Image-to-Video Generation (2025-05-26)