Papers With Code 2

A community resource for machine learning research: papers, code, benchmarks, and state-of-the-art results.

Data sourced from the PWC Archive (CC-BY-SA 4.0). Built by the community, for the community.

MCM: Multi-condition Motion Synthesis Framework

Zeyu Ling, Bo Han, Yongkang Wong, Han Lin, Mohan Kankanhalli, Weidong Geng

2024-04-19 · Motion Synthesis
Paper · PDF · Code

Abstract

Conditional human motion synthesis (HMS) aims to generate human motion sequences that conform to specific conditions. Text and audio are the two predominant modalities used as HMS control conditions. While existing research has focused primarily on single conditions, multi-condition human motion synthesis remains underexplored. In this study, we propose a multi-condition HMS framework, termed MCM, based on a dual-branch structure composed of a main branch and a control branch. This framework extends the applicability of a diffusion model initially conditioned solely on text to auditory conditions, covering both music-to-dance and co-speech HMS while preserving the motion quality and semantic-association capabilities of the original model. Furthermore, we implement a Transformer-based diffusion model, designated MWNet, as the main branch. Through its multi-wise self-attention modules, MWNet captures the spatial intricacies and inter-joint correlations inherent in motion sequences. Extensive experiments show that our method achieves competitive results in both single-condition and multi-condition HMS tasks.
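The dual-branch structure described in the abstract (a pretrained text-conditioned main branch plus a control branch that injects auditory conditions) is reminiscent of ControlNet-style conditioning. Below is a minimal NumPy sketch of that general idea, assuming a zero-initialized injection so the combined model initially behaves exactly like the main branch; the class, layer shapes, and names are illustrative assumptions, not the paper's actual architecture.

```python
import numpy as np

rng = np.random.default_rng(0)

def layer(dim):
    # small random weights, for illustration only
    return rng.normal(scale=0.1, size=(dim, dim))

class DualBranchSketch:
    # Hypothetical ControlNet-style dual branch: the main branch stands in
    # for the pretrained text-conditioned model; the control branch starts
    # as a copy of it and feeds its features back through zero-initialized
    # projections, so at initialization the combined model reproduces the
    # main branch exactly.
    def __init__(self, dim=16, n_layers=3):
        self.main = [layer(dim) for _ in range(n_layers)]
        self.ctrl = [w.copy() for w in self.main]             # copied from main
        self.zero = [np.zeros((dim, dim)) for _ in range(n_layers)]

    def forward(self, x, control):
        h, c = x, control
        for wm, wc, wz in zip(self.main, self.ctrl, self.zero):
            c = np.tanh(c @ wc)            # control-branch step
            h = np.tanh(h @ wm) + c @ wz   # zero projection: no effect at init
        return h

model = DualBranchSketch()
x = rng.normal(size=(4, 16))       # stand-in for noisy motion latents
audio = rng.normal(size=(4, 16))   # stand-in for audio condition features
out = model.forward(x, audio)
```

The zero-initialized projections are the key design choice: training can gradually blend in the audio condition without degrading the main branch's text-conditioned generation at the start.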

Results

Task             | Dataset   | Metric            | Value  | Model
Motion Synthesis | HumanML3D | Diversity         | 9.585  | MCM
Motion Synthesis | HumanML3D | FID               | 0.053  | MCM
Motion Synthesis | HumanML3D | Multimodality     | 0.8104 | MCM
Motion Synthesis | HumanML3D | R Precision Top 3 | 0.788  | MCM
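For context on the metrics above: Diversity and R Precision Top 3 are commonly computed from learned motion and text features. The NumPy sketch below follows the usual definitions on toy features; the feature extractors, pool size, and variable names are assumptions for illustration, not the paper's evaluation code.

```python
import numpy as np

rng = np.random.default_rng(1)

def diversity(feats, n_pairs=200):
    # Diversity: mean Euclidean distance between randomly paired
    # generated-motion features.
    i = rng.integers(0, len(feats), size=n_pairs)
    j = rng.integers(0, len(feats), size=n_pairs)
    return float(np.linalg.norm(feats[i] - feats[j], axis=1).mean())

def r_precision_top3(motion_feats, text_feats, pool=32):
    # R Precision Top 3: rank the ground-truth text feature against
    # pool-1 mismatched texts by distance to the motion feature; a sample
    # counts as a hit if the true text lands among the 3 closest.
    n = len(motion_feats)
    hits = 0
    for k in range(n):
        negs = rng.choice(np.delete(np.arange(n), k), size=pool - 1, replace=False)
        cand = np.concatenate(([k], negs))
        d = np.linalg.norm(text_feats[cand] - motion_feats[k], axis=1)
        hits += int(0 in np.argsort(d)[:3])  # candidate 0 is the true pair
    return hits / n

# toy features: each paired text feature sits near its motion feature
motion = rng.normal(size=(64, 8))
text = motion + 0.05 * rng.normal(size=(64, 8))
div = diversity(motion)
rp = r_precision_top3(motion, text)
```

With well-aligned features, R Precision approaches 1.0; FID and Multimodality follow similar feature-space definitions but are omitted here for brevity.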

Related Papers

DeepGesture: A conversational gesture synthesis system based on emotions and semantics (2025-07-03)
VolumetricSMPL: A Neural Volumetric Body Model for Efficient Interactions, Contacts, and Collisions (2025-06-29)
DuetGen: Music Driven Two-Person Dance Generation via Hierarchical Masked Modeling (2025-06-23)
PlanMoGPT: Flow-Enhanced Progressive Planning for Text to Motion Synthesis (2025-06-22)
Motion-R1: Chain-of-Thought Reasoning and Reinforcement Learning for Human Motion Generation (2025-06-12)
DanceChat: Large Language Model-Guided Music-to-Dance Generation (2025-06-12)
MotionRAG-Diff: A Retrieval-Augmented Diffusion Framework for Long-Term Music-to-Dance Generation (2025-06-03)
MotionPro: A Precise Motion Controller for Image-to-Video Generation (2025-05-26)