Papers With Code 2

A community resource for machine learning research: papers, code, benchmarks, and state-of-the-art results.


Data sourced from the PWC Archive (CC-BY-SA 4.0). Built by the community, for the community.


EMDM: Efficient Motion Diffusion Model for Fast and High-Quality Motion Generation

Wenyang Zhou, Zhiyang Dou, Zeyu Cao, Zhouyingcheng Liao, Jingbo Wang, Wenjia Wang, Yuan Liu, Taku Komura, Wenping Wang, Lingjie Liu

2023-12-04 · Denoising · Human Dynamics · Motion Generation · Motion Synthesis
Paper · PDF · Code (official)

Abstract

We introduce Efficient Motion Diffusion Model (EMDM) for fast and high-quality human motion generation. Current state-of-the-art generative diffusion models have produced impressive results but struggle to achieve fast generation without sacrificing quality. On the one hand, previous works, like motion latent diffusion, conduct diffusion within a latent space for efficiency, but learning such a latent space can be a non-trivial effort. On the other hand, accelerating generation by naively increasing the sampling step size, e.g., DDIM, often leads to quality degradation as it fails to approximate the complex denoising distribution. To address these issues, we propose EMDM, which captures the complex distribution during multiple sampling steps in the diffusion model, allowing for much fewer sampling steps and significant acceleration in generation. This is achieved by a conditional denoising diffusion GAN to capture multimodal data distributions among arbitrary (and potentially larger) step sizes conditioned on control signals, enabling fewer-step motion sampling with high fidelity and diversity. To minimize undesired motion artifacts, geometric losses are imposed during network learning. As a result, EMDM achieves real-time motion generation and significantly improves the efficiency of motion diffusion models compared to existing methods while achieving high-quality motion generation. Our code will be publicly available upon publication.
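The core mechanism described above — a conditional GAN generator that predicts the clean sample directly, so each reverse-diffusion step can cover a large stride — can be illustrated with a toy sketch. This is not the authors' implementation: the generator below is an untrained stand-in, the feature dimension, step counts, and noise schedule are illustrative assumptions, and the posterior update is a simple DDIM-style step using the predicted clean sample.

```python
import numpy as np

rng = np.random.default_rng(0)

K = 5    # few sampling steps, vs. ~1000 in vanilla DDPM
T = 1000 # underlying diffusion length
D = 24   # toy motion feature dimension (assumption)

# Toy linear beta schedule; alpha_bar[t] is the cumulative signal fraction.
betas = np.linspace(1e-4, 0.02, T)
alpha_bar = np.cumprod(1.0 - betas)

def generator(x_t, t, cond):
    """Stand-in for the learned conditional GAN generator.

    A trained generator would map a noisy motion x_t at step t, plus a
    control signal `cond` (e.g. a text embedding), to a clean-motion
    estimate, capturing the multimodal denoising distribution across a
    large step size. Here it just shrinks toward the condition so the
    sketch runs end to end.
    """
    return 0.5 * cond + 0.1 * x_t

def posterior_step(x0_hat, x_t, t, s):
    """Jump from step t to a much earlier step s using the predicted x0.

    DDIM-style deterministic update: recover the implied noise from
    (x_t, x0_hat), then re-noise to level s.
    """
    ab_t, ab_s = alpha_bar[t], alpha_bar[s]
    eps_hat = (x_t - np.sqrt(ab_t) * x0_hat) / np.sqrt(1.0 - ab_t)
    return np.sqrt(ab_s) * x0_hat + np.sqrt(1.0 - ab_s) * eps_hat

cond = rng.standard_normal(D)  # fake control signal (assumption)
x = rng.standard_normal(D)     # x_T ~ N(0, I)

# Only K strides from t = T-1 down to t = 0.
steps = np.linspace(T - 1, 0, K + 1).astype(int)
for t, s in zip(steps[:-1], steps[1:]):
    x0_hat = generator(x, t, cond)
    x = posterior_step(x0_hat, x, t, s)

print(x.shape)  # one generated motion feature vector
```

The point of the sketch is the loop length: because the generator models the full (multimodal) denoising distribution over a large stride rather than a single Gaussian step, K can be small without the quality collapse that naive large-step DDIM sampling suffers.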

Results

Task | Dataset | Metric | Value | Model
Motion Synthesis | HumanML3D | Diversity | 9.551 | EMDM
Motion Synthesis | HumanML3D | FID | 0.112 | EMDM
Motion Synthesis | HumanML3D | Multimodality | 1.641 | EMDM
Motion Synthesis | HumanML3D | R-Precision (Top-3) | 0.786 | EMDM
Motion Synthesis | KIT Motion-Language | Diversity | 10.96 | EMDM
Motion Synthesis | KIT Motion-Language | FID | 0.261 | EMDM
Motion Synthesis | KIT Motion-Language | Multimodality | 1.343 | EMDM
Motion Synthesis | KIT Motion-Language | R-Precision (Top-3) | 0.78 | EMDM

Related Papers

fastWDM3D: Fast and Accurate 3D Healthy Tissue Inpainting (2025-07-17)
Diffuman4D: 4D Consistent Human View Synthesis from Sparse-View Videos with Spatio-Temporal Diffusion Models (2025-07-17)
Similarity-Guided Diffusion for Contrastive Sequential Recommendation (2025-07-16)
HUG-VAS: A Hierarchical NURBS-Based Generative Model for Aortic Geometry Synthesis and Controllable Editing (2025-07-15)
AirLLM: Diffusion Policy-based Adaptive LoRA for Remote Fine-Tuning of LLM over the Air (2025-07-15)
SnapMoGen: Human Motion Generation from Expressive Texts (2025-07-12)
A statistical physics framework for optimal learning (2025-07-10)
Go to Zero: Towards Zero-shot Motion Generation with Million-scale Data (2025-07-09)