Papers With Code 2

A community resource for machine learning research: papers, code, benchmarks, and state-of-the-art results.

Data sourced from the PWC Archive (CC-BY-SA 4.0). Built by the community, for the community.

Human Motion Diffusion as a Generative Prior

Yonatan Shafir, Guy Tevet, Roy Kapon, Amit H. Bermano

2023-03-02 | Denoising | Motion Synthesis

Abstract

Recent work has demonstrated the significant potential of denoising diffusion models for generating human motion, including text-to-motion capabilities. However, these methods are restricted by the paucity of annotated motion data, a focus on single-person motions, and a lack of detailed control. In this paper, we introduce three forms of composition based on diffusion priors: sequential, parallel, and model composition. Using sequential composition, we tackle the challenge of long sequence generation. We introduce DoubleTake, an inference-time method with which we generate long animations consisting of sequences of prompted intervals and their transitions, using a prior trained only for short clips. Using parallel composition, we show promising steps toward two-person generation. Beginning with two fixed priors as well as a few two-person training examples, we learn a slim communication block, ComMDM, to coordinate interaction between the two resulting motions. Lastly, using model composition, we first train individual priors to complete motions that realize a prescribed motion for a given joint. We then introduce DiffusionBlending, an interpolation mechanism to effectively blend several such models to enable flexible and efficient fine-grained joint and trajectory-level control and editing. We evaluate the composition methods using an off-the-shelf motion diffusion model, and further compare the results to dedicated models trained for these specific tasks.
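
The abstract describes DiffusionBlending only as "an interpolation mechanism" for blending several fine-tuned priors. As a rough illustration of what blending two denoisers by interpolation could look like, here is a minimal sketch in plain Python. Everything in it is an assumption made for illustration: the names `blend_predictions` and `blended_denoiser`, the `scale` parameter, the list-of-floats motion representation, and the two toy models are hypothetical, and the exact blending rule used in the paper may differ.

```python
from typing import Callable, List

Motion = List[float]                      # stand-in for a real motion tensor (assumption)
Denoiser = Callable[[Motion, int], Motion]  # maps (noisy sample, timestep) to a prediction


def blend_predictions(pred_a: Motion, pred_b: Motion, scale: float) -> Motion:
    """Linearly interpolate between two predictions.

    scale = 0 returns pred_a, scale = 1 returns pred_b; values outside
    [0, 1] extrapolate along the same line.
    """
    if len(pred_a) != len(pred_b):
        raise ValueError("predictions must have the same length")
    return [a + scale * (b - a) for a, b in zip(pred_a, pred_b)]


def blended_denoiser(model_a: Denoiser, model_b: Denoiser, scale: float) -> Denoiser:
    """Wrap two denoisers so that each step returns their blended prediction."""
    def denoise(x_t: Motion, t: int) -> Motion:
        return blend_predictions(model_a(x_t, t), model_b(x_t, t), scale)
    return denoise


# Toy stand-ins for two fine-tuned priors (hypothetical, not real models).
def toy_model_a(x_t: Motion, t: int) -> Motion:
    return [0.5 * v for v in x_t]


def toy_model_b(x_t: Motion, t: int) -> Motion:
    return [v + 1.0 for v in x_t]


blended = blended_denoiser(toy_model_a, toy_model_b, scale=0.5)
result = blended([2.0, 4.0], 10)
print(result)
```

In a real sampler the blended function would be called once per denoising step in place of a single model, so the control signals of both priors influence every step.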

Results

Task                      Dataset     Metric            Value   Model
Pose Tracking             Inter-X     FID               29.266  ComMDM
Pose Tracking             Inter-X     MMDist            6.87    ComMDM
Pose Tracking             Inter-X     MModality         0.771   ComMDM
Pose Tracking             Inter-X     R-Precision Top3  0.236   ComMDM
Pose Tracking             InterHuman  FID               7.069   ComMDM
Pose Tracking             InterHuman  MMDist            6.212   ComMDM
Pose Tracking             InterHuman  MModality         1.822   ComMDM
Pose Tracking             InterHuman  R-Precision Top3  0.466   ComMDM
Motion Synthesis          Inter-X     FID               29.266  ComMDM
Motion Synthesis          Inter-X     MMDist            6.87    ComMDM
Motion Synthesis          Inter-X     MModality         0.771   ComMDM
Motion Synthesis          Inter-X     R-Precision Top3  0.236   ComMDM
Motion Synthesis          InterHuman  FID               7.069   ComMDM
Motion Synthesis          InterHuman  MMDist            6.212   ComMDM
Motion Synthesis          InterHuman  MModality         1.822   ComMDM
Motion Synthesis          InterHuman  R-Precision Top3  0.466   ComMDM
10-shot image generation  Inter-X     FID               29.266  ComMDM
10-shot image generation  Inter-X     MMDist            6.87    ComMDM
10-shot image generation  Inter-X     MModality         0.771   ComMDM
10-shot image generation  Inter-X     R-Precision Top3  0.236   ComMDM
10-shot image generation  InterHuman  FID               7.069   ComMDM
10-shot image generation  InterHuman  MMDist            6.212   ComMDM
10-shot image generation  InterHuman  MModality         1.822   ComMDM
10-shot image generation  InterHuman  R-Precision Top3  0.466   ComMDM
3D Human Pose Tracking    Inter-X     FID               29.266  ComMDM
3D Human Pose Tracking    Inter-X     MMDist            6.87    ComMDM
3D Human Pose Tracking    Inter-X     MModality         0.771   ComMDM
3D Human Pose Tracking    Inter-X     R-Precision Top3  0.236   ComMDM
3D Human Pose Tracking    InterHuman  FID               7.069   ComMDM
3D Human Pose Tracking    InterHuman  MMDist            6.212   ComMDM
3D Human Pose Tracking    InterHuman  MModality         1.822   ComMDM
3D Human Pose Tracking    InterHuman  R-Precision Top3  0.466   ComMDM
