Human Motion Diffusion as a Generative Prior

Yonatan Shafir, Guy Tevet, Roy Kapon, Amit H. Bermano
Recent work has demonstrated the significant potential of denoising diffusion models for generating human motion, including text-to-motion capabilities. However, these methods are restricted by the paucity of annotated motion data, a focus on single-person motions, and a lack of detailed control. In this paper, we introduce three forms of composition based on diffusion priors: sequential, parallel, and model composition. Using sequential composition, we tackle the challenge of long sequence generation. We introduce DoubleTake, an inference-time method with which we generate long animations consisting of sequences of prompted intervals and their transitions, using a prior trained only for short clips. Using parallel composition, we show promising steps toward two-person generation. Beginning with two fixed priors as well as a few two-person training examples, we learn a slim communication block, ComMDM, to coordinate interaction between the two resulting motions. Lastly, using model composition, we first train individual priors to complete motions that realize a prescribed motion for a given joint. We then introduce DiffusionBlending, an interpolation mechanism to effectively blend several such models to enable flexible and efficient fine-grained joint and trajectory-level control and editing. We evaluate the composition methods using an off-the-shelf motion diffusion model, and further compare the results to dedicated models trained for these specific tasks.
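To make the sequential-composition idea concrete, here is a minimal toy sketch of stitching short motion clips into one long sequence by cross-fading consecutive clips over a shared overlap window. This is a hypothetical simplification for illustration only: the actual DoubleTake method refines the transition frames with the diffusion prior itself rather than with a plain linear blend, and the function name, shapes, and `overlap` parameter below are assumptions, not the paper's API.

```python
import numpy as np

def blend_transitions(segments, overlap):
    """Toy sketch of sequential composition: concatenate short motion
    clips (each an array of shape (frames, features)) into one long
    sequence, linearly cross-fading each pair of consecutive clips
    over `overlap` frames.

    NOTE: a hypothetical simplification of DoubleTake, which instead
    re-denoises the transition with the trained diffusion prior.
    """
    out = segments[0]
    # fade-in weights 0 -> 1 across the overlap window, broadcast over features
    w = np.linspace(0.0, 1.0, overlap)[:, None]
    for seg in segments[1:]:
        tail = out[-overlap:]   # last frames of the sequence so far
        head = seg[:overlap]    # first frames of the next clip
        cross = (1.0 - w) * tail + w * head
        out = np.concatenate([out[:-overlap], cross, seg[overlap:]], axis=0)
    return out

# Usage: two 10-frame clips with a 4-frame overlap yield a
# 10 + 10 - 4 = 16-frame sequence whose seam is a smooth cross-fade.
long_motion = blend_transitions([np.zeros((10, 3)), np.ones((10, 3))], overlap=4)
```

In the paper's setting, each `segments[i]` would itself be sampled from the short-clip motion prior under its own text prompt; only the seam handling changes between this toy blend and the inference-time DoubleTake procedure.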
| Task | Dataset | Metric | Value | Model |
|---|---|---|---|---|
| Motion Synthesis | Inter-X | FID | 29.266 | ComMDM |
| Motion Synthesis | Inter-X | MMDist | 6.87 | ComMDM |
| Motion Synthesis | Inter-X | MModality | 0.771 | ComMDM |
| Motion Synthesis | Inter-X | R-Precision Top3 | 0.236 | ComMDM |
| Motion Synthesis | InterHuman | FID | 7.069 | ComMDM |
| Motion Synthesis | InterHuman | MMDist | 6.212 | ComMDM |
| Motion Synthesis | InterHuman | MModality | 1.822 | ComMDM |
| Motion Synthesis | InterHuman | R-Precision Top3 | 0.466 | ComMDM |