Papers With Code 2

A community resource for machine learning research: papers, code, benchmarks, and state-of-the-art results.


Data sourced from the PWC Archive (CC-BY-SA 4.0). Built by the community, for the community.


InterGen: Diffusion-based Multi-human Motion Generation under Complex Interactions

Han Liang, Wenqian Zhang, Wenxuan Li, Jingyi Yu, Lan Xu

2023-04-12 · Denoising · Motion Generation · Motion Synthesis
Paper · PDF · Code (official)

Abstract

Diffusion models have recently driven tremendous progress in generating realistic human motions, yet they largely disregard multi-human interactions. In this paper, we present InterGen, an effective diffusion-based approach that incorporates human-to-human interactions into the motion diffusion process, enabling non-expert users to customize high-quality two-person interaction motions with only text guidance. We first contribute a multimodal dataset, named InterHuman. It consists of about 107M frames of diverse two-person interactions, with accurate skeletal motions and 23,337 natural language descriptions. On the algorithm side, we carefully tailor the motion diffusion model to our two-person interaction setting. To handle the symmetry of human identities during interactions, we propose two cooperative transformer-based denoisers that explicitly share weights, with a mutual attention mechanism that further connects the two denoising processes. We then propose a novel representation for motion input in our interaction diffusion model, which explicitly formulates the global relations between the two performers in the world frame. We further introduce two novel regularization terms to encode spatial relations, equipped with a corresponding damping scheme during the training of our interaction diffusion model. Extensive experiments validate the effectiveness and generalizability of InterGen. Notably, it generates more diverse and compelling two-person motions than previous methods and enables various downstream applications for human interactions.
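The cooperative-denoiser idea in the abstract — two denoisers with explicitly shared weights, coupled by mutual attention so each person's denoising attends to the other's motion — can be sketched as follows. This is a minimal illustrative sketch, not the official InterGen code; all layer sizes, shapes, and the single-head attention form are assumptions.

```python
# Minimal sketch (assumption, not the official InterGen implementation) of
# shared-weight cooperative denoisers connected by mutual attention.
import numpy as np

rng = np.random.default_rng(0)

def softmax(x, axis=-1):
    e = np.exp(x - x.max(axis=axis, keepdims=True))
    return e / e.sum(axis=axis, keepdims=True)

class SharedDenoiser:
    """One set of weights applied to both performers (identity symmetry)."""
    def __init__(self, dim):
        s = 1.0 / np.sqrt(dim)
        self.wq = rng.normal(0, s, (dim, dim))
        self.wk = rng.normal(0, s, (dim, dim))
        self.wv = rng.normal(0, s, (dim, dim))
        self.out = rng.normal(0, s, (dim, dim))

    def mutual_attention(self, x_self, x_other):
        # Queries come from one performer, keys/values from the other,
        # so the two denoising processes are explicitly connected.
        q = x_self @ self.wq
        k = x_other @ self.wk
        v = x_other @ self.wv
        attn = softmax(q @ k.T / np.sqrt(q.shape[-1]))
        return x_self + attn @ v @ self.out  # residual update

T, D = 16, 32                    # frames, per-frame feature dim (illustrative)
xa = rng.normal(size=(T, D))     # noisy motion features, person A
xb = rng.normal(size=(T, D))     # noisy motion features, person B

denoiser = SharedDenoiser(D)     # the same weights serve both performers
ya = denoiser.mutual_attention(xa, xb)
yb = denoiser.mutual_attention(xb, xa)
print(ya.shape, yb.shape)        # (16, 32) (16, 32)
```

Because the two calls use the same `SharedDenoiser` instance, swapping the two performers swaps the outputs correspondingly, which is the symmetry the paper's design targets.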

Results

Task | Dataset | Metric | Value | Model
Motion Synthesis | Inter-X | FID | 5.207 | InterGen
Motion Synthesis | Inter-X | MMDist | 9.58 | InterGen
Motion Synthesis | Inter-X | MModality | 3.686 | InterGen
Motion Synthesis | Inter-X | R-Precision Top 3 | 0.429 | InterGen
Motion Synthesis | InterHuman | FID | 5.918 | InterGen
Motion Synthesis | InterHuman | MMDist | 5.108 | InterGen
Motion Synthesis | InterHuman | MModality | 2.141 | InterGen
Motion Synthesis | InterHuman | R-Precision Top 3 | 0.624 | InterGen
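The R-Precision Top-3 metric in the table can be sketched under the standard text-to-motion evaluation protocol (an assumption about the benchmark, not taken from this page): for each motion, rank a candidate pool of 32 captions (the true one plus 31 mismatched ones) by feature distance, and count a hit if the true caption lands in the top 3.

```python
# Hedged sketch of R-Precision Top-3 as commonly defined for text-to-motion
# benchmarks (assumed protocol: pool of 32 = 1 true caption + 31 negatives).
import numpy as np

def r_precision_top3(motion_feats, text_feats, pool_size=32, seed=0):
    rng = np.random.default_rng(seed)
    n = len(motion_feats)
    hits = 0
    for i in range(n):
        # Candidate pool: the ground-truth caption plus random negatives.
        negatives = rng.choice([j for j in range(n) if j != i],
                               size=pool_size - 1, replace=False)
        cand = np.concatenate(([i], negatives))
        dists = np.linalg.norm(text_feats[cand] - motion_feats[i], axis=1)
        if 0 in np.argsort(dists)[:3]:   # position 0 is the true caption
            hits += 1
    return hits / n

# Toy paired features: each motion is a slightly noised copy of its caption
# feature, so the metric should be near 1.0 on this easy synthetic data.
rng = np.random.default_rng(1)
text = rng.normal(size=(64, 8))
motion = text + 0.01 * rng.normal(size=(64, 8))
score = r_precision_top3(motion, text)
print(score)
```

Higher is better; FID and MMDist, by contrast, are distances where lower is better, and MModality rewards diversity among generations for the same caption.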

Related Papers

fastWDM3D: Fast and Accurate 3D Healthy Tissue Inpainting (2025-07-17)
Diffuman4D: 4D Consistent Human View Synthesis from Sparse-View Videos with Spatio-Temporal Diffusion Models (2025-07-17)
Similarity-Guided Diffusion for Contrastive Sequential Recommendation (2025-07-16)
HUG-VAS: A Hierarchical NURBS-Based Generative Model for Aortic Geometry Synthesis and Controllable Editing (2025-07-15)
AirLLM: Diffusion Policy-based Adaptive LoRA for Remote Fine-Tuning of LLM over the Air (2025-07-15)
SnapMoGen: Human Motion Generation from Expressive Texts (2025-07-12)
A statistical physics framework for optimal learning (2025-07-10)
Go to Zero: Towards Zero-shot Motion Generation with Million-scale Data (2025-07-09)