UDE: A Unified Driving Engine for Human Motion Generation

Zixiang Zhou, Baoyuan Wang

2022-11-29 | CVPR 2023 | Quantization, Motion Generation, Motion Synthesis

Abstract

Generating controllable and editable human motion sequences is a key challenge in 3D avatar generation. Generating and animating human motion was labor-intensive for a long time, until learning-based approaches were recently developed and applied. However, these approaches are still task-specific or modality-specific\cite{ahuja2019language2pose}\cite{ghosh2021synthesis}\cite{ferreira2021learning}\cite{li2021ai}. In this paper, we propose ``UDE'', the first unified driving engine that generates human motion sequences from natural language or audio sequences (see Fig.~\ref{fig:teaser}). Specifically, UDE consists of the following key components: 1) a motion quantization module based on VQVAE that represents a continuous motion sequence as discrete latent codes\cite{van2017neural}, 2) a modality-agnostic transformer encoder\cite{vaswani2017attention} that learns to map modality-aware driving signals to a joint space, 3) a unified token transformer (GPT-like\cite{radford2019language}) network that predicts the quantized latent code indices in an auto-regressive manner, and 4) a diffusion motion decoder that takes the motion tokens as input and decodes them into motion sequences with high diversity. We evaluate our method on the HumanML3D\cite{Guo_2022_CVPR} and AIST++\cite{li2021learn} benchmarks, and the experimental results demonstrate that our method achieves state-of-the-art performance. Project website: \url{https://github.com/zixiangzhou916/UDE/}
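The quantization step in component 1 can be illustrated with a minimal sketch of the nearest-codebook lookup used in VQVAE-style models. This is not the authors' implementation: the function name `quantize`, the toy codebook, and the array shapes are assumptions chosen for illustration, and the encoder that would produce the latents is omitted.

```python
import numpy as np


def quantize(latents, codebook):
    """Map each latent vector to its nearest codebook entry (hypothetical helper).

    latents:  array of shape (T, D), one continuous latent per motion frame or chunk
    codebook: array of shape (K, D), the learned discrete code vectors
    Returns the code indices of shape (T,) and the quantized latents of shape (T, D).
    """
    # Squared Euclidean distance between every latent and every code: shape (T, K)
    diffs = latents[:, None, :] - codebook[None, :, :]
    dists = np.sum(diffs ** 2, axis=-1)
    # The discrete token for each latent is the index of the closest code
    indices = np.argmin(dists, axis=1)
    return indices, codebook[indices]


# Toy codebook with K=8 codes of dimension D=4 (illustrative values only)
codebook = np.arange(32, dtype=float).reshape(8, 4)
# Latents that lie close to codes 3, 1 and 5
latents = codebook[[3, 1, 5]] + 0.1
indices, quantized = quantize(latents, codebook)
```

In a pipeline like the one described above, the resulting index sequence is what an auto-regressive token model would predict, and a decoder would map the tokens back to continuous motion.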

Results

Task	Dataset	Metric	Value	Model
Motion Synthesis	AIST++	Beat alignment score	0.2311	UDE
Motion Synthesis	AIST++	FID	17.25	UDE
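For readers unfamiliar with the FID metric in the table, the following is a minimal sketch of the standard Fréchet distance between two sets of feature vectors, each modeled as a Gaussian. It assumes features have already been extracted; the feature extractor and the exact evaluation protocol used for the reported value are not specified here, and the function name `frechet_distance` is hypothetical.

```python
import numpy as np


def frechet_distance(feats_a, feats_b):
    """Fréchet distance between Gaussians fitted to two feature sets of shape (N, D)."""
    mu_a, mu_b = feats_a.mean(axis=0), feats_b.mean(axis=0)
    cov_a = np.cov(feats_a, rowvar=False)
    cov_b = np.cov(feats_b, rowvar=False)
    # tr(sqrt(cov_a @ cov_b)) equals the sum of the square roots of the
    # eigenvalues of the product, which are real and non-negative for
    # positive semi-definite covariances (clipped to guard against round-off).
    eigvals = np.linalg.eigvals(cov_a @ cov_b)
    tr_sqrt = np.sum(np.sqrt(np.clip(eigvals.real, 0.0, None)))
    mean_term = np.sum((mu_a - mu_b) ** 2)
    return float(mean_term + np.trace(cov_a) + np.trace(cov_b) - 2.0 * tr_sqrt)


# Synthetic features for illustration only (not data from the paper)
rng = np.random.default_rng(0)
real = rng.normal(size=(500, 6))
same = frechet_distance(real, real)
shifted = frechet_distance(real, real + 2.0)
```

Identical feature sets give a distance of approximately zero, and shifting every feature by a constant adds the squared norm of the shift, so lower values indicate generated motion whose feature statistics are closer to the reference data.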