TransFusion: A Practical and Effective Transformer-based Diffusion Model for 3D Human Motion Prediction

Sibo Tian, Minghui Zheng, Xiao Liang

2023-07-30

Human Pose Forecasting, Human motion prediction, motion prediction

Abstract

Predicting human motion plays a crucial role in ensuring safe and effective close human-robot collaboration in the intelligent remanufacturing systems of the future. Existing works fall into two groups: those that focus on accuracy and predict a single future motion, and those that generate diverse predictions from the observations. The former group fails to address the uncertainty and multi-modal nature of human motion, while the latter often produces motion sequences that deviate too far from the ground truth or are unrealistic given the observed motion history. To tackle these issues, we propose TransFusion, an innovative and practical diffusion-based model for 3D human motion prediction that generates samples that are more likely to happen while maintaining a certain level of diversity. Our model uses a Transformer as the backbone, with long skip connections between shallow and deep layers. Additionally, we employ the discrete cosine transform to model motion sequences in the frequency space, which improves performance. In contrast to prior diffusion-based models that rely on extra modules such as cross-attention and adaptive layer normalization to condition the prediction on past observed motion, we treat all inputs, including conditions, as tokens, which yields a more lightweight model than existing approaches. Extensive experimental studies are conducted on benchmark datasets to validate the effectiveness of our human motion prediction model.
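
To make the frequency-space idea in the abstract concrete, the following is a minimal sketch of applying a discrete cosine transform along the time axis of a motion sequence and keeping only the lowest-frequency coefficients. It is an illustration under stated assumptions, not the authors' implementation: the function names, the (frames, features) layout, and the number of retained coefficients are all hypothetical, and plain NumPy is used for self-containment.

```python
import numpy as np


def dct_basis(num_frames: int) -> np.ndarray:
    """Orthonormal DCT-II basis, shape (num_frames, num_frames)."""
    n = np.arange(num_frames)           # time index
    k = n.reshape(-1, 1)                # frequency index
    basis = np.sqrt(2.0 / num_frames) * np.cos(
        np.pi * (2 * n + 1) * k / (2 * num_frames)
    )
    basis[0] /= np.sqrt(2.0)            # rescale the DC row for orthonormality
    return basis


def to_frequency(motion: np.ndarray, num_coeffs: int) -> np.ndarray:
    """Project a (frames, features) motion onto the first num_coeffs DCT rows."""
    basis = dct_basis(motion.shape[0])
    return basis[:num_coeffs] @ motion


def to_time(coeffs: np.ndarray, num_frames: int) -> np.ndarray:
    """Map (num_coeffs, features) DCT coefficients back to (frames, features)."""
    basis = dct_basis(num_frames)
    return basis[: coeffs.shape[0]].T @ coeffs


# Hypothetical example: 100 frames, 16 joints x 3 coordinates (assumed sizes).
rng = np.random.default_rng(0)
motion = rng.standard_normal((100, 48))
coeffs = to_frequency(motion, num_coeffs=20)   # compact frequency representation
smooth = to_time(coeffs, num_frames=100)       # low-pass reconstruction
```

Keeping all coefficients makes the transform exactly invertible, while truncating to the low-frequency rows gives a smoother, lower-dimensional representation of the sequence.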

Results

Task	Dataset	Metric	Value	Model
Pose Estimation	AMASS	ADE	0.508	TransFusion
Pose Estimation	AMASS	APD	8.853	TransFusion
Pose Estimation	AMASS	FDE	0.568	TransFusion
Pose Estimation	Human3.6M	ADE	358	TransFusion
Pose Estimation	Human3.6M	APD	5975	TransFusion
Pose Estimation	Human3.6M	FDE	468	TransFusion
Pose Estimation	Human3.6M	MMADE	506	TransFusion
Pose Estimation	Human3.6M	MMFDE	539	TransFusion
Pose Estimation	HumanEva-I	ADE@2000ms	204	TransFusion
Pose Estimation	HumanEva-I	APD@2000ms	1031	TransFusion
Pose Estimation	HumanEva-I	FDE@2000ms	234	TransFusion
Pose Estimation	HumanEva-I	MMADE@2000ms	408	TransFusion
Pose Estimation	HumanEva-I	MMFDE@2000ms	427	TransFusion
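
The page does not define the metrics in the table. As a rough guide, the sketch below computes ADE, FDE, and APD the way they are commonly defined in the diverse motion prediction literature: ADE and FDE take the best of K sampled predictions, and APD averages the pairwise distance between samples. This is an assumption about the convention, not a statement of the exact evaluation protocol behind the numbers above; the function names and the array layout (K samples, T frames, D flattened joint coordinates) are hypothetical.

```python
import numpy as np


def ade(preds: np.ndarray, gt: np.ndarray) -> float:
    """Average displacement error: time-averaged L2 error of the best sample.

    preds: (K, T, D) sampled futures, gt: (T, D) ground-truth future.
    """
    dists = np.linalg.norm(preds - gt[None], axis=-1)   # (K, T)
    return float(dists.mean(axis=1).min())


def fde(preds: np.ndarray, gt: np.ndarray) -> float:
    """Final displacement error: last-frame L2 error of the best sample."""
    dists = np.linalg.norm(preds[:, -1] - gt[-1], axis=-1)  # (K,)
    return float(dists.min())


def apd(preds: np.ndarray) -> float:
    """Average pairwise distance between samples (requires K >= 2)."""
    k = preds.shape[0]
    flat = preds.reshape(k, -1)                          # (K, T*D)
    pairwise = np.linalg.norm(flat[:, None] - flat[None], axis=-1)  # (K, K)
    # The diagonal is zero, so summing everything counts only i != j pairs.
    return float(pairwise.sum() / (k * (k - 1)))
```

Under this convention, lower ADE and FDE indicate more accurate predictions, while higher APD indicates more diverse samples.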
