KTPFormer: Kinematics and Trajectory Prior Knowledge-Enhanced Transformer for 3D Human Pose Estimation

Jihua Peng, Yanghong Zhou, P. Y. Mok

2024-03-31 | CVPR 2024 | 3D Human Pose Estimation, Monocular 3D Human Pose Estimation, Pose Estimation

Abstract

This paper presents the Kinematics and Trajectory Prior Knowledge-Enhanced Transformer (KTPFormer), which addresses a weakness of existing transformer-based methods for 3D human pose estimation: the Q, K, V vectors in their self-attention mechanisms are all derived by simple linear mapping. We propose two prior attention modules, Kinematics Prior Attention (KPA) and Trajectory Prior Attention (TPA), that exploit the known anatomical structure of the human body and motion trajectory information to facilitate effective learning of global dependencies and features in multi-head self-attention. KPA models kinematic relationships in the human body by constructing a kinematic topology, while TPA builds a trajectory topology to learn joint motion trajectories across frames. By yielding Q, K, V vectors that carry prior knowledge, the two modules enable KTPFormer to model spatial and temporal correlations simultaneously. Extensive experiments on three benchmarks (Human3.6M, MPI-INF-3DHP, and HumanEva) show that KTPFormer outperforms state-of-the-art methods. More importantly, the KPA and TPA modules have lightweight plug-and-play designs and can be integrated into various transformer-based networks (e.g., diffusion-based ones) to improve performance with only a very small increase in computational overhead. The code is available at: https://github.com/JihuaPeng/KTPFormer.
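
The idea of injecting a topology prior before the Q, K, V mapping can be illustrated with a minimal NumPy sketch. This is one plausible reading of the abstract, not the authors' implementation: the 17-joint parent list, the symmetric normalisation, the single attention head, the adjacent-frame chain used as a trajectory topology, and the names `build_topology` and `prior_attention` are all assumptions made for illustration.

```python
import numpy as np

# Assumed 17-joint Human3.6M-style skeleton: parent index per joint (-1 = root).
PARENTS = [-1, 0, 1, 2, 0, 4, 5, 0, 7, 8, 9, 8, 11, 12, 8, 14, 15]


def build_topology(parents):
    """Symmetrically normalised adjacency (with self-loops) of a tree."""
    n = len(parents)
    adj = np.eye(n)
    for child, parent in enumerate(parents):
        if parent >= 0:
            adj[child, parent] = adj[parent, child] = 1.0
    d_inv_sqrt = 1.0 / np.sqrt(adj.sum(axis=1))
    return adj * d_inv_sqrt[:, None] * d_inv_sqrt[None, :]


def softmax(x, axis=-1):
    x = x - x.max(axis=axis, keepdims=True)
    e = np.exp(x)
    return e / e.sum(axis=axis, keepdims=True)


def prior_attention(tokens, topology, w_q, w_k, w_v):
    """Single-head self-attention whose Q, K, V come from prior-mixed tokens."""
    primed = topology @ tokens  # mix each token with its topological neighbours
    q, k, v = primed @ w_q, primed @ w_k, primed @ w_v
    scores = q @ k.T / np.sqrt(q.shape[-1])
    return softmax(scores) @ v


rng = np.random.default_rng(0)
dim = 8
w_q, w_k, w_v = (rng.normal(size=(dim, dim)) for _ in range(3))

# Spatial case: one token per joint, prior taken from the skeleton.
spatial_topology = build_topology(PARENTS)
joint_tokens = rng.normal(size=(len(PARENTS), dim))
out = prior_attention(joint_tokens, spatial_topology, w_q, w_k, w_v)

# Temporal case (assumption): one token per frame, adjacent frames connected.
num_frames = 5
temporal_topology = build_topology([-1] + list(range(num_frames - 1)))
frame_tokens = rng.normal(size=(num_frames, dim))
temporal_out = prior_attention(frame_tokens, temporal_topology, w_q, w_k, w_v)
```

In this sketch the only change relative to plain self-attention is the `topology @ tokens` step, which is why such a module could be dropped into an existing transformer block at little extra cost. A learnable topology term, multiple heads, and the actual trajectory construction used by KTPFormer are left out.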

Results

Task	Dataset	Metric	Value	Model
3D Human Pose Estimation	MPI-INF-3DHP	AUC	85.9	KTPFormer
3D Human Pose Estimation	MPI-INF-3DHP	MPJPE	16.7	KTPFormer
3D Human Pose Estimation	MPI-INF-3DHP	PCK	98.9	KTPFormer
3D Human Pose Estimation	Human3.6M	Average MPJPE (mm)	33	KTPFormer (T=243)
3D Human Pose Estimation	Human3.6M	PA-MPJPE	26.2	KTPFormer (T=243)
3D Human Pose Estimation	Human3.6M	Average MPJPE (mm)	40.1	KTPFormer
3D Human Pose Estimation	Human3.6M	Frames Needed	243	KTPFormer
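
For readers unfamiliar with the metrics in the table, the following NumPy sketch shows how MPJPE, PA-MPJPE, and PCK are commonly computed for a single pose. It is an illustration under stated assumptions rather than the benchmarks' official evaluation code: poses are taken to be `(J, 3)` arrays in millimetres, the 150 mm PCK threshold is the value commonly used with MPI-INF-3DHP, and details such as root-centring and per-action averaging are omitted.

```python
import numpy as np


def mpjpe(pred, gt):
    """Mean per-joint position error: average Euclidean distance over joints."""
    return np.linalg.norm(pred - gt, axis=-1).mean()


def procrustes_align(pred, gt):
    """Align pred to gt with the best rotation, uniform scale and translation."""
    mu_p, mu_g = pred.mean(axis=0), gt.mean(axis=0)
    p, g = pred - mu_p, gt - mu_g
    u, s, vt = np.linalg.svd(p.T @ g)
    # Flip the last singular direction if needed so the result is a proper rotation.
    d = np.sign(np.linalg.det(u @ vt))
    rot = u @ np.diag([1.0, 1.0, d]) @ vt
    scale = (s * np.array([1.0, 1.0, d])).sum() / (p ** 2).sum()
    return scale * p @ rot + mu_g


def pa_mpjpe(pred, gt):
    """MPJPE after Procrustes alignment of the prediction to the ground truth."""
    return mpjpe(procrustes_align(pred, gt), gt)


def pck(pred, gt, threshold=150.0):
    """Fraction of joints whose error is within the threshold (assumed mm)."""
    return (np.linalg.norm(pred - gt, axis=-1) <= threshold).mean()


rng = np.random.default_rng(0)
gt = rng.normal(scale=300.0, size=(17, 3))

# A prediction that is off by a constant (3, 4, 0) mm offset at every joint.
shifted = gt + np.array([3.0, 4.0, 0.0])
shift_error = mpjpe(shifted, gt)

# A prediction that equals gt up to rotation, scale and translation.
angle = np.pi / 6
rot_z = np.array([[np.cos(angle), -np.sin(angle), 0.0],
                  [np.sin(angle), np.cos(angle), 0.0],
                  [0.0, 0.0, 1.0]])
similar = 2.0 * gt @ rot_z + np.array([100.0, -50.0, 20.0])
raw_error = mpjpe(similar, gt)
aligned_error = pa_mpjpe(similar, gt)
```

The second example shows why PA-MPJPE is never larger than the error a rigid misalignment alone would cause: a pose that is correct up to a similarity transform has a large MPJPE but a PA-MPJPE of essentially zero.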
