Papers With Code 2

A community resource for machine learning research: papers, code, benchmarks, and state-of-the-art results.


Data sourced from the PWC Archive (CC-BY-SA 4.0). Built by the community, for the community.


TCPFormer: Learning Temporal Correlation with Implicit Pose Proxy for 3D Human Pose Estimation

Jiajie Liu, Mengyuan Liu, Hong Liu, Wenhao Li

2025-01-03 · Tasks: 3D Human Pose Estimation · Monocular 3D Human Pose Estimation · Pose Estimation
Paper · PDF · Code (official)

Abstract

Recent multi-frame lifting methods have dominated 3D human pose estimation. However, previous methods ignore the intricate dependencies within the 2D pose sequence and learn only a single temporal correlation. To alleviate this limitation, we propose TCPFormer, which leverages an implicit pose proxy as an intermediate representation. Each proxy within the implicit pose proxy can build one temporal correlation, thereby helping the model learn a more comprehensive temporal correlation of human motion. Specifically, our method consists of three key components: the Proxy Update Module (PUM), the Proxy Invocation Module (PIM), and the Proxy Attention Module (PAM). PUM first uses pose features to update the implicit pose proxy, enabling it to store representative information from the pose sequence. PIM then invokes and integrates the pose proxy with the pose sequence to enhance the motion semantics of each pose. Finally, PAM leverages the mapping between the pose sequence and the pose proxy to enhance the temporal correlation of the whole pose sequence. Experiments on the Human3.6M and MPI-INF-3DHP datasets demonstrate that TCPFormer outperforms previous state-of-the-art methods.
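The proxy mechanism the abstract describes can be sketched as two cross-attention passes: proxies attend to the pose sequence (the PUM step), then each pose attends to the updated proxies (the PIM step). The following is a minimal NumPy sketch under stated assumptions, not the paper's implementation: the single-head, projection-free attention and the sizes `T`, `M`, `C` are all illustrative choices.

```python
import numpy as np

def softmax(x, axis=-1):
    # Numerically stable softmax along the given axis.
    e = np.exp(x - x.max(axis=axis, keepdims=True))
    return e / e.sum(axis=axis, keepdims=True)

def cross_attention(queries, keys_values):
    # Scaled dot-product attention: each query row is replaced by a
    # softmax-weighted mix of the key/value rows.
    d = queries.shape[-1]
    weights = softmax(queries @ keys_values.T / np.sqrt(d))
    return weights @ keys_values

rng = np.random.default_rng(0)
T, M, C = 81, 32, 64                      # frames, proxy slots, channels (hypothetical)
pose_seq = rng.standard_normal((T, C))    # per-frame 2D pose features
proxy = rng.standard_normal((M, C))       # implicit pose proxy (learnable in the paper)

# PUM (sketch): proxies attend to the pose sequence to store
# representative information from it.
proxy = cross_attention(proxy, pose_seq)                 # (M, C)

# PIM (sketch): each pose attends to the proxy, and the result is added
# back to enrich the motion semantics of that pose.
pose_seq = pose_seq + cross_attention(pose_seq, proxy)   # (T, C)
```

Because each of the M proxy slots forms its own attention map over the T frames, the model effectively learns M temporal correlations rather than one, which is the paper's stated motivation.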

Results

Task                      Dataset       Metric  Value  Model
3D Human Pose Estimation  MPI-INF-3DHP  AUC     87.7   TCPFormer (T=81)
3D Human Pose Estimation  MPI-INF-3DHP  MPJPE   15     TCPFormer (T=81)
3D Human Pose Estimation  MPI-INF-3DHP  PCK     99     TCPFormer (T=81)
3D Human Pose Estimation  MPI-INF-3DHP  AUC     86.5   TCPFormer (T=27)
3D Human Pose Estimation  MPI-INF-3DHP  MPJPE   17.8   TCPFormer (T=27)
3D Human Pose Estimation  MPI-INF-3DHP  PCK     98.7   TCPFormer (T=27)
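The three metrics in the table can be computed as follows. This is a generic sketch of the standard definitions, not code from the paper: MPJPE is the mean per-joint Euclidean error in millimetres, PCK is the fraction of joints within a threshold (150 mm is the usual choice on MPI-INF-3DHP), and AUC averages PCK over a range of thresholds; the exact threshold grid used by the benchmark is an assumption here.

```python
import numpy as np

def mpjpe(pred, gt):
    # Mean Per-Joint Position Error: average Euclidean distance (mm)
    # between predicted and ground-truth joints.
    return np.linalg.norm(pred - gt, axis=-1).mean()

def pck(pred, gt, thresh=150.0):
    # Percentage of Correct Keypoints: share of joints whose error is
    # below `thresh` mm, as a percentage.
    return (np.linalg.norm(pred - gt, axis=-1) < thresh).mean() * 100.0

def auc(pred, gt, thresholds=np.linspace(0.0, 150.0, 31)):
    # Area under the PCK curve, averaged over a grid of thresholds
    # (0-150 mm grid assumed here).
    return np.mean([pck(pred, gt, t) for t in thresholds])

# Tiny example: 17 joints, every prediction off by exactly 5 mm.
gt = np.zeros((17, 3))
pred = gt + np.array([3.0, 4.0, 0.0])
print(mpjpe(pred, gt))   # 5.0
print(pck(pred, gt))     # 100.0
```

Note the trade-off the table shows: lower MPJPE is better, while higher PCK and AUC are better, which is why the T=81 model dominates the T=27 model on all three.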

Related Papers

$π^3$: Scalable Permutation-Equivariant Visual Geometry Learning (2025-07-17)
Revisiting Reliability in the Reasoning-based Pose Estimation Benchmark (2025-07-17)
DINO-VO: A Feature-based Visual Odometry Leveraging a Visual Foundation Model (2025-07-17)
From Neck to Head: Bio-Impedance Sensing for Head Pose Estimation (2025-07-17)
AthleticsPose: Authentic Sports Motion Dataset on Athletic Field and Evaluation of Monocular 3D Pose Estimation Ability (2025-07-17)
SpatialTrackerV2: 3D Point Tracking Made Easy (2025-07-16)
SGLoc: Semantic Localization System for Camera Pose Estimation from 3D Gaussian Splatting Representation (2025-07-16)
Efficient Calisthenics Skills Classification through Foreground Instance Selection and Depth Estimation (2025-07-16)