Temporal-Aware Refinement for Video-based Human Pose and Shape Recovery

Ming Chen, Yan Zhou, Weihua Jian, Pengfei Wan, Zhongyuan Wang

2023-11-16 | 3D Human Pose Estimation | TAR

Abstract

Although human pose and shape recovery from monocular RGB images has progressed significantly in recent years, obtaining 3D human motion with high accuracy and temporal consistency from videos remains challenging. Existing video-based methods tend to reconstruct human motion from global image features, which lack detailed representation capability and therefore limit reconstruction accuracy. In this paper, we propose a Temporal-Aware Refining Network (TAR) that simultaneously exploits temporal-aware global and local image features for accurate pose and shape recovery. First, a global transformer encoder obtains temporal global features from static feature sequences. Second, a bidirectional ConvGRU network takes the sequence of high-resolution feature maps as input and outputs temporal local feature maps that maintain high resolution and capture the local motion of the human body. Finally, a recurrent refinement module iteratively updates the estimated SMPL parameters by leveraging both global and local temporal information to achieve accurate and smooth results. Extensive experiments demonstrate that TAR obtains more accurate results than previous state-of-the-art methods on popular benchmarks, i.e., 3DPW, MPI-INF-3DHP, and Human3.6M.
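
The iterative update described in the abstract can be illustrated with a small NumPy sketch. This is not the authors' implementation: the 85-dimensional SMPL vector (72 pose + 10 shape + 3 camera, as in HMR-style regressors), the feature sizes, the two-layer update function, and the names `init_weights` and `refine` are all assumptions made for illustration. Here the local feature is a fixed per-frame vector standing in for whatever the local branch provides.

```python
import numpy as np

# Assumption: 72 pose + 10 shape + 3 camera values per frame (HMR-style layout).
SMPL_DIM = 85


def init_weights(global_dim, local_dim, hidden=64, seed=0):
    """Random weights for a hypothetical two-layer update function."""
    rng = np.random.default_rng(seed)
    in_dim = global_dim + local_dim + SMPL_DIM
    return {
        "w1": rng.normal(0.0, 0.01, (in_dim, hidden)),
        "w2": rng.normal(0.0, 0.01, (hidden, SMPL_DIM)),
    }


def refine(global_feat, local_feat, theta_init, weights, num_iters=3):
    """Iteratively add a predicted residual to the SMPL estimate.

    global_feat: (T, global_dim) per-frame temporal global features
    local_feat:  (T, local_dim) per-frame temporal local features (stand-in)
    theta_init:  (T, SMPL_DIM) initial SMPL parameters
    """
    theta = theta_init.copy()
    for _ in range(num_iters):
        # Condition each update on both feature types and the current estimate.
        x = np.concatenate([global_feat, local_feat, theta], axis=-1)
        delta = np.tanh(x @ weights["w1"]) @ weights["w2"]
        theta = theta + delta
    return theta


T, GLOBAL_DIM, LOCAL_DIM = 16, 32, 8  # illustrative sizes only
rng = np.random.default_rng(1)
weights = init_weights(GLOBAL_DIM, LOCAL_DIM)
theta0 = np.zeros((T, SMPL_DIM))
theta = refine(rng.normal(size=(T, GLOBAL_DIM)),
               rng.normal(size=(T, LOCAL_DIM)),
               theta0, weights)
```

The residual form (`theta + delta`) means each iteration only has to correct the previous estimate rather than predict the parameters from scratch.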

Results

Task	Dataset	Metric	Value	Model
3D Human Pose Estimation	MPI-INF-3DHP	Acceleration Error	9.2	TAR (N=9)
3D Human Pose Estimation	MPI-INF-3DHP	MPJPE	85.9	TAR (N=9)
3D Human Pose Estimation	MPI-INF-3DHP	PA-MPJPE	60.5	TAR (N=9)
3D Human Pose Estimation	3DPW	Acceleration Error	7.7	TAR (N=9)
3D Human Pose Estimation	3DPW	MPJPE	62.7	TAR (N=9)
3D Human Pose Estimation	3DPW	MPVPE	74.4	TAR (N=9)
3D Human Pose Estimation	3DPW	PA-MPJPE	40.6	TAR (N=9)
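
For readers unfamiliar with the metrics in the table, the following NumPy sketch shows how MPJPE, PA-MPJPE, and acceleration error are commonly computed. It follows the usual definitions and is not taken from the paper's evaluation code; the function names, the `(T, J, 3)` array layout, and the use of per-frame finite differences for acceleration are assumptions.

```python
import numpy as np


def mpjpe(pred, gt):
    """Mean per-joint position error for arrays of shape (T, J, 3)."""
    return float(np.linalg.norm(pred - gt, axis=-1).mean())


def procrustes_align(pred, gt):
    """Align one frame pred (J, 3) to gt (J, 3) with scale, rotation, translation."""
    mu_p, mu_g = pred.mean(axis=0), gt.mean(axis=0)
    p, g = pred - mu_p, gt - mu_g
    # SVD of the 3x3 cross-covariance gives the optimal rotation (Kabsch/Umeyama).
    U, S, Vt = np.linalg.svd(p.T @ g)
    # Flip the last axis if needed so the result is a rotation, not a reflection.
    d = 1.0 if np.linalg.det(Vt.T @ U.T) >= 0 else -1.0
    D = np.diag([1.0, 1.0, d])
    R = Vt.T @ D @ U.T
    scale = (S * np.diag(D)).sum() / (p ** 2).sum()
    return scale * p @ R.T + mu_g


def pa_mpjpe(pred, gt):
    """MPJPE after per-frame Procrustes alignment."""
    aligned = np.stack([procrustes_align(p, g) for p, g in zip(pred, gt)])
    return mpjpe(aligned, gt)


def accel_error(pred, gt):
    """Mean difference between predicted and ground-truth joint acceleration.

    Acceleration is the second finite difference over consecutive frames.
    """
    acc_p = pred[2:] - 2 * pred[1:-1] + pred[:-2]
    acc_g = gt[2:] - 2 * gt[1:-1] + gt[:-2]
    return float(np.linalg.norm(acc_p - acc_g, axis=-1).mean())


# Usage on synthetic data: a rotated, scaled, shifted copy of gt has a large
# MPJPE but a PA-MPJPE of (numerically) zero.
rng = np.random.default_rng(0)
gt = rng.normal(size=(10, 14, 3))
c, s = np.cos(0.3), np.sin(0.3)
Rz = np.array([[c, -s, 0.0], [s, c, 0.0], [0.0, 0.0, 1.0]])
pred = 1.7 * gt @ Rz.T + np.array([1.0, 2.0, 3.0])
raw_err = mpjpe(pred, gt)
aligned_err = pa_mpjpe(pred, gt)
```

PA-MPJPE removes global scale, rotation, and translation before measuring error, so it isolates articulated pose quality, while acceleration error reflects temporal smoothness.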
