Data sourced from the PWC Archive (CC-BY-SA 4.0). Built by the community, for the community.

Temporal-Aware Refinement for Video-based Human Pose and Shape Recovery

Ming Chen, Yan Zhou, Weihua Jian, Pengfei Wan, Zhongyuan Wang

2023-11-16 | 3D Human Pose Estimation | TAR

Abstract

Though significant progress in human pose and shape recovery from monocular RGB images has been made in recent years, obtaining 3D human motion with high accuracy and temporal consistency from videos remains challenging. Existing video-based methods tend to reconstruct human motion from global image features, which lack detailed representation capability and limit the reconstruction accuracy. In this paper, we propose a Temporal-Aware Refining Network (TAR), to synchronously explore temporal-aware global and local image features for accurate pose and shape recovery. First, a global transformer encoder is introduced to obtain temporal global features from static feature sequences. Second, a bidirectional ConvGRU network takes the sequence of high-resolution feature maps as input, and outputs temporal local feature maps that maintain high resolution and capture the local motion of the human body. Finally, a recurrent refinement module iteratively updates estimated SMPL parameters by leveraging both global and local temporal information to achieve accurate and smooth results. Extensive experiments demonstrate that our TAR obtains more accurate results than previous state-of-the-art methods on popular benchmarks, i.e., 3DPW, MPI-INF-3DHP, and Human3.6M.
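The final step of the abstract, iterative refinement of SMPL parameters from global and local features, can be illustrated with a minimal NumPy sketch. This is not the authors' implementation: the plain tanh recurrent cell, the random weights, the feature sizes, and the 85-dimensional parameter vector (72 pose + 10 shape + 3 camera, a common convention in related work) are all assumptions made for illustration, and `refine_smpl` is a hypothetical name.

```python
import numpy as np

def refine_smpl(theta0, global_feat, local_feat, Wx, Wh, Wo, n_iters=3):
    """Iteratively refine a parameter vector with residual updates.

    A plain tanh recurrent cell stands in for the learned refinement
    module (assumption); the real module is a trained network.
    """
    theta = theta0.copy()
    h = np.zeros(Wh.shape[0])  # recurrent hidden state
    for _ in range(n_iters):
        # Condition each step on both feature types and the current estimate.
        x = np.concatenate([global_feat, local_feat, theta])
        h = np.tanh(Wx @ x + Wh @ h)
        # Residual update: predict a correction rather than a new estimate.
        theta = theta + Wo @ h
    return theta

# Hypothetical sizes, chosen only for this example.
rng = np.random.default_rng(0)
d_global, d_local, d_theta, d_hidden = 32, 16, 85, 64
Wx = rng.normal(scale=0.05, size=(d_hidden, d_global + d_local + d_theta))
Wh = rng.normal(scale=0.05, size=(d_hidden, d_hidden))
Wo = rng.normal(scale=0.05, size=(d_theta, d_hidden))

theta0 = np.zeros(d_theta)
g = rng.normal(size=d_global)
l = rng.normal(size=d_local)
theta = refine_smpl(theta0, g, l, Wx, Wh, Wo, n_iters=3)
```

Predicting a correction at each step, rather than regressing the parameters in one shot, lets every iteration see the current estimate alongside the features.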

Results

| Task | Dataset | Metric | Value | Model |
|---|---|---|---|---|
| 3D Human Pose Estimation | MPI-INF-3DHP | Acceleration Error | 9.2 | TAR (N=9) |
| 3D Human Pose Estimation | MPI-INF-3DHP | MPJPE | 85.9 | TAR (N=9) |
| 3D Human Pose Estimation | MPI-INF-3DHP | PA-MPJPE | 60.5 | TAR (N=9) |
| 3D Human Pose Estimation | 3DPW | Acceleration Error | 7.7 | TAR (N=9) |
| 3D Human Pose Estimation | 3DPW | MPJPE | 62.7 | TAR (N=9) |
| 3D Human Pose Estimation | 3DPW | MPVPE | 74.4 | TAR (N=9) |
| 3D Human Pose Estimation | 3DPW | PA-MPJPE | 40.6 | TAR (N=9) |
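
For readers unfamiliar with the metrics in the results above, the following NumPy sketch shows how MPJPE, PA-MPJPE, and acceleration error are commonly computed from predicted and ground-truth 3D joints. It follows the usual definitions rather than the authors' evaluation code. The function names and the `(frames, joints, 3)` array layout are assumptions, and the acceleration is expressed in per-frame units, with no frame-rate scaling.

```python
import numpy as np

def mpjpe(pred, gt):
    """Mean per-joint position error: mean Euclidean distance over frames and joints."""
    return float(np.linalg.norm(pred - gt, axis=-1).mean())

def procrustes_align(pred, gt):
    """Align one frame pred (J, 3) to gt (J, 3) with the best scale, rotation, translation."""
    mu_p, mu_g = pred.mean(axis=0), gt.mean(axis=0)
    p, g = pred - mu_p, gt - mu_g
    # Optimal rotation from the SVD of the cross-covariance matrix.
    U, S, Vt = np.linalg.svd(p.T @ g)
    d = np.sign(np.linalg.det(Vt.T @ U.T))  # guard against reflections
    D = np.diag([1.0, 1.0, d])
    R = Vt.T @ D @ U.T
    scale = (S * np.diag(D)).sum() / (p ** 2).sum()
    return scale * p @ R.T + mu_g

def pa_mpjpe(pred, gt):
    """MPJPE after per-frame Procrustes alignment of the prediction to the ground truth."""
    aligned = np.stack([procrustes_align(p, g) for p, g in zip(pred, gt)])
    return mpjpe(aligned, gt)

def accel_error(pred, gt):
    """Mean difference between predicted and true joint accelerations.

    Acceleration is the second finite difference over consecutive frames.
    """
    acc_pred = pred[:-2] - 2 * pred[1:-1] + pred[2:]
    acc_gt = gt[:-2] - 2 * gt[1:-1] + gt[2:]
    return float(np.linalg.norm(acc_pred - acc_gt, axis=-1).mean())
```

PA-MPJPE removes global scale, rotation, and translation before measuring error, so it isolates pose accuracy. Acceleration error is insensitive to constant offsets and measures temporal smoothness, which is the property the abstract emphasizes.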
