TasksSotADatasetsPapersMethodsSubmitAbout
Papers With Code 2

A community resource for machine learning research: papers, code, benchmarks, and state-of-the-art results.

Explore

Notable BenchmarksAll SotADatasetsPapersMethods

Community

Submit ResultsAbout

Data sourced from the PWC Archive (CC-BY-SA 4.0). Built by the community, for the community.

Papers/Exploiting temporal information for 3D pose estimation

Exploiting temporal information for 3D pose estimation

Mir Rayat Imtiaz Hossain, James J. Little

2017-11-233D Human Pose EstimationPose Estimation3D Pose Estimation
PaperPDFCode

Abstract

In this work, we address the problem of 3D human pose estimation from a sequence of 2D human poses. Although the recent success of deep networks has led many state-of-the-art methods for 3D pose estimation to train deep networks end-to-end to predict from images directly, the top-performing approaches have shown the effectiveness of dividing the task of 3D pose estimation into two steps: using a state-of-the-art 2D pose estimator to estimate the 2D pose from images and then mapping them into 3D space. They also showed that a low-dimensional representation like 2D locations of a set of joints can be discriminative enough to estimate 3D pose with high accuracy. However, estimation of 3D pose for individual frames leads to temporally incoherent estimates due to independent error in each frame causing jitter. Therefore, in this work we utilize the temporal information across a sequence of 2D joint locations to estimate a sequence of 3D poses. We designed a sequence-to-sequence network composed of layer-normalized LSTM units with shortcut connections connecting the input to the output on the decoder side and imposed temporal smoothness constraint during training. We found that the knowledge of temporal consistency improves the best reported result on Human3.6M dataset by approximately $12.2\%$ and helps our network to recover temporally consistent 3D poses over a sequence of images even when the 2D pose detector fails.

Results

TaskDatasetMetricValueModel
3D Human Pose EstimationHumanEva-IMean Reconstruction Error (mm)22Sequence-to-sequence network
3D Human Pose EstimationHuman3.6MAverage MPJPE (mm)58.5Sequence-to-sequence network
Pose EstimationHumanEva-IMean Reconstruction Error (mm)22Sequence-to-sequence network
Pose EstimationHuman3.6MAverage MPJPE (mm)58.5Sequence-to-sequence network
3DHumanEva-IMean Reconstruction Error (mm)22Sequence-to-sequence network
3DHuman3.6MAverage MPJPE (mm)58.5Sequence-to-sequence network
1 Image, 2*2 StitchiHumanEva-IMean Reconstruction Error (mm)22Sequence-to-sequence network
1 Image, 2*2 StitchiHuman3.6MAverage MPJPE (mm)58.5Sequence-to-sequence network

Related Papers

$π^3$: Scalable Permutation-Equivariant Visual Geometry Learning2025-07-17Revisiting Reliability in the Reasoning-based Pose Estimation Benchmark2025-07-17DINO-VO: A Feature-based Visual Odometry Leveraging a Visual Foundation Model2025-07-17From Neck to Head: Bio-Impedance Sensing for Head Pose Estimation2025-07-17AthleticsPose: Authentic Sports Motion Dataset on Athletic Field and Evaluation of Monocular 3D Pose Estimation Ability2025-07-17SpatialTrackerV2: 3D Point Tracking Made Easy2025-07-16SGLoc: Semantic Localization System for Camera Pose Estimation from 3D Gaussian Splatting Representation2025-07-16Efficient Calisthenics Skills Classification through Foreground Instance Selection and Depth Estimation2025-07-16