


Kinematic-Structure-Preserved Representation for Unsupervised 3D Human Pose Estimation

Jogendra Nath Kundu, Siddharth Seth, Rahul M. V, Mugalodi Rakesh, R. Venkatesh Babu, Anirban Chakraborty

2020-06-24 · 3D Human Pose Estimation · Disentanglement · Unsupervised 3D Human Pose Estimation · Pose Estimation · 3D Pose Estimation

Abstract

Estimation of 3D human pose from a monocular image has gained considerable attention, as a key step to several human-centric applications. However, the generalizability of human pose estimation models developed using supervision on large-scale in-studio datasets remains questionable, as these models often perform unsatisfactorily on unseen in-the-wild environments. Though weakly-supervised models have been proposed to address this shortcoming, the performance of such models relies on the availability of paired supervision on some related tasks, such as 2D pose or multi-view image pairs. In contrast, we propose a novel kinematic-structure-preserved unsupervised 3D pose estimation framework, which is not restrained by any paired or unpaired weak supervision. Our pose estimation framework relies on a minimal set of prior knowledge that defines the underlying kinematic 3D structure, such as skeletal joint connectivity information with bone-length ratios in a fixed canonical scale. The proposed model employs three consecutive differentiable transformations named forward-kinematics, camera-projection, and spatial-map transformation. This design not only acts as a suitable bottleneck stimulating effective pose disentanglement but also yields interpretable latent pose representations, avoiding the training of an explicit latent-embedding-to-pose mapper. Furthermore, devoid of an unstable adversarial setup, we re-utilize the decoder to formalize an energy-based loss, which enables us to learn from in-the-wild videos, beyond laboratory settings. Comprehensive experiments demonstrate our state-of-the-art unsupervised and weakly-supervised pose estimation performance on both the Human3.6M and MPI-INF-3DHP datasets. Qualitative results on unseen environments further establish our superior generalization ability.
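To make the three transformations named in the abstract more concrete, here is a minimal NumPy sketch of one possible reading: forward kinematics from bone directions and fixed bone lengths, an orthographic camera projection, and Gaussian spatial maps. It is not the authors' implementation. The five-joint skeleton, the bone-length values, the single-angle camera, and all function names are assumptions made for illustration, and NumPy only shows the forward computation, whereas the paper's versions are differentiable.

```python
import numpy as np

# Hypothetical toy skeleton (not the paper's): pelvis, spine, neck, head, one elbow.
PARENTS = [-1, 0, 1, 2, 2]                          # parent of each joint, -1 marks the root
BONE_LENGTHS = np.array([0.0, 0.5, 0.3, 0.2, 0.4])  # made-up ratios in a canonical scale


def forward_kinematics(directions):
    """Place every joint at parent + bone_length * unit(direction)."""
    joints = np.zeros((len(PARENTS), 3))
    for j, p in enumerate(PARENTS):
        if p >= 0:  # parents always precede their children in PARENTS
            unit = directions[j] / np.linalg.norm(directions[j])
            joints[j] = joints[p] + BONE_LENGTHS[j] * unit
    return joints


def camera_projection(joints, yaw):
    """Rotate about the vertical (y) axis, then drop depth (orthographic, assumed)."""
    c, s = np.cos(yaw), np.sin(yaw)
    rotation = np.array([[c, 0.0, s], [0.0, 1.0, 0.0], [-s, 0.0, c]])
    return (joints @ rotation.T)[:, :2]


def spatial_maps(points_2d, size=32, sigma=1.5, extent=1.5):
    """Render one Gaussian heatmap per joint on a grid covering [-extent, extent]."""
    pix = (points_2d + extent) / (2 * extent) * (size - 1)  # world -> pixel coordinates
    ys, xs = np.mgrid[0:size, 0:size]
    d2 = (xs[None] - pix[:, 0, None, None]) ** 2 + (ys[None] - pix[:, 1, None, None]) ** 2
    return np.exp(-d2 / (2 * sigma ** 2))


# One bone direction per joint; the root's entry is unused.
directions = np.array([[0, 0, 0], [0, 1, 0], [0, 1, 0], [0, 1, 0], [1, 0, 0]], dtype=float)
joints_3d = forward_kinematics(directions)
joints_2d = camera_projection(joints_3d, yaw=np.pi / 6)
maps = spatial_maps(joints_2d)
print(maps.shape)  # (5, 32, 32)
```

Because bone lengths are fixed and only unit directions vary, every pose the sketch produces keeps the same skeleton proportions, which is the sense in which the representation is "kinematic-structure-preserved".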

Results

| Task | Dataset | Metric | Value | Model |
|---|---|---|---|---|
| 3D Reconstruction | MPI-INF-3DHP | AUC | 43.4 | Kinematic-Structure-Preserved Representation |
| 3D Reconstruction | MPI-INF-3DHP | MPJPE | 99.2 | Kinematic-Structure-Preserved Representation |
| 3D Reconstruction | MPI-INF-3DHP | PCK | 79.2 | Kinematic-Structure-Preserved Representation |
| 3D Reconstruction | Human3.6M | PA-MPJPE | 89.4 | Kinematic-Structure-Preserved Representation |
| 3D | MPI-INF-3DHP | AUC | 43.4 | Kinematic-Structure-Preserved Representation |
| 3D | MPI-INF-3DHP | MPJPE | 99.2 | Kinematic-Structure-Preserved Representation |
| 3D | MPI-INF-3DHP | PCK | 79.2 | Kinematic-Structure-Preserved Representation |
| 3D | Human3.6M | PA-MPJPE | 89.4 | Kinematic-Structure-Preserved Representation |
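For readers unfamiliar with the metrics listed above, the following NumPy sketch shows how MPJPE, PA-MPJPE, PCK, and AUC are commonly computed for 3D pose. This page does not state the evaluation protocol, so the details here are assumptions: poses are arrays of shape (joints, 3) in millimetres, PCK uses a 150 mm threshold, and AUC averages PCK over 31 evenly spaced thresholds from 0 to 150 mm. The function names are illustrative.

```python
import numpy as np


def mpjpe(pred, gt):
    """Mean per-joint position error: average Euclidean distance per joint."""
    return np.linalg.norm(pred - gt, axis=-1).mean()


def procrustes_align(pred, gt):
    """Best similarity transform (scale, rotation, translation) of pred onto gt."""
    mu_p, mu_g = pred.mean(axis=0), gt.mean(axis=0)
    p, g = pred - mu_p, gt - mu_g
    u, s, vt = np.linalg.svd(p.T @ g)
    # Flip the last axis if needed so the result is a rotation, not a reflection.
    d = np.diag([1.0, 1.0, np.sign(np.linalg.det(u @ vt))])
    rotation = u @ d @ vt
    scale = (s * np.diag(d)).sum() / (p ** 2).sum()
    return scale * p @ rotation + mu_g


def pa_mpjpe(pred, gt):
    """MPJPE after Procrustes alignment of the prediction to the ground truth."""
    return mpjpe(procrustes_align(pred, gt), gt)


def pck(pred, gt, threshold=150.0):
    """Percentage of joints whose error is within the threshold (assumed 150 mm)."""
    return 100.0 * (np.linalg.norm(pred - gt, axis=-1) <= threshold).mean()


def auc(pred, gt, thresholds=np.linspace(0.0, 150.0, 31)):
    """Average PCK over a range of thresholds (assumed 0 to 150 mm in 5 mm steps)."""
    return float(np.mean([pck(pred, gt, t) for t in thresholds]))


# Usage on synthetic data: a prediction shifted by a constant offset.
gt = np.random.default_rng(0).normal(size=(17, 3)) * 100.0
pred = gt + np.array([30.0, 40.0, 0.0])
print(mpjpe(pred, gt), pa_mpjpe(pred, gt), pck(pred, gt), auc(pred, gt))
```

Lower is better for MPJPE and PA-MPJPE, and higher is better for PCK and AUC. PA-MPJPE removes global scale, rotation, and translation before measuring error, so it cannot be compared directly with unaligned MPJPE.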

Related Papers

CSD-VAR: Content-Style Decomposition in Visual Autoregressive Models (2025-07-18)
$π^3$: Scalable Permutation-Equivariant Visual Geometry Learning (2025-07-17)
Revisiting Reliability in the Reasoning-based Pose Estimation Benchmark (2025-07-17)
DINO-VO: A Feature-based Visual Odometry Leveraging a Visual Foundation Model (2025-07-17)
From Neck to Head: Bio-Impedance Sensing for Head Pose Estimation (2025-07-17)
AthleticsPose: Authentic Sports Motion Dataset on Athletic Field and Evaluation of Monocular 3D Pose Estimation Ability (2025-07-17)
SpatialTrackerV2: 3D Point Tracking Made Easy (2025-07-16)
SGLoc: Semantic Localization System for Camera Pose Estimation from 3D Gaussian Splatting Representation (2025-07-16)