PoseNet3D: Learning Temporally Consistent 3D Human Pose via Knowledge Distillation

Shashank Tripathi, Siddhant Ranade, Ambrish Tyagi, Amit Agrawal

2020-03-07

3D Human Pose Estimation, Pose Estimation, Knowledge Distillation

Abstract

Recovering 3D human pose from 2D joints is a highly unconstrained problem. We propose a novel neural network framework, PoseNet3D, that takes 2D joints as input and outputs 3D skeletons and SMPL body model parameters. By casting our learning approach in a student-teacher framework, we avoid using any 3D data, such as paired or unpaired 3D data, motion capture sequences, depth images, or multi-view images, during training. We first train a teacher network that outputs 3D skeletons, using only 2D poses for training. The teacher network then distills its knowledge to a student network that predicts 3D pose in the SMPL representation. Finally, both the teacher and the student networks are jointly fine-tuned end-to-end using temporal, self-consistency, and adversarial losses, which improves the accuracy of each individual network. Results on the Human3.6M dataset for 3D human pose estimation demonstrate that our approach reduces the 3D joint prediction error by 18% compared to previous unsupervised methods. Qualitative results on in-the-wild datasets show that the recovered 3D poses and meshes are natural and realistic, and that they flow smoothly over consecutive frames.

Results

Task	Dataset	Metric	Value	Model
3D Human Pose Estimation	3DPW	PA-MPJPE	63.2	PoseNet3D
Pose Estimation	3DPW	PA-MPJPE	63.2	PoseNet3D
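
The table reports PA-MPJPE, the mean per-joint position error computed after rigidly aligning the predicted skeleton to the ground truth with a similarity transform (rotation, uniform scale, translation). The following is a minimal NumPy sketch of that metric, assuming each pose is a (J, 3) array of joint coordinates. The function name `pa_mpjpe` and the array layout are illustrative assumptions, not taken from the PoseNet3D code, and the evaluation protocol behind the numbers above may differ in details such as joint selection.

```python
import numpy as np


def pa_mpjpe(pred, gt):
    """Procrustes-aligned mean per-joint position error.

    pred, gt: array-likes of shape (J, 3), one row per joint.
    Returns the mean Euclidean joint error after the best
    similarity alignment of pred onto gt (same units as gt).
    Hypothetical helper for illustration only.
    """
    pred = np.asarray(pred, dtype=float)
    gt = np.asarray(gt, dtype=float)

    # Remove translation by centering both point sets.
    mu_pred = pred.mean(axis=0)
    mu_gt = gt.mean(axis=0)
    p = pred - mu_pred
    g = gt - mu_gt

    # Optimal rotation from the SVD of the cross-covariance (Kabsch/Umeyama).
    H = p.T @ g
    U, S, Vt = np.linalg.svd(H)

    # Flip the last axis if needed so the result is a proper rotation.
    d = -1.0 if np.linalg.det(Vt.T @ U.T) < 0 else 1.0
    D = np.diag([1.0, 1.0, d])
    R = Vt.T @ D @ U.T

    # Optimal uniform scale for the centered, rotated prediction.
    scale = (S * np.diag(D)).sum() / (p ** 2).sum()

    # Apply the similarity transform and measure per-joint error.
    aligned = scale * p @ R.T + mu_gt
    return float(np.linalg.norm(aligned - gt, axis=1).mean())
```

Because the alignment removes global rotation, scale, and translation, a prediction that differs from the ground truth only by such a transform scores approximately zero. The metric therefore measures errors in body articulation rather than in global placement.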
