Dario Pavllo, Christoph Feichtenhofer, David Grangier, Michael Auli
In this work, we demonstrate that 3D poses in video can be effectively estimated with a fully convolutional model based on dilated temporal convolutions over 2D keypoints. We also introduce back-projection, a simple and effective semi-supervised training method that leverages unlabeled video data. We start with predicted 2D keypoints for unlabeled video, then estimate 3D poses, and finally back-project to the input 2D keypoints. In the supervised setting, our fully convolutional model outperforms the previous best result from the literature by 6 mm mean per-joint position error on Human3.6M, corresponding to an error reduction of 11%, and the model also shows significant improvements on HumanEva-I. Moreover, experiments with back-projection show that it comfortably outperforms previous state-of-the-art results in semi-supervised settings where labeled data is scarce. Code and models are available at https://github.com/facebookresearch/VideoPose3D
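The temporal model above stacks 1D convolutions over the time axis with exponentially growing dilation factors, so a deep stack covers a long window of frames (the T=243 variant in the results below corresponds to a 243-frame receptive field). The sketch below is a minimal NumPy illustration, not the paper's actual architecture: it assumes kernel size 3 with dilations 1, 3, 9, 27, 81 and applies a single shared filter per channel, whereas the real model uses learned multi-channel filters, residual connections, and batch normalization.

```python
import numpy as np

def dilated_temporal_conv(x, w, dilation):
    """Valid 1D convolution along time with the given dilation.

    x: (T, C) sequence of per-frame features (e.g. flattened 2D keypoints)
    w: (K,) temporal filter, applied to every channel (a simplification;
       the real model learns separate multi-channel filters per layer)
    """
    T, C = x.shape
    K = len(w)
    span = (K - 1) * dilation            # frames consumed by this layer
    out = np.zeros((T - span, C))
    for t in range(T - span):
        for k in range(K):
            out[t] += w[k] * x[t + k * dilation]
    return out

# Receptive field of a stack of kernel-3 layers with dilations 1, 3, 9, 27, 81:
dilations = [1, 3, 9, 27, 81]
receptive_field = 1 + sum(2 * d for d in dilations)
print(receptive_field)  # 243 frames

# A 243-frame input collapses to a single output frame through the stack:
x = np.random.randn(243, 2)
w = np.full(3, 1.0 / 3.0)                # toy averaging filter
for d in dilations:
    x = dilated_temporal_conv(x, w, d)
print(x.shape)  # (1, 2)
```

Because each layer is a plain convolution, the same stack processes an arbitrarily long sequence in one pass, which is what makes the fully convolutional formulation efficient at both training and inference time.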
| Task | Dataset | Metric | Value | Model |
|---|---|---|---|---|
| 3D Human Pose Estimation | Human3.6M | Average MPJPE (mm) | 46.8 | VideoPose3D (T=243) |
| 3D Human Pose Estimation | Human3.6M | PA-MPJPE (mm) | 36.5 | VideoPose3D (T=243) |
| 3D Human Pose Estimation | Human3.6M | Average MPJPE (mm) | 51.8 | VideoPose3D (T=1) |
| 3D Human Pose Estimation | Human3.6M | PA-MPJPE (mm) | 40.0 | VideoPose3D (T=1) |
| 3D Human Pose Estimation | Human3.6M | Number of Frames Per View | 243 | VideoPose3D (T=243) |
| 3D Human Pose Estimation | Human3.6M | Average MPJPE (mm) | 64.7 | Pavllo et al. |
| 3D Human Pose Estimation | Human3.6M | Number of Views | 1 | Pavllo et al. |
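The back-projection idea from the abstract can be sketched as a reprojection consistency check: for unlabeled video, the estimated 3D pose is projected back into the image plane and compared against the 2D keypoints that were the model's input. The snippet below is a hedged, minimal illustration assuming an idealized pinhole camera with the principal point at the origin; the function names (`project`, `backprojection_loss`) are illustrative, not the repository's API, and the paper's full method additionally handles camera parameters and a supervised trajectory term.

```python
import numpy as np

def project(points_3d, f=1.0):
    """Pinhole projection of (J, 3) camera-space joints to (J, 2)
    normalized image coordinates (principal point at origin -- a
    simplifying assumption for this sketch)."""
    z = points_3d[:, 2:3]
    return f * points_3d[:, :2] / z

def backprojection_loss(pred_3d, input_2d):
    """Mean Euclidean distance between back-projected joints and the
    2D keypoints that were fed to the pose estimator."""
    reproj = project(pred_3d)
    return np.mean(np.linalg.norm(reproj - input_2d, axis=-1))

# Toy check: a 3D pose that exactly explains the 2D input has zero loss,
# while a perturbed pose does not.
pose_3d = np.array([[0.1, 0.2, 4.0], [-0.3, 0.5, 4.5]])
kp_2d = project(pose_3d)
print(backprojection_loss(pose_3d, kp_2d))  # 0.0
```

Minimizing this reprojection error on unlabeled clips, alongside the standard supervised loss on labeled data, is what lets the semi-supervised setting benefit from raw video.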