Márton Véges, András Lőrincz
The common approach to 3D human pose estimation is to predict body joint coordinates relative to the hip. This works well for a single person but is insufficient when multiple people interact, because the positions of the people relative to each other are lost. Methods that predict absolute coordinates first estimate a root-relative pose, then calculate the translation via a secondary optimization task. We propose a neural network that predicts joints in a camera-centered coordinate system instead of a root-relative one. Unlike previous methods, our network works in a single step without any post-processing. Our network outperforms previous methods on the MuPoTS-3D dataset, achieving state-of-the-art results.
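To make the distinction between the two coordinate systems concrete, the following is a minimal sketch and not the authors' code. It assumes a toy three-joint skeleton, millimeter units, and a hypothetical `HIP` index for the root joint. It shows that two people with the same pose at different places in the scene become indistinguishable once their joints are expressed relative to the hip.

```python
import numpy as np

HIP = 0  # assumption: index of the root (hip) joint in this toy skeleton


def to_root_relative(pose_abs, root=HIP):
    """Express a (J, 3) camera-centered pose relative to its root joint."""
    return pose_abs - pose_abs[root]


# Toy 3-joint skeleton in camera-centered coordinates (assumed to be in mm).
person_a = np.array([
    [0.0, 0.0, 3000.0],       # hip
    [0.0, -500.0, 3000.0],    # a joint above the hip
    [200.0, -400.0, 3100.0],  # a joint to the side
])

# Same pose, but standing 1.5 m to the side and 2 m further from the camera.
person_b = person_a + np.array([1500.0, 0.0, 2000.0])

rel_a = to_root_relative(person_a)
rel_b = to_root_relative(person_b)

# The root-relative poses are identical, so the offset between the people is gone.
same_relative_pose = np.allclose(rel_a, rel_b)

# The camera-centered poses keep the distance between the two hips.
root_distance = np.linalg.norm(person_a[HIP] - person_b[HIP])

print(same_relative_pose, root_distance)
```

Recovering `root_distance` from root-relative output requires estimating each person's translation separately, which is the extra step that a direct camera-centered prediction avoids.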
| Task | Dataset | Metric | Value (mm) | Model |
|---|---|---|---|---|
| 3D Multi-Person Pose Estimation (root-relative) | MuPoTS-3D | MPJPE | 120 | Depth Prediction Network |
| 3D Multi-Person Pose Estimation (absolute) | MuPoTS-3D | MPJPE | 292 | Depth Prediction Network |
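
The table reports MPJPE (mean per-joint position error) for both root-relative and absolute poses. The sketch below uses the generic definition of the metric, the mean Euclidean distance between predicted and ground-truth joints. It is not the MuPoTS-3D evaluation script, which may apply additional steps such as matching predictions to people. The function names, the toy data, and the root index of 0 are assumptions made for illustration.

```python
import numpy as np


def mpjpe(pred, gt):
    """Mean Euclidean distance between corresponding joints of two (J, 3) poses."""
    return float(np.linalg.norm(pred - gt, axis=-1).mean())


def root_relative_mpjpe(pred, gt, root=0):
    """MPJPE after moving each pose so its root joint sits at the origin."""
    return mpjpe(pred - pred[root], gt - gt[root])


# Toy ground truth in camera-centered coordinates (assumed to be in mm).
gt = np.array([
    [0.0, 0.0, 3000.0],
    [0.0, -500.0, 3000.0],
    [200.0, -400.0, 3100.0],
])

# A prediction with the correct pose but placed 100 mm too far from the camera.
pred = gt + np.array([0.0, 0.0, 100.0])

abs_err = mpjpe(pred, gt)                # penalizes the depth offset
rel_err = root_relative_mpjpe(pred, gt)  # ignores the offset entirely

print(abs_err, rel_err)
```

Because the absolute variant also counts errors in each person's location, it is expected to be larger than the root-relative variant for the same model.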