Eldar Insafutdinov, Leonid Pishchulin, Bjoern Andres, Mykhaylo Andriluka, Bernt Schiele
The goal of this paper is to advance the state of the art in articulated pose estimation in scenes with multiple people. To that end we contribute on three fronts. We propose (1) improved body part detectors that generate effective bottom-up proposals for body parts; (2) novel image-conditioned pairwise terms that allow the proposals to be assembled into a variable number of consistent body part configurations; and (3) an incremental optimization strategy that explores the search space more efficiently, leading to both better performance and significant speed-ups. Evaluation is done on two single-person and two multi-person pose estimation benchmarks. The proposed approach significantly outperforms the best known multi-person pose estimation results while demonstrating competitive performance on the task of single-person pose estimation. Models and code are available at http://pose.mpi-inf.mpg.de
| Task | Dataset | Metric | Value | Model |
|---|---|---|---|---|
| Pose Estimation | MPII Human Pose | PCKh-0.5 | 88.52 | ResNet-152 + intermediate supervision |
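
The PCKh-0.5 metric reported in the table counts a predicted keypoint as correct when it lies within 0.5 times the head segment length of its ground-truth location, and reports the percentage of correct keypoints. The following is a minimal sketch of that computation, not the official MPII evaluation code: the function name `pckh`, the list-based data layout, and the convention of using `None` for unannotated joints are assumptions made for illustration.

```python
import math

def pckh(pred, gt, head_sizes, alpha=0.5):
    """Percentage of keypoints predicted within alpha * head size of ground truth.

    pred, gt: one list per person, each a list of (x, y) tuples.
              A ground-truth entry of None marks an unannotated joint (skipped).
    head_sizes: one head segment length per person, in pixels.
    """
    correct = 0
    total = 0
    for pred_person, gt_person, head in zip(pred, gt, head_sizes):
        for p, g in zip(pred_person, gt_person):
            if g is None:
                continue  # joint not annotated: excluded from the score
            total += 1
            # A missing prediction counts as an error.
            if p is not None and math.dist(p, g) <= alpha * head:
                correct += 1
    return 100.0 * correct / total if total else float("nan")


# Hypothetical toy data: one person, three joints, head size 20 px,
# so the matching threshold is 0.5 * 20 = 10 px.
pred = [[(10.0, 10.0), (50.0, 52.0), (90.0, 140.0)]]
gt = [[(10.0, 12.0), (50.0, 50.0), (90.0, 100.0)]]
score = pckh(pred, gt, head_sizes=[20.0])
print(round(score, 2))
```

In the toy data the first two joints are 2 px from their targets and the third is 40 px away, so two of three joints fall inside the 10 px threshold. Normalizing by head size rather than image size keeps the threshold comparable across people who appear at different scales.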