Tianhan Xu, Wataru Takano
In this paper, we propose a novel graph convolutional network architecture, Graph Stacked Hourglass Networks, for 2D-to-3D human pose estimation tasks. The proposed architecture consists of repeated encoder-decoder modules, in which graph-structured features are processed across three different scales of human skeletal representations. This multi-scale architecture enables the model to learn both local and global feature representations, which are critical for 3D human pose estimation. We also introduce a multi-level feature learning approach that uses intermediate features from different depths, and we show the performance improvements that result from exploiting multi-scale, multi-level feature representations. Extensive experiments are conducted to validate our approach, and the results show that our model outperforms the state of the art.
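
The encoder-decoder idea described in the abstract can be illustrated with a small NumPy sketch. This is not the authors' implementation: the 6-joint chain skeleton, the joint groupings used for pooling, the mean-pool and copy-unpool operators, the shared weights across stacks, and all function names are hypothetical choices made only to show how graph features might move across three skeletal scales and back.

```python
import numpy as np

def normalized_adjacency(edges, num_nodes):
    """Symmetric-normalized adjacency with self-loops: D^-1/2 (A + I) D^-1/2."""
    a = np.eye(num_nodes)
    for i, j in edges:
        a[i, j] = a[j, i] = 1.0
    d_inv_sqrt = np.diag(a.sum(axis=1) ** -0.5)
    return d_inv_sqrt @ a @ d_inv_sqrt

def graph_conv(x, a_hat, w):
    """One graph convolution followed by ReLU."""
    return np.maximum(a_hat @ x @ w, 0.0)

def pool(x, groups):
    """Average the features of the nodes in each group (downsampling)."""
    return np.stack([x[g].mean(axis=0) for g in groups])

def unpool(x_coarse, groups, num_nodes):
    """Copy each coarse node's features back to its member nodes (upsampling)."""
    out = np.zeros((num_nodes, x_coarse.shape[1]))
    for k, g in enumerate(groups):
        out[g] = x_coarse[k]
    return out

# Hypothetical toy skeleton: a 6-joint chain, grouped into 3 parts, then 1 body node.
FINE_EDGES = [(0, 1), (1, 2), (2, 3), (3, 4), (4, 5)]
MID_GROUPS = [[0, 1], [2, 3], [4, 5]]
MID_EDGES = [(0, 1), (1, 2)]
COARSE_GROUPS = [[0, 1, 2]]

def hourglass(x, weights):
    """One encoder-decoder pass over three scales with skip connections."""
    w_fine, w_mid, w_coarse = weights
    a_fine = normalized_adjacency(FINE_EDGES, 6)
    a_mid = normalized_adjacency(MID_EDGES, 3)
    a_coarse = normalized_adjacency([], 1)
    # Encoder: fine -> mid -> coarse.
    h_fine = graph_conv(x, a_fine, w_fine)
    h_mid = graph_conv(pool(h_fine, MID_GROUPS), a_mid, w_mid)
    h_coarse = graph_conv(pool(h_mid, COARSE_GROUPS), a_coarse, w_coarse)
    # Decoder: coarse -> mid -> fine, adding the encoder features back in.
    up_mid = unpool(h_coarse, COARSE_GROUPS, 3) + h_mid
    return unpool(up_mid, MID_GROUPS, 6) + h_fine

rng = np.random.default_rng(0)
channels = 8
weights = [rng.standard_normal((channels, channels)) * 0.1 for _ in range(3)]
features = rng.standard_normal((6, channels))

out = features
for _ in range(2):  # "stacked": repeat the encoder-decoder (weights shared here for brevity)
    out = hourglass(out, weights)
```

The skip connections are what let each output joint combine its local neighborhood features with the pooled, body-level context.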
| Task | Dataset | Metric | Value | Model |
|---|---|---|---|---|
| 3D Human Pose Estimation | MPI-INF-3DHP | AUC | 45.8 | Graph Stacked Hourglass Network |
| 3D Human Pose Estimation | MPI-INF-3DHP | PCK | 80.1 | Graph Stacked Hourglass Network |
| 3D Human Pose Estimation | Human3.6M | Average MPJPE (mm) | 51.9 | Graph Stacked Hourglass Network (CPN) |
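
For readers unfamiliar with the metrics in the table, the following is a minimal sketch of how MPJPE, PCK, and AUC are commonly computed for 3D pose estimation. The 150 mm PCK threshold and the 0 to 150 mm threshold range for AUC are assumptions based on common practice for MPI-INF-3DHP, not values stated here, and the function names are illustrative. Poses are assumed to be arrays of shape `(frames, joints, 3)` in millimetres.

```python
import numpy as np

def joint_errors(pred, gt):
    """Euclidean distance between predicted and ground-truth joints, per joint."""
    return np.linalg.norm(pred - gt, axis=-1)

def mpjpe(pred, gt):
    """Mean per-joint position error, in the units of the inputs (mm here)."""
    return joint_errors(pred, gt).mean()

def pck(pred, gt, threshold=150.0):
    """Percentage of joints whose error is within the threshold (assumed 150 mm)."""
    return 100.0 * (joint_errors(pred, gt) <= threshold).mean()

def auc(pred, gt, thresholds=np.linspace(0.0, 150.0, 31)):
    """Average PCK over a range of thresholds (assumed 0 to 150 mm in 5 mm steps)."""
    errs = joint_errors(pred, gt)
    return 100.0 * np.mean([(errs <= t).mean() for t in thresholds])

# Tiny example: one frame, two joints, with errors of 5 mm and 200 mm.
gt = np.zeros((1, 2, 3))
pred = np.array([[[3.0, 4.0, 0.0], [0.0, 0.0, 200.0]]])

example_mpjpe = mpjpe(pred, gt)
example_pck = pck(pred, gt)
example_auc = auc(pred, gt)
```

Lower is better for MPJPE, while higher is better for PCK and AUC.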