Borui Wang, Ehsan Adeli, Hsu-kuang Chiu, De-An Huang, Juan Carlos Niebles
Modeling and predicting human motion dynamics has long been a challenging problem in computer vision, and most existing methods rely on end-to-end supervised training of various recurrent neural network architectures. Inspired by the recent success of deep reinforcement learning methods, in this paper we propose a new reinforcement learning formulation of the human pose prediction problem, and develop an imitation learning algorithm for predicting future poses under this formulation that combines behavioral cloning with generative adversarial imitation learning. Our experiments show that the proposed method outperforms all existing state-of-the-art baseline models by large margins on human pose prediction, for both short-term and long-term predictions, while also offering a large advantage in training speed.
| Task | Dataset | Metric | Value | Model |
|---|---|---|---|---|
| Pose Estimation | Human3.6M | MAR, walking, 1,000ms | 0.69 | BC+WGAIL-div |
| Pose Estimation | Human3.6M | MAR, walking, 400ms | 0.59 | BC+WGAIL-div |