Zixing Wang, Ahmed H. Qureshi
Anytime 3D human pose forecasting is crucial to synchronous real-world human-machine interaction, where the term "anytime" refers to predicting human pose at any real-valued time step. However, to the best of our knowledge, all existing human pose forecasting methods predict only at preset, discrete time intervals. We therefore introduce AnyPose, a lightweight continuous-time neural architecture that models human behavior dynamics with neural ordinary differential equations. We validate our framework on the Human3.6M, AMASS, and 3DPW datasets and conduct a series of analyses that compare it with existing methods and examine how neural ordinary differential equations apply to human pose. Our results demonstrate that AnyPose predicts future poses with high accuracy and requires significantly less computation time than traditional methods on anytime prediction tasks.
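
To make the "anytime" idea concrete, the following is a minimal sketch of querying a continuous-time model at an arbitrary real-valued time by integrating an ODE from the last observation. It is not the AnyPose architecture: `toy_dynamics` is a hypothetical stand-in for a learned dynamics network, the fixed-step RK4 integrator is a generic choice, and the 17-joint pose shape is an assumption made for illustration.

```python
import numpy as np

def rk4_integrate(f, x0, t0, t1, max_step=0.01):
    """Integrate dx/dt = f(t, x) from t0 to t1 with fixed-step classical RK4."""
    # Pick a step count so that each step is no longer than max_step.
    n_steps = max(1, int(np.ceil((t1 - t0) / max_step)))
    h = (t1 - t0) / n_steps
    t, x = t0, np.asarray(x0, dtype=float)
    for _ in range(n_steps):
        k1 = f(t, x)
        k2 = f(t + h / 2, x + h / 2 * k1)
        k3 = f(t + h / 2, x + h / 2 * k2)
        k4 = f(t + h, x + h * k3)
        x = x + (h / 6) * (k1 + 2 * k2 + 2 * k3 + k4)
        t += h
    return x

def toy_dynamics(t, x):
    # Hypothetical stand-in for a learned network (assumption, not from the paper):
    # every coordinate decays exponentially toward the origin.
    return -0.5 * x

def forecast_pose(pose0, t_query):
    """Return the state at any real-valued time t_query (seconds) after pose0."""
    return rk4_integrate(toy_dynamics, pose0, 0.0, t_query)

# Query a non-grid time such as 0.4371 s; no preset frame interval is needed.
pose0 = np.ones((17, 3))  # assumed layout: 17 joints with (x, y, z) coordinates
pose_t = forecast_pose(pose0, 0.4371)
```

Because the query time is only the upper limit of the integral, the same function serves any horizon, which is the property that discrete-interval predictors lack.
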
| Task | Dataset | Metric | Value | Model |
|---|---|---|---|---|
| Pose Estimation | AMASS | Average MPJPE (mm) @ 1000 ms | 91.7 | AnyPose1 |
| Pose Estimation | Human3.6M | Average MPJPE (mm) @ 1000 ms | 128.2 | AnyPose1 |
| Pose Estimation | Human3.6M | Average MPJPE (mm) @ 400 ms | 80.6 | AnyPose1 |
| Pose Estimation | 3DPW | Average MPJPE (mm) @ 1000 ms | 84.4 | AnyPose1 |
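
The metric in the table, MPJPE (mean per-joint position error), is the Euclidean distance between predicted and ground-truth joint positions, averaged over joints. The sketch below shows that standard definition; the function name and array layout are assumptions, and how the reported averages are taken across frames or actions is not stated here.

```python
import numpy as np

def mpjpe(pred, target):
    """Mean per-joint position error, in the same units as the inputs (e.g. mm).

    pred, target: arrays of shape (..., num_joints, 3).
    The mean is taken over joints and over any leading dimensions.
    """
    pred = np.asarray(pred, dtype=float)
    target = np.asarray(target, dtype=float)
    # Per-joint Euclidean distance, then the mean over everything else.
    return float(np.linalg.norm(pred - target, axis=-1).mean())

# Two joints: one is off by a 3-4-5 triangle (5 mm), the other is exact.
pred = np.zeros((2, 3))
target = np.array([[3.0, 4.0, 0.0], [0.0, 0.0, 0.0]])
error_mm = mpjpe(pred, target)
```
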