Yu Yao, Ella Atkins, Matthew Johnson-Roberson, Ram Vasudevan, Xiaoxiao Du
Pedestrian trajectory prediction is an essential task in robotic applications such as autonomous driving and robot navigation. State-of-the-art trajectory predictors use a conditional variational autoencoder (CVAE) with recurrent neural networks (RNNs) to encode observed trajectories and decode multi-modal future trajectories. This process can suffer from accumulated errors over long prediction horizons (>=2 seconds). This paper presents BiTraP, a goal-conditioned bi-directional multi-modal trajectory prediction method based on the CVAE. BiTraP estimates the goal (end-point) of trajectories and introduces a novel bi-directional decoder to improve longer-term trajectory prediction accuracy. Extensive experiments show that BiTraP generalizes to both first-person view (FPV) and bird's-eye view (BEV) scenarios and outperforms state-of-the-art results by ~10-50%. We also show that different choices of non-parametric versus parametric target models in the CVAE directly influence the predicted multi-modal trajectory distributions. These results provide guidance on trajectory predictor design for robotic applications such as collision avoidance and navigation systems.
| Task | Dataset | Metric | Value | Model |
|---|---|---|---|---|
| Trajectory Prediction | JAAD | CF_MSE(1.5) | 4565 | BiTraP-D |
| Trajectory Prediction | JAAD | C_MSE(1.5) | 1105 | BiTraP-D |
| Trajectory Prediction | JAAD | MSE(0.5) | 93 | BiTraP-D |
| Trajectory Prediction | JAAD | MSE(1.0) | 378 | BiTraP-D |
| Trajectory Prediction | JAAD | MSE(1.5) | 1206 | BiTraP-D |
| Trajectory Prediction | PIE | CF_MSE(1.5) | 1949 | BiTraP-D |
| Trajectory Prediction | PIE | C_MSE(1.5) | 481 | BiTraP-D |
| Trajectory Prediction | PIE | MSE(0.5) | 41 | BiTraP-D |
| Trajectory Prediction | PIE | MSE(1.0) | 161 | BiTraP-D |
| Trajectory Prediction | PIE | MSE(1.5) | 511 | BiTraP-D |
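
The metric names in the table are not defined on this page. As a rough illustration, the sketch below assumes the convention commonly used for the JAAD and PIE benchmarks: MSE(t) is the mean squared error over bounding-box coordinates (in pixels) for all predicted frames up to t seconds, C_MSE is the same error computed on box centers, and CF_MSE is the center error at the final frame only. The function names, the `(x1, y1, x2, y2)` box layout, and the averaging over coordinates are assumptions made for illustration; the official evaluation code may normalize differently.

```python
def box_center(box):
    """Center (cx, cy) of a box given as (x1, y1, x2, y2) in pixels (assumed layout)."""
    x1, y1, x2, y2 = box
    return ((x1 + x2) / 2.0, (y1 + y2) / 2.0)


def mean_squared_error(pred, target):
    """Mean of squared differences over all time steps and all coordinates."""
    total, count = 0.0, 0
    for pred_step, target_step in zip(pred, target):
        for p, t in zip(pred_step, target_step):
            total += (p - t) ** 2
            count += 1
    return total / count


def trajectory_metrics(pred_boxes, true_boxes, horizon_frames):
    """Hypothetical helper: MSE, C_MSE and CF_MSE for one trajectory.

    horizon_frames is the number of predicted frames to score, e.g. 45 for a
    1.5 s horizon if the video runs at 30 frames per second (an assumption).
    """
    pred = pred_boxes[:horizon_frames]
    true = true_boxes[:horizon_frames]
    pred_centers = [box_center(b) for b in pred]
    true_centers = [box_center(b) for b in true]
    return {
        "MSE": mean_squared_error(pred, true),
        "C_MSE": mean_squared_error(pred_centers, true_centers),
        # Final-frame center error: only the last scored time step counts.
        "CF_MSE": mean_squared_error(pred_centers[-1:], true_centers[-1:]),
    }
```

Under these assumptions, CF_MSE isolates the end-point error, which is the quantity a goal-conditioned predictor targets most directly, while MSE and C_MSE average the error over the whole horizon.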