Karttikeya Mangalam, Yang An, Harshayu Girase, Jitendra Malik
Human trajectory forecasting is an inherently multi-modal problem. Uncertainty in future trajectories stems from two sources: (a) sources that are known to the agent but unknown to the model, such as long term goals, and (b) sources that are unknown to both the agent & the model, such as the intent of other agents & irreducible randomness in decisions. We propose to factorize this uncertainty into its epistemic & aleatoric sources. We model the epistemic uncertainty through multimodality in long term goals and the aleatoric uncertainty through multimodality in waypoints & paths. To exemplify this dichotomy, we also propose a novel long term trajectory forecasting setting, with prediction horizons up to a minute, an order of magnitude longer than prior works. Finally, we present Y-net, a scene compliant trajectory forecasting network that exploits the proposed epistemic & aleatoric structure for diverse trajectory predictions across long prediction horizons. Y-net significantly improves previous state-of-the-art performance on both (a) the well-studied short prediction horizon settings on the Stanford Drone & ETH/UCY datasets and (b) the proposed long prediction horizon setting on the re-purposed Stanford Drone & Intersection Drone datasets.
| Task | Dataset | Metric | Value | Model |
|---|---|---|---|---|
| Trajectory Prediction | ETH/UCY | ADE-8/12 | 0.18 | Y-Net |
| Trajectory Prediction | ETH/UCY | FDE-8/12 | 0.27 | Y-Net |
| Trajectory Prediction | Stanford Drone | ADE-8/12 @K = 20 | 7.85 | Y-Net |
| Trajectory Prediction | Stanford Drone | FDE-8/12 @K = 20 | 11.85 | Y-Net |
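
For readers unfamiliar with the metrics in the table, the following is a minimal sketch of how ADE and FDE at K samples are commonly computed. It is not taken from the Y-net code. It assumes the usual conventions: ADE is the mean Euclidean distance over all predicted time steps, FDE is the distance at the final step, and "@K" reports the best value among K sampled trajectories, with the minimum taken separately for each metric. The function name `min_ade_fde` and the toy data are hypothetical, and the 12-step horizon is an assumed reading of "8/12" as 8 observed and 12 predicted steps.

```python
import numpy as np


def min_ade_fde(pred, gt):
    """Best-of-K ADE and FDE for one agent (hypothetical helper).

    pred: array of shape (K, T, 2), K sampled future trajectories.
    gt:   array of shape (T, 2), the ground-truth future trajectory.
    """
    # Per-sample, per-step Euclidean distance to the ground truth: shape (K, T).
    dists = np.linalg.norm(pred - gt[None, :, :], axis=-1)
    ade = dists.mean(axis=1).min()  # average over time, best of K
    fde = dists[:, -1].min()        # final step only, best of K
    return float(ade), float(fde)


# Toy example: ground truth moves along the x-axis for 12 steps.
T = 12
gt = np.stack([np.arange(1, T + 1, dtype=float), np.zeros(T)], axis=-1)

# Two samples (K = 2): one offset by 1 unit in y, one offset by 3 units in y.
pred = np.stack([gt + np.array([0.0, 1.0]), gt + np.array([0.0, 3.0])])

ade, fde = min_ade_fde(pred, gt)
print(ade, fde)
```

In this toy case the closer sample is offset by exactly one unit at every step, so both metrics come out as 1.0. Dataset-level numbers like those in the table are typically averages of these per-agent values, in the dataset's own units.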