FIERY: Future Instance Prediction in Bird's-Eye View from Surround Monocular Cameras

Anthony Hu, Zak Murez, Nikhil Mohan, Sofía Dudas, Jeffrey Hawke, Vijay Badrinarayanan, Roberto Cipolla, Alex Kendall

2021-04-21, ICCV 2021

Tasks: Sensor Fusion, Navigate, Future prediction, Autonomous Driving, Semantic Segmentation, Prediction, Bird's-Eye View Semantic Segmentation, Instance Segmentation


Abstract

Driving requires interacting with road agents and predicting their future behaviour in order to navigate safely. We present FIERY: a probabilistic future prediction model in bird's-eye view from monocular cameras. Our model predicts future instance segmentation and motion of dynamic agents that can be transformed into non-parametric future trajectories. Our approach combines the perception, sensor fusion and prediction components of a traditional autonomous driving stack by estimating bird's-eye-view prediction directly from surround RGB monocular camera inputs. FIERY learns to model the inherent stochastic nature of the future solely from camera driving data in an end-to-end manner, without relying on HD maps, and predicts multimodal future trajectories. We show that our model outperforms previous prediction baselines on the NuScenes and Lyft datasets. The code and trained models are available at https://github.com/wayveai/fiery.
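The abstract notes that predicted future instance segmentations can be transformed into non-parametric trajectories. The sketch below illustrates one simple way to do that, by tracking the centroid of each instance ID across future frames. It is not taken from the FIERY code base: the function name `instance_trajectories`, the input layout (an integer array of shape `(T, H, W)` with 0 as background) and the assumption that instance IDs are already consistent over time are all hypothetical choices made for illustration.

```python
import numpy as np


def instance_trajectories(instance_seq):
    """Return per-instance centroid tracks from a sequence of BEV instance maps.

    instance_seq: int array of shape (T, H, W); 0 is background and each
    positive value is an instance ID assumed to be consistent across time.
    Returns a dict mapping ID -> float array of shape (T, 2) holding the
    (row, col) centroid per frame, with NaN where the instance is absent.
    """
    instance_seq = np.asarray(instance_seq)
    num_frames = instance_seq.shape[0]
    ids = np.unique(instance_seq)
    ids = ids[ids > 0]  # drop the background label

    trajectories = {}
    for k in ids:
        track = np.full((num_frames, 2), np.nan)
        for t in range(num_frames):
            rows, cols = np.nonzero(instance_seq[t] == k)
            if rows.size:  # instance visible in this frame
                track[t] = (rows.mean(), cols.mean())
        trajectories[int(k)] = track
    return trajectories


# Toy example: one 2x2 instance moving one cell to the right per frame.
seq = np.zeros((3, 8, 8), dtype=int)
for t in range(3):
    seq[t, 2:4, 1 + t:3 + t] = 1
tracks = instance_trajectories(seq)
```

Because each track is just a list of positions read off the predicted masks, no parametric motion model is imposed, which is the sense in which such trajectories are non-parametric.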

Results

Task	Dataset	Metric	Value	Model
Semantic Segmentation	nuScenes	IoU ped - 224x480 - Vis filter. - 100x100 at 0.5	17.2	FIERY (static)
Semantic Segmentation	nuScenes	IoU veh - 224x480 - No vis filter - 100x100 at 0.5	35.8	FIERY (static)
Semantic Segmentation	nuScenes	IoU veh - 224x480 - Vis filter. - 100x100 at 0.5	39.8	FIERY (static)
Semantic Segmentation	nuScenes	IoU veh - 224x480 - No vis filter - 100x100 at 0.5	38.2	FIERY
Semantic Segmentation	nuScenes	IoU veh - 224x480 - No vis filter - 100x50 at 0.25	41.1	FIERY
Semantic Segmentation	nuScenes	IoU vehicle - Setting 3	58.5	FIERY
Semantic Segmentation	Lyft Level 5	IoU vehicle - 224x480 - Long	36.7	FIERY
Semantic Segmentation	Lyft Level 5	IoU vehicle - 224x480 - Short	59.4	FIERY
Bird's-Eye View Semantic Segmentation	nuScenes	IoU ped - 224x480 - Vis filter. - 100x100 at 0.5	17.2	FIERY (static)
Bird's-Eye View Semantic Segmentation	nuScenes	IoU veh - 224x480 - No vis filter - 100x100 at 0.5	35.8	FIERY (static)
Bird's-Eye View Semantic Segmentation	nuScenes	IoU veh - 224x480 - Vis filter. - 100x100 at 0.5	39.8	FIERY (static)
Bird's-Eye View Semantic Segmentation	nuScenes	IoU veh - 224x480 - No vis filter - 100x100 at 0.5	38.2	FIERY
Bird's-Eye View Semantic Segmentation	nuScenes	IoU veh - 224x480 - No vis filter - 100x50 at 0.25	41.1	FIERY
Bird's-Eye View Semantic Segmentation	nuScenes	IoU vehicle - Setting 3	58.5	FIERY
Bird's-Eye View Semantic Segmentation	Lyft Level 5	IoU vehicle - 224x480 - Long	36.7	FIERY
Bird's-Eye View Semantic Segmentation	Lyft Level 5	IoU vehicle - 224x480 - Short	59.4	FIERY
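All values in the table are intersection-over-union (IoU) scores, reported as percentages. As a rough illustration of how such a score is computed on a bird's-eye-view grid, here is a minimal sketch. It assumes that a label such as "100x100 at 0.5" denotes a 100 m x 100 m area at 0.5 m per cell (a 200 x 200 grid); the helper name `bev_iou` and the toy masks are hypothetical and are not the evaluation code used to produce the numbers above.

```python
import numpy as np


def bev_iou(pred, target):
    """Intersection-over-union between two binary BEV occupancy masks."""
    pred = np.asarray(pred, dtype=bool)
    target = np.asarray(target, dtype=bool)
    union = np.logical_or(pred, target).sum()
    if union == 0:
        return float("nan")  # IoU is undefined when both masks are empty
    intersection = np.logical_and(pred, target).sum()
    return float(intersection / union)


# Assumed grid: 100 m x 100 m at 0.5 m per cell -> 200 x 200 cells.
grid_shape = (int(100 / 0.5), int(100 / 0.5))

target = np.zeros(grid_shape, dtype=bool)
target[90:100, 95:105] = True  # ground-truth vehicle footprint (100 cells)

pred = np.zeros(grid_shape, dtype=bool)
pred[92:102, 95:105] = True    # prediction shifted by two cells (100 cells)

iou_percent = 100.0 * bev_iou(pred, target)  # 80 / 120 cells overlap
```

In this toy case the two footprints share 80 cells out of a union of 120, so the score is about 66.7%. A finer grid resolution penalises small localisation errors more heavily, which is one reason scores under different grid settings are not directly comparable.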
