TasksSotADatasetsPapersMethodsSubmitAbout
Papers With Code 2

A community resource for machine learning research: papers, code, benchmarks, and state-of-the-art results.

Explore

Notable BenchmarksAll SotADatasetsPapersMethods

Community

Submit ResultsAbout

Data sourced from the PWC Archive (CC-BY-SA 4.0). Built by the community, for the community.

Papers/Structured Bird's-Eye-View Traffic Scene Understanding fro...

Structured Bird's-Eye-View Traffic Scene Understanding from Onboard Images

Yigit Baran Can, Alexander Liniger, Danda Pani Paudel, Luc van Gool

2021-10-05ICCV 2021 10Scene UnderstandingAutonomous NavigationLane Detection
PaperPDFCode(official)Code

Abstract

Autonomous navigation requires structured representation of the road network and instance-wise identification of the other traffic agents. Since the traffic scene is defined on the ground plane, this corresponds to scene understanding in the bird's-eye-view (BEV). However, the onboard cameras of autonomous cars are customarily mounted horizontally for a better view of the surrounding, making this task very challenging. In this work, we study the problem of extracting a directed graph representing the local road network in BEV coordinates, from a single onboard camera image. Moreover, we show that the method can be extended to detect dynamic objects on the BEV plane. The semantics, locations, and orientations of the detected objects together with the road graph facilitates a comprehensive understanding of the scene. Such understanding becomes fundamental for the downstream tasks, such as path planning and navigation. We validate our approach against powerful baselines and show that our network achieves superior performance. We also demonstrate the effects of various design choices through ablation studies. Code: https://github.com/ybarancan/STSU

Results

TaskDatasetMetricValueModel
Autonomous VehiclesnuScenesF1 score0.56STSU
Autonomous VehiclesnuScenesIoU0.389STSU
Lane DetectionnuScenesF1 score0.56STSU
Lane DetectionnuScenesIoU0.389STSU

Related Papers

Advancing Complex Wide-Area Scene Understanding with Hierarchical Coresets Selection2025-07-17Argus: Leveraging Multiview Images for Improved 3-D Scene Understanding With Large Language Models2025-07-17City-VLM: Towards Multidomain Perception Scene Understanding via Multimodal Incomplete Learning2025-07-17Evaluating Reinforcement Learning Algorithms for Navigation in Simulated Robotic Quadrupeds: A Comparative Study Inspired by Guide Dog Behaviour2025-07-17Vision-based Perception for Autonomous Vehicles in Obstacle Avoidance Scenarios2025-07-16Learning to Tune Like an Expert: Interpretable and Scene-Aware Navigation via MLLM Reasoning and CVAE-Based Adaptation2025-07-15Tactical Decision for Multi-UGV Confrontation with a Vision-Language Model-Based Commander2025-07-15Seeing the Signs: A Survey of Edge-Deployable OCR Models for Billboard Visibility Analysis2025-07-15