Data sourced from the PWC Archive (CC-BY-SA 4.0). Built by the community, for the community.

SA-BEV: Generating Semantic-Aware Bird's-Eye-View Feature for Multi-view 3D Object Detection

Jinqing Zhang, Yanan Zhang, Qingjie Liu, Yunhong Wang

2023-07-21 | ICCV 2023 | 3D Object Detection

Abstract

Pure camera-based Bird's-Eye-View (BEV) perception has recently become a feasible solution for economical autonomous driving. However, existing BEV-based multi-view 3D detectors generally transform all image features into BEV features, without considering that the large proportion of background information may submerge the object information. In this paper, we propose Semantic-Aware BEV Pooling (SA-BEVPool), which filters out background information according to the semantic segmentation of image features and transforms image features into semantic-aware BEV features. Accordingly, we propose BEV-Paste, an effective data augmentation strategy that closely matches the semantic-aware BEV features. In addition, we design a Multi-Scale Cross-Task (MSCT) head, which combines task-specific and cross-task information to predict depth distribution and semantic segmentation more accurately, further improving the quality of the semantic-aware BEV features. Finally, we integrate the above modules into a novel multi-view 3D object detection framework, namely SA-BEV. Experiments on nuScenes show that SA-BEV achieves state-of-the-art performance. Code is available at https://github.com/mengtan00/SA-BEV.git.
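
The idea behind semantic-aware pooling can be illustrated with a small NumPy sketch. This is not the authors' implementation (see the official repository for that). The function name, the argument layout, the threshold value, and the choice to weight each lifted point by depth probability times foreground probability are all assumptions made for illustration. The sketch only shows the general pattern the abstract describes: pixels with a low semantic score are dropped before their features are accumulated into BEV cells.

```python
import numpy as np


def semantic_aware_bev_pool(feats, depth_prob, sem_prob, bev_index,
                            num_cells, threshold=0.25):
    """Hypothetical sketch of pooling image features into a BEV grid
    while skipping background pixels.

    feats:      (N, C) float array, one feature vector per image pixel
    depth_prob: (N, D) float array, per-pixel distribution over D depth bins
    sem_prob:   (N,)   float array, per-pixel foreground probability
    bev_index:  (N, D) int array, flat BEV cell hit by each (pixel, depth bin)
                point, or -1 if the point falls outside the grid
    num_cells:  number of cells in the flattened BEV grid
    threshold:  assumed foreground cutoff (illustrative value)
    """
    # Assumed weighting: depth probability scaled by foreground probability.
    weights = depth_prob * sem_prob[:, None]                      # (N, D)

    # Keep only points from foreground pixels that land inside the grid.
    keep = (sem_prob[:, None] >= threshold) & (bev_index >= 0)    # (N, D)
    pix, dbin = np.nonzero(keep)

    bev = np.zeros((num_cells, feats.shape[1]), dtype=np.float64)
    contrib = weights[pix, dbin][:, None] * feats[pix]            # (K, C)

    # Unbuffered scatter-add so points sharing a cell are summed.
    np.add.at(bev, bev_index[pix, dbin], contrib)
    return bev
```

With the threshold set to 0, every in-grid point contributes and the function reduces to ordinary weighted BEV pooling, which makes the effect of the background filter easy to compare.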

Results

Task                 Dataset               Metric  Value  Model
Object Detection     nuScenes Camera Only  NDS     62.4   SA-BEV
3D                   nuScenes Camera Only  NDS     62.4   SA-BEV
3D Object Detection  nuScenes Camera Only  NDS     62.4   SA-BEV
2D Classification    nuScenes Camera Only  NDS     62.4   SA-BEV
2D Object Detection  nuScenes Camera Only  NDS     62.4   SA-BEV
16k                  nuScenes Camera Only  NDS     62.4   SA-BEV

Related Papers

Dual LiDAR-Based Traffic Movement Count Estimation at a Signalized Intersection: Deployment, Data Collection, and Preliminary Analysis (2025-07-17)
Beyond One Shot, Beyond One Perspective: Cross-View and Long-Horizon Distillation for Better LiDAR Representations (2025-07-07)
MambaFusion: Height-Fidelity Dense Global Fusion for Multi-modal 3D Object Detection (2025-07-06)
A Survey of Multi-sensor Fusion Perception for Embodied AI: Background, Methods, Challenges and Prospects (2025-06-24)
Teleoperated Driving: a New Challenge for 3D Object Detection in Compressed Point Clouds (2025-06-13)
Vision-based Lifting of 2D Object Detections for Automated Driving (2025-06-13)
DySS: Dynamic Queries and State-Space Learning for Efficient 3D Object Detection from Multi-Camera Videos (2025-06-11)
Gaussian2Scene: 3D Scene Representation Learning via Self-supervised Learning with 3D Gaussian Splatting (2025-06-10)