Data sourced from the PWC Archive (CC-BY-SA 4.0). Built by the community, for the community.

CoBEV: Elevating Roadside 3D Object Detection with Depth and Height Complementarity

Hao Shi, Chengshan Pang, Jiaming Zhang, Kailun Yang, Yuhao Wu, Huajian Ni, Yining Lin, Rainer Stiefelhagen, Kaiwei Wang

2023-10-04 · feature selection · Monocular 3D Object Detection · object-detection · 3D Object Detection · Object Detection

Abstract

Roadside camera-driven 3D object detection is a crucial task in intelligent transportation systems: it extends the perception range beyond the limitations of vision-centric vehicles and enhances road safety. Previous studies are limited by using only depth or only height information; we find that both depth and height matter and that they are in fact complementary. The depth feature encompasses precise geometric cues, whereas the height feature primarily distinguishes between categories of height intervals, essentially providing semantic context. This insight motivates the development of Complementary-BEV (CoBEV), a novel end-to-end monocular 3D object detection framework that integrates depth and height to construct robust BEV representations. In essence, CoBEV estimates each pixel's depth and height distribution and lifts the camera features into 3D space for lateral fusion using the newly proposed two-stage complementary feature selection (CFS) module. A BEV feature distillation framework is also seamlessly integrated to further enhance detection accuracy using the prior knowledge of the fusion-modal CoBEV teacher. We conduct extensive experiments on the public roadside camera-based 3D detection benchmarks DAIR-V2X-I and Rope3D, as well as the private Supremind-Road dataset. The results demonstrate that CoBEV not only achieves new state-of-the-art accuracy, but also significantly improves on the robustness of previous methods in challenging long-distance scenarios and under noisy camera disturbance, and enhances generalization by a large margin in heterologous settings with drastic changes in scene and camera parameters. For the first time, the vehicle AP score of a camera model reaches 80% on DAIR-V2X-I in easy mode. The source code will be made publicly available at https://github.com/MasterHow/CoBEV.
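
The "lifting" step mentioned in the abstract can be illustrated with a small NumPy sketch. This is not the authors' implementation: it only shows the general idea of weighting each pixel's feature vector by a predicted per-pixel distribution over discrete bins, once for depth and once for height. The function names, tensor shapes, and bin counts are assumptions chosen for illustration, and the CFS fusion module is not reproduced.

```python
import numpy as np

def softmax(x, axis):
    """Numerically stable softmax along the given axis."""
    x = x - x.max(axis=axis, keepdims=True)
    e = np.exp(x)
    return e / e.sum(axis=axis, keepdims=True)

def lift_features(feat, bin_logits):
    """Spread image features over discrete bins (hypothetical helper).

    feat:       (C, H, W) image feature map
    bin_logits: (B, H, W) per-pixel logits over B depth or height bins
    returns:    (C, B, H, W) features weighted by the per-pixel bin distribution
    """
    prob = softmax(bin_logits, axis=0)            # each pixel's distribution sums to 1
    return feat[:, None, :, :] * prob[None, :, :, :]

# Toy sizes, assumed for illustration only.
rng = np.random.default_rng(0)
feat = rng.normal(size=(8, 4, 6))                 # C=8 channels, 4x6 feature map
depth_logits = rng.normal(size=(10, 4, 6))        # 10 depth bins
height_logits = rng.normal(size=(5, 4, 6))        # 5 height bins

depth_volume = lift_features(feat, depth_logits)    # shape (8, 10, 4, 6)
height_volume = lift_features(feat, height_logits)  # shape (8, 5, 4, 6)
```

In a full pipeline, each volume would then be projected into a common BEV grid using the camera parameters before the two branches are fused; that projection and the fusion are omitted here.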

Results

Task                 Dataset     Metric            Value  Model
Object Detection     Rope3D      AP@0.7            52.72  CoBEV
Object Detection     DAIR-V2X-I  AP|R40(easy)      82     CoBEV
Object Detection     DAIR-V2X-I  AP|R40(hard)      69.7   CoBEV
Object Detection     DAIR-V2X-I  AP|R40(moderate)  69.6   CoBEV
3D                   Rope3D      AP@0.7            52.72  CoBEV
3D                   DAIR-V2X-I  AP|R40(easy)      82     CoBEV
3D                   DAIR-V2X-I  AP|R40(hard)      69.7   CoBEV
3D                   DAIR-V2X-I  AP|R40(moderate)  69.6   CoBEV
3D Object Detection  Rope3D      AP@0.7            52.72  CoBEV
3D Object Detection  DAIR-V2X-I  AP|R40(easy)      82     CoBEV
3D Object Detection  DAIR-V2X-I  AP|R40(hard)      69.7   CoBEV
3D Object Detection  DAIR-V2X-I  AP|R40(moderate)  69.6   CoBEV
2D Classification    Rope3D      AP@0.7            52.72  CoBEV
2D Classification    DAIR-V2X-I  AP|R40(easy)      82     CoBEV
2D Classification    DAIR-V2X-I  AP|R40(hard)      69.7   CoBEV
2D Classification    DAIR-V2X-I  AP|R40(moderate)  69.6   CoBEV
2D Object Detection  Rope3D      AP@0.7            52.72  CoBEV
2D Object Detection  DAIR-V2X-I  AP|R40(easy)      82     CoBEV
2D Object Detection  DAIR-V2X-I  AP|R40(hard)      69.7   CoBEV
2D Object Detection  DAIR-V2X-I  AP|R40(moderate)  69.6   CoBEV
16k                  Rope3D      AP@0.7            52.72  CoBEV
16k                  DAIR-V2X-I  AP|R40(easy)      82     CoBEV
16k                  DAIR-V2X-I  AP|R40(hard)      69.7   CoBEV
16k                  DAIR-V2X-I  AP|R40(moderate)  69.6   CoBEV
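
For readers unfamiliar with the AP|R40 metric, the sketch below shows how such a score is commonly computed, assuming it follows the KITTI-style definition: interpolated precision averaged over 40 equally spaced recall positions (1/40, 2/40, ..., 1). The function name and the toy precision-recall points are hypothetical; this is not the benchmark's official evaluation code, which also handles matching detections to ground truth and difficulty filtering.

```python
import numpy as np

def ap_r40(recalls, precisions):
    """Interpolated average precision over 40 recall positions (sketch).

    recalls, precisions: matching sequences of points on a precision-recall curve.
    At each recall threshold, the best precision achieved at that recall or
    higher is used; thresholds that are never reached contribute zero.
    """
    recalls = np.asarray(recalls, dtype=float)
    precisions = np.asarray(precisions, dtype=float)
    thresholds = np.linspace(1.0 / 40, 1.0, 40)
    total = 0.0
    for t in thresholds:
        mask = recalls >= t - 1e-9      # small tolerance for floating-point thresholds
        if mask.any():
            total += precisions[mask].max()
    return total / 40

# Toy curve: perfect precision up to 50% recall, 0.5 precision at full recall.
score = ap_r40([0.5, 1.0], [1.0, 0.5])
```

Scores in the table are reported as percentages, so a value returned by this function would be multiplied by 100 for comparison.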

Related Papers

mNARX+: A surrogate model for complex dynamical systems using manifold-NARX and automatic feature selection (2025-07-17)
A Real-Time System for Egocentric Hand-Object Interaction Detection in Industrial Domains (2025-07-17)
RS-TinyNet: Stage-wise Feature Fusion Network for Detecting Tiny Objects in Remote Sensing Images (2025-07-17)
Decoupled PROB: Decoupled Query Initialization Tasks and Objectness-Class Learning for Open World Object Detection (2025-07-17)
Dual LiDAR-Based Traffic Movement Count Estimation at a Signalized Intersection: Deployment, Data Collection, and Preliminary Analysis (2025-07-17)
Vision-based Perception for Autonomous Vehicles in Obstacle Avoidance Scenarios (2025-07-16)
Interpretable Bayesian Tensor Network Kernel Machines with Automatic Rank and Feature Selection (2025-07-15)
Tomato Multi-Angle Multi-Pose Dataset for Fine-Grained Phenotyping (2025-07-15)