Papers With Code 2

A community resource for machine learning research: papers, code, benchmarks, and state-of-the-art results.


Data sourced from the PWC Archive (CC-BY-SA 4.0). Built by the community, for the community.


Multi-Modal 3D Object Detection by Box Matching

Zhe Liu, Xiaoqing Ye, Zhikang Zou, Xinwei He, Xiao Tan, Errui Ding, Jingdong Wang, Xiang Bai

2023-05-12 · Autonomous Driving · Object Detection · 3D Object Detection
Paper · PDF · Code (official)

Abstract

Multi-modal 3D object detection has received growing attention because the information from different sensors, such as LiDAR and cameras, is complementary. Most fusion methods for 3D detection rely on accurate alignment and calibration between 3D point clouds and RGB images. However, this assumption is not reliable in a real-world self-driving system, as the alignment between modalities is easily affected by asynchronous sensors and disturbed sensor placement. We propose a novel Fusion network by Box Matching (FBMNet) for multi-modal 3D detection, which provides an alternative approach to cross-modal feature alignment: it learns the correspondence at the bounding-box level, removing the dependency on calibration during inference. With the learned assignments between 3D and 2D object proposals, fusion for detection can be effectively performed by combining their ROI features. Extensive experiments on the nuScenes dataset demonstrate that our method is much more stable than existing fusion methods when dealing with challenging cases such as asynchronous sensors, misaligned sensor placement, and degraded camera images. We hope that FBMNet can provide a practical solution for handling these challenging cases safely in real autonomous driving scenarios. Code will be publicly available at https://github.com/happinesslz/FBMNet.
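The abstract's core idea, matching 3D and 2D proposals at the box level and then fusing their ROI features, can be illustrated with a minimal sketch. This is a hypothetical stand-in, not the paper's FBMNet implementation: the greedy similarity-based matcher below substitutes for the learned assignment described in the abstract, and all function names are illustrative assumptions.

```python
def cosine_similarity(a, b):
    """Cosine similarity between two equal-length feature vectors."""
    dot = sum(x * y for x, y in zip(a, b))
    na = sum(x * x for x in a) ** 0.5
    nb = sum(y * y for y in b) ** 0.5
    return dot / (na * nb) if na and nb else 0.0

def match_boxes(feats_3d, feats_2d):
    """Greedily assign each 3D proposal to at most one 2D proposal by
    descending feature similarity (a stand-in for a learned assignment)."""
    pairs = [(cosine_similarity(f3, f2), i, j)
             for i, f3 in enumerate(feats_3d)
             for j, f2 in enumerate(feats_2d)]
    pairs.sort(reverse=True)
    used_3d, used_2d, assignment = set(), set(), {}
    for sim, i, j in pairs:
        if i not in used_3d and j not in used_2d:
            assignment[i] = j
            used_3d.add(i)
            used_2d.add(j)
    return assignment

def fuse_roi_features(feats_3d, feats_2d, assignment):
    """Fuse matched ROI features by element-wise averaging; unmatched
    3D proposals keep their LiDAR-only features."""
    fused = []
    for i, f3 in enumerate(feats_3d):
        if i in assignment:
            f2 = feats_2d[assignment[i]]
            fused.append([(a + b) / 2 for a, b in zip(f3, f2)])
        else:
            fused.append(list(f3))
    return fused
```

Because the matching operates on proposal features rather than on projected geometry, no camera-LiDAR calibration matrix is needed at this step, which is the property the abstract highlights for robustness to sensor misalignment.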

Results

Task                  Dataset   Metric  Value  Model
3D Object Detection   nuScenes  NDS     0.721  FBMNet (Ours)
3D Object Detection   nuScenes  mAP     0.689  FBMNet (Ours)
Object Detection      nuScenes  NDS     0.721  FBMNet (Ours)
Object Detection      nuScenes  mAP     0.689  FBMNet (Ours)

Related Papers

GEMINUS: Dual-aware Global and Scene-Adaptive Mixture-of-Experts for End-to-End Autonomous Driving (2025-07-19)
AGENTS-LLM: Augmentative GENeration of Challenging Traffic Scenarios with an Agentic LLM Framework (2025-07-18)
World Model-Based End-to-End Scene Generation for Accident Anticipation in Autonomous Driving (2025-07-17)
Orbis: Overcoming Challenges of Long-Horizon Prediction in Driving World Models (2025-07-17)
Channel-wise Motion Features for Efficient Motion Segmentation (2025-07-17)
LaViPlan: Language-Guided Visual Path Planning with RLVR (2025-07-17)
A Real-Time System for Egocentric Hand-Object Interaction Detection in Industrial Domains (2025-07-17)
RS-TinyNet: Stage-wise Feature Fusion Network for Detecting Tiny Objects in Remote Sensing Images (2025-07-17)