Data sourced from the PWC Archive (CC-BY-SA 4.0). Built by the community, for the community.

HRFuser: A Multi-resolution Sensor Fusion Architecture for 2D Object Detection

Tim Broedermann, Christos Sakaridis, Dengxin Dai, Luc van Gool

2022-06-30 | Autonomous Vehicles | Sensor Fusion | Semantic Segmentation | 2D Object Detection | object-detection | 3D Object Detection | Object Detection

Abstract

Besides standard cameras, autonomous vehicles typically include multiple additional sensors, such as lidars and radars, which help acquire richer information for perceiving the content of the driving scene. While several recent works focus on fusing certain pairs of sensors - such as camera with lidar or radar - by using architectural components specific to the examined setting, a generic and modular sensor fusion architecture is missing from the literature. In this work, we propose HRFuser, a modular architecture for multi-modal 2D object detection. It fuses multiple sensors in a multi-resolution fashion and scales to an arbitrary number of input modalities. The design of HRFuser is based on state-of-the-art high-resolution networks for image-only dense prediction and incorporates a novel multi-window cross-attention block as the means to perform fusion of multiple modalities at multiple resolutions. We demonstrate via extensive experiments on nuScenes and the adverse conditions DENSE datasets that our model effectively leverages complementary features from additional modalities, substantially improving upon camera-only performance and consistently outperforming state-of-the-art 3D and 2D fusion methods evaluated on 2D object detection metrics. The source code is publicly available.
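The fusion idea in the abstract, camera features attending to another modality's features inside local windows, can be illustrated with a small NumPy sketch. This is not the authors' implementation: it assumes a single window size, a single attention head, no learned projections, and residual addition as the merge step. The names `window_partition`, `window_merge`, `windowed_cross_attention`, and `fuse` are hypothetical and exist only for this illustration.

```python
import numpy as np

def window_partition(x, win):
    """Split an (H, W, C) feature map into (num_windows, win*win, C) token groups."""
    H, W, C = x.shape
    assert H % win == 0 and W % win == 0, "H and W must be divisible by win"
    x = x.reshape(H // win, win, W // win, win, C)
    return x.transpose(0, 2, 1, 3, 4).reshape(-1, win * win, C)

def window_merge(windows, H, W, win):
    """Inverse of window_partition: rebuild the (H, W, C) feature map."""
    C = windows.shape[-1]
    x = windows.reshape(H // win, W // win, win, win, C)
    return x.transpose(0, 2, 1, 3, 4).reshape(H, W, C)

def softmax(z, axis=-1):
    z = z - z.max(axis=axis, keepdims=True)  # subtract max for numerical stability
    e = np.exp(z)
    return e / e.sum(axis=axis, keepdims=True)

def windowed_cross_attention(camera, other, win):
    """Camera tokens are queries; the other modality supplies keys and values.
    Attention is restricted to tokens that share the same local window.
    Assumption: identity projections (a real block would learn Q/K/V weights)."""
    H, W, _ = camera.shape
    q = window_partition(camera, win)                           # (N, T, C)
    kv = window_partition(other, win)                           # (N, T, C)
    scores = q @ kv.transpose(0, 2, 1) / np.sqrt(q.shape[-1])   # (N, T, T)
    attended = softmax(scores) @ kv                             # (N, T, C)
    return window_merge(attended, H, W, win)

def fuse(camera, others, win=4):
    """Add one cross-attention term per extra modality to the camera features,
    so the number of additional modalities is arbitrary."""
    fused = camera.copy()
    for feat in others:
        fused = fused + windowed_cross_attention(camera, feat, win)
    return fused

rng = np.random.default_rng(0)
camera = rng.standard_normal((8, 8, 16))
lidar = rng.standard_normal((8, 8, 16))
radar = rng.standard_normal((8, 8, 16))
fused = fuse(camera, [lidar, radar], win=4)
```

To mimic fusion at multiple resolutions, `fuse` would be called once per resolution branch, with each modality's feature map resized to that branch's spatial size. How the actual model builds and merges its branches is not specified in this listing.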

Results

| Task | Dataset | Metric | Value | Model |
|---|---|---|---|---|
| Semantic Segmentation | KITTI-360 | mIoU | 52.61 | HRFuser (RGB-D-LiDAR) |
| Semantic Segmentation | KITTI-360 | mIoU | 49.32 | HRFuser (RGB-Depth) |
| Semantic Segmentation | KITTI-360 | mIoU | 48.74 | HRFuser (RGB-LiDAR) |
| Semantic Segmentation | DeLiVER | mIoU | 52.97 | HRFuser (RGB-D-E-Li) |
| Semantic Segmentation | DeLiVER | mIoU | 52.72 | HRFuser (RGB-D-LiDAR) |
| Semantic Segmentation | DeLiVER | mIoU | 51.88 | HRFuser (RGB-Depth) |
| Semantic Segmentation | DeLiVER | mIoU | 51.83 | HRFuser (RGB-D-Event) |
| Semantic Segmentation | DeLiVER | mIoU | 47.95 | HRFuser (RGB) |
| Semantic Segmentation | DeLiVER | mIoU | 43.13 | HRFuser (RGB-LiDAR) |
| Semantic Segmentation | DeLiVER | mIoU | 42.22 | HRFuser (RGB-Event) |
| Object Detection | InOutDoor | AP | 58.6 | HRFuser |
| Object Detection | EventPed | AP | 46 | HRFuser |
| Object Detection | STCrowd | AP | 49 | HRFuser |
| 3D | InOutDoor | AP | 58.6 | HRFuser |
| 3D | EventPed | AP | 46 | HRFuser |
| 3D | STCrowd | AP | 49 | HRFuser |
| 2D Classification | InOutDoor | AP | 58.6 | HRFuser |
| 2D Classification | EventPed | AP | 46 | HRFuser |
| 2D Classification | STCrowd | AP | 49 | HRFuser |
| 2D Object Detection | Dense Fog | dense fog hard (AP) | 78.21 | HRFuser-T |
| 2D Object Detection | Dense Fog | light fog hard (AP) | 86.5 | HRFuser-T |
| 2D Object Detection | Dense Fog | snow/rain hard (AP) | 78.09 | HRFuser-T |
| 2D Object Detection | Clear Weather | clear hard (AP) | 79.48 | HRFuser-T |
| 2D Object Detection | InOutDoor | AP | 58.6 | HRFuser |
| 2D Object Detection | EventPed | AP | 46 | HRFuser |
| 2D Object Detection | STCrowd | AP | 49 | HRFuser |
| 10-shot image generation | KITTI-360 | mIoU | 52.61 | HRFuser (RGB-D-LiDAR) |
| 10-shot image generation | KITTI-360 | mIoU | 49.32 | HRFuser (RGB-Depth) |
| 10-shot image generation | KITTI-360 | mIoU | 48.74 | HRFuser (RGB-LiDAR) |
| 10-shot image generation | DeLiVER | mIoU | 52.97 | HRFuser (RGB-D-E-Li) |
| 10-shot image generation | DeLiVER | mIoU | 52.72 | HRFuser (RGB-D-LiDAR) |
| 10-shot image generation | DeLiVER | mIoU | 51.88 | HRFuser (RGB-Depth) |
| 10-shot image generation | DeLiVER | mIoU | 51.83 | HRFuser (RGB-D-Event) |
| 10-shot image generation | DeLiVER | mIoU | 47.95 | HRFuser (RGB) |
| 10-shot image generation | DeLiVER | mIoU | 43.13 | HRFuser (RGB-LiDAR) |
| 10-shot image generation | DeLiVER | mIoU | 42.22 | HRFuser (RGB-Event) |
| 16k | InOutDoor | AP | 58.6 | HRFuser |
| 16k | EventPed | AP | 46 | HRFuser |
| 16k | STCrowd | AP | 49 | HRFuser |

Related Papers

SeC: Advancing Complex Video Object Segmentation via Progressive Concept Construction (2025-07-21)
DiffOSeg: Omni Medical Image Segmentation via Multi-Expert Collaboration Diffusion Model (2025-07-17)
SCORE: Scene Context Matters in Open-Vocabulary Remote Sensing Instance Segmentation (2025-07-17)
Unified Medical Image Segmentation with State Space Modeling Snake (2025-07-17)
A Privacy-Preserving Semantic-Segmentation Method Using Domain-Adaptation Technique (2025-07-17)
A Real-Time System for Egocentric Hand-Object Interaction Detection in Industrial Domains (2025-07-17)
RS-TinyNet: Stage-wise Feature Fusion Network for Detecting Tiny Objects in Remote Sensing Images (2025-07-17)
Decoupled PROB: Decoupled Query Initialization Tasks and Objectness-Class Learning for Open World Object Detection (2025-07-17)