Hu Cao, Zehua Zhang, Yan Xia, Xinyi Li, Jiahao Xia, Guang Chen, Alois Knoll
In frame-based vision, object detection suffers substantial performance degradation under challenging conditions due to the limited sensing capability of conventional cameras. Event cameras output sparse and asynchronous events, offering a potential solution to these problems. However, effectively fusing the two heterogeneous modalities remains an open issue. In this work, we propose a novel hierarchical feature refinement network for event-frame fusion. The core of our design is a coarse-to-fine fusion module, denoted the cross-modality adaptive feature refinement (CAFR) module. In the initial phase, the bidirectional cross-modality interaction (BCI) part bridges information between the two distinct sources. Subsequently, the features are further refined by aligning the channel-level mean and variance in the two-fold adaptive feature refinement (TAFR) part. We conduct extensive experiments on two benchmarks: the low-resolution PKU-DDD17-Car dataset and the high-resolution DSEC dataset. Experimental results show that our method surpasses the state-of-the-art by an impressive margin of $\textbf{8.0}\%$ on the DSEC dataset. Moreover, our method exhibits significantly better robustness (\textbf{69.5}\% versus \textbf{38.7}\%) when 15 different corruption types are applied to the frame images. Code is available at https://github.com/HuCaoFighting/FRN.
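The TAFR step described above aligns the channel-level mean and variance of the two feature streams. As a rough illustration of what such statistics alignment looks like, here is a minimal NumPy sketch that re-normalizes event features to match the per-channel statistics of frame features (an AdaIN-style simplification; the function name, shapes, and the exact alignment direction are assumptions, not the paper's implementation):

```python
import numpy as np

def align_channel_stats(event_feat, frame_feat, eps=1e-5):
    """Align per-channel mean and variance of event features to those of
    frame features (an assumed simplification of the TAFR refinement)."""
    # event_feat, frame_feat: (N, C, H, W) feature maps
    axes = (0, 2, 3)
    e_mean = event_feat.mean(axis=axes, keepdims=True)
    e_std = event_feat.std(axis=axes, keepdims=True) + eps
    f_mean = frame_feat.mean(axis=axes, keepdims=True)
    f_std = frame_feat.std(axis=axes, keepdims=True) + eps
    # normalize event features, then re-scale with frame statistics
    return (event_feat - e_mean) / e_std * f_std + f_mean

rng = np.random.default_rng(0)
ev = rng.normal(2.0, 3.0, size=(2, 4, 8, 8))   # synthetic event features
fr = rng.normal(0.0, 1.0, size=(2, 4, 8, 8))   # synthetic frame features
aligned = align_channel_stats(ev, fr)
```

After alignment, the per-channel mean and standard deviation of `aligned` match those of the frame features, which is the kind of distribution matching the CAFR module's refinement stage performs on a per-channel basis.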
| Task | Dataset | Metric | Value | Model |
|---|---|---|---|---|
| 2D Object Detection | DSEC | mAP | 38.0 | CAFR |
| 2D Object Detection | PKU-DDD17-Car | mAP50 | 86.7 | CAFR |