Papers With Code 2

A community resource for machine learning research: papers, code, benchmarks, and state-of-the-art results.

Data sourced from the PWC Archive (CC-BY-SA 4.0). Built by the community, for the community.

Accurate and Real-time 3D Pedestrian Detection Using an Efficient Attentive Pillar Network

Duy-Tho Le, Hengcan Shi, Hamid Rezatofighi, Jianfei Cai

Date: 2021-12-31
Tasks: Birds Eye View Object Detection, Autonomous Driving, Pedestrian Detection, object-detection, 3D Object Detection, Object Detection

Abstract

Efficiently and accurately detecting people from 3D point cloud data is of great importance in many robotic and autonomous driving applications. This fundamental perception task remains very challenging due to (i) significant deformations of human body pose and gesture over time and (ii) the sparsity and scarcity of point cloud data for pedestrian-class objects. Recent efficient 3D object detection approaches rely on pillar features to detect objects from point cloud data. However, these pillar features are not expressive enough to handle the challenges above when detecting people. To address this shortcoming, we first introduce a stackable Pillar Aware Attention (PAA) module that enhances pillar feature extraction while suppressing noise in the point clouds. By integrating multi-point-channel-pooling, point-wise, channel-wise, and task-aware attention into a simple module, the representation capabilities are boosted while requiring little additional computing resources. We also present Mini-BiFPN, a small yet effective feature network that creates bidirectional information flow and multi-level cross-scale feature fusion to better integrate multi-resolution features. Our proposed framework, PiFeNet, has been evaluated on three popular large-scale datasets for 3D pedestrian detection, namely KITTI, JRDB, and nuScenes. It achieves state-of-the-art (SOTA) performance on KITTI bird's-eye-view (BEV) and JRDB, and very competitive performance on nuScenes. Our approach runs inference at 26 frames per second (FPS), making it a real-time detector. The code for PiFeNet is available at https://github.com/ldtho/PiFeNet.
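The abstract refers to pillar features without describing how pillars are formed. As a rough illustration only, the sketch below groups points into vertical columns on an x-y grid, which is the general idea behind pillar-based detectors. It is not taken from the PiFeNet code: the function name `pillarize`, the 0.5 m pillar size, and the per-pillar point cap are all assumptions chosen for the example.

```python
import math
from collections import defaultdict


def pillarize(points, pillar_size=0.5, max_points=4):
    """Group (x, y, z) points into pillars keyed by their x-y grid cell.

    pillar_size and max_points are illustrative values, not the paper's settings.
    """
    pillars = defaultdict(list)
    for x, y, z in points:
        # The height (z) is ignored when choosing the cell, so each cell
        # is a vertical column spanning the full height range.
        key = (math.floor(x / pillar_size), math.floor(y / pillar_size))
        # Cap the number of points kept per pillar; extra points are dropped.
        if len(pillars[key]) < max_points:
            pillars[key].append((x, y, z))
    return dict(pillars)


# Four toy points; the first two fall into the same grid cell.
points = [(0.1, 0.2, 0.0), (0.3, 0.4, 1.2), (0.9, 0.1, 0.5), (-0.2, 0.3, 0.7)]
pillars = pillarize(points)

for cell, members in sorted(pillars.items()):
    print(cell, len(members))
```

Distant pedestrians tend to land in pillars holding only a handful of points, which is the sparsity problem the abstract describes.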

Results

Task | Dataset | Metric | Value | Model
Object Detection | KITTI Pedestrian Moderate | Average Precision | 0.4671 | PiFeNet
Object Detection | KITTI Pedestrian Hard | Average Precision | 0.4271 | PiFeNet
Object Detection | KITTI Pedestrian | mAP | 0.486 | PiFeNet
Object Detection | KITTI Pedestrian Easy | Average Precision | 0.5639 | PiFeNet
3D | KITTI Pedestrian Moderate | Average Precision | 0.4671 | PiFeNet
3D | KITTI Pedestrian Hard | Average Precision | 0.4271 | PiFeNet
3D | KITTI Pedestrian | mAP | 0.486 | PiFeNet
3D | KITTI Pedestrian Easy | Average Precision | 0.5639 | PiFeNet
Birds Eye View Object Detection | KITTI Pedestrian Easy | Average Precision | 0.6325 | PiFeNet
Birds Eye View Object Detection | KITTI Pedestrian Moderate | Average Precision | 0.5392 | PiFeNet
Birds Eye View Object Detection | KITTI Pedestrian Hard | Average Precision | 0.5053 | PiFeNet
Birds Eye View Object Detection | KITTI Pedestrian | mAP | 0.559 | PiFeNet
3D Object Detection | KITTI Pedestrian Moderate | Average Precision | 0.4671 | PiFeNet
3D Object Detection | KITTI Pedestrian Hard | Average Precision | 0.4271 | PiFeNet
3D Object Detection | KITTI Pedestrian | mAP | 0.486 | PiFeNet
3D Object Detection | KITTI Pedestrian Easy | Average Precision | 0.5639 | PiFeNet
2D Classification | KITTI Pedestrian Moderate | Average Precision | 0.4671 | PiFeNet
2D Classification | KITTI Pedestrian Hard | Average Precision | 0.4271 | PiFeNet
2D Classification | KITTI Pedestrian | mAP | 0.486 | PiFeNet
2D Classification | KITTI Pedestrian Easy | Average Precision | 0.5639 | PiFeNet
2D Object Detection | KITTI Pedestrian Moderate | Average Precision | 0.4671 | PiFeNet
2D Object Detection | KITTI Pedestrian Hard | Average Precision | 0.4271 | PiFeNet
2D Object Detection | KITTI Pedestrian | mAP | 0.486 | PiFeNet
2D Object Detection | KITTI Pedestrian Easy | Average Precision | 0.5639 | PiFeNet
16k | KITTI Pedestrian Moderate | Average Precision | 0.4671 | PiFeNet
16k | KITTI Pedestrian Hard | Average Precision | 0.4271 | PiFeNet
16k | KITTI Pedestrian | mAP | 0.486 | PiFeNet
16k | KITTI Pedestrian Easy | Average Precision | 0.5639 | PiFeNet
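
The page does not say how the mAP rows are computed. The listed values appear consistent with an unweighted mean of the Easy, Moderate, and Hard Average Precision scores, rounded to three decimals. The short check below is a sketch under that assumption, using only the numbers reported above; the dictionary and function names are made up for the example.

```python
from statistics import mean

# Average Precision values copied from the results listed above.
kitti_ap = {
    "3D Object Detection": {"Easy": 0.5639, "Moderate": 0.4671, "Hard": 0.4271},
    "Birds Eye View Object Detection": {"Easy": 0.6325, "Moderate": 0.5392, "Hard": 0.5053},
}

# mAP values as reported above.
reported_map = {
    "3D Object Detection": 0.486,
    "Birds Eye View Object Detection": 0.559,
}


def mean_ap(ap_by_difficulty):
    """Unweighted mean over difficulty levels (an assumption, not a documented rule)."""
    return round(mean(ap_by_difficulty.values()), 3)


computed_map = {task: mean_ap(aps) for task, aps in kitti_ap.items()}

for task, value in computed_map.items():
    print(f"{task}: computed {value}, reported {reported_map[task]}")
```

Under this assumption the computed and reported values agree for both the 3D and BEV rows.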

Related Papers

GEMINUS: Dual-aware Global and Scene-Adaptive Mixture-of-Experts for End-to-End Autonomous Driving (2025-07-19)
AGENTS-LLM: Augmentative GENeration of Challenging Traffic Scenarios with an Agentic LLM Framework (2025-07-18)
World Model-Based End-to-End Scene Generation for Accident Anticipation in Autonomous Driving (2025-07-17)
Orbis: Overcoming Challenges of Long-Horizon Prediction in Driving World Models (2025-07-17)
Channel-wise Motion Features for Efficient Motion Segmentation (2025-07-17)
LaViPlan: Language-Guided Visual Path Planning with RLVR (2025-07-17)
A Real-Time System for Egocentric Hand-Object Interaction Detection in Industrial Domains (2025-07-17)
RS-TinyNet: Stage-wise Feature Fusion Network for Detecting Tiny Objects in Remote Sensing Images (2025-07-17)