Accurate and Real-time 3D Pedestrian Detection Using an Efficient Attentive Pillar Network

Duy-Tho Le, Hengcan Shi, Hamid Rezatofighi, Jianfei Cai

2021-12-31

Birds Eye View Object Detection, Autonomous Driving, Pedestrian Detection, 3D Object Detection, Object Detection

Abstract

Efficiently and accurately detecting people from 3D point cloud data is of great importance in many robotic and autonomous driving applications. This fundamental perception task remains very challenging due to (i) significant deformations of human body pose and gesture over time and (ii) point cloud sparsity and scarcity for pedestrian-class objects. Recent efficient 3D object detection approaches rely on pillar features to detect objects from point cloud data. However, these pillar features are not expressive enough to deal with all the aforementioned challenges in detecting people. To address this shortcoming, we first introduce a stackable Pillar Aware Attention (PAA) module that enhances pillar feature extraction while suppressing noise in the point clouds. By integrating multi-point-channel-pooling, point-wise, channel-wise, and task-aware attention into a simple module, the representation capabilities are boosted while requiring little additional computing resources. We also present Mini-BiFPN, a small yet effective feature network that creates bidirectional information flow and multi-level cross-scale feature fusion to better integrate multi-resolution features. Our proposed framework, namely PiFeNet, has been evaluated on three popular large-scale datasets for 3D pedestrian detection, i.e., KITTI, JRDB, and nuScenes, achieving state-of-the-art (SOTA) performance on KITTI bird's-eye-view (BEV) and JRDB and very competitive performance on nuScenes. Our approach has an inference speed of 26 frames per second (FPS), making it a real-time detector. The code for our PiFeNet is available at https://github.com/ldtho/PiFeNet.
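
To give a rough intuition for the point-wise and channel-wise attention mentioned in the abstract, the sketch below applies a parameter-free gating to a single pillar. It is not the PAA module: the actual module is learned and also includes multi-point-channel pooling and task-aware attention, none of which are reproduced here. The function name `gate_pillar`, the max-plus-mean pooling, and the toy input are all assumptions made for illustration.

```python
import math


def sigmoid(x):
    return 1.0 / (1.0 + math.exp(-x))


def gate_pillar(pillar):
    """Reweight one pillar given as N points, each a list of C channel values.

    Hypothetical, parameter-free stand-in for learned attention:
    scores come from simple max + mean pooling instead of trained layers.
    """
    n_points = len(pillar)
    n_channels = len(pillar[0])

    # Point-wise gate: pool each point's features across channels.
    point_gates = [sigmoid(max(p) + sum(p) / n_channels) for p in pillar]

    # Channel-wise gate: pool each channel's values across points.
    channel_gates = []
    for c in range(n_channels):
        column = [p[c] for p in pillar]
        channel_gates.append(sigmoid(max(column) + sum(column) / n_points))

    # Scale every feature by the product of its point gate and channel gate.
    return [
        [pillar[i][c] * point_gates[i] * channel_gates[c] for c in range(n_channels)]
        for i in range(n_points)
    ]


pillar = [
    [0.9, 0.8, 1.0],    # point with strong responses
    [0.0, 0.0, 0.0],    # zero-padded slot in a sparse pillar
    [-0.5, 0.1, -0.2],  # weaker, possibly noisy point
]
gated = gate_pillar(pillar)
```

Because every gate lies strictly between 0 and 1, features are only ever attenuated, and points with weaker pooled responses are attenuated more. A learned module would replace the fixed pooling-to-sigmoid mapping with trained layers.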

Results

Task	Dataset	Metric	Value	Model
Birds Eye View Object Detection	KITTI Pedestrian Easy	Average Precision	0.6325	PiFeNet
Birds Eye View Object Detection	KITTI Pedestrian Moderate	Average Precision	0.5392	PiFeNet
Birds Eye View Object Detection	KITTI Pedestrian Hard	Average Precision	0.5053	PiFeNet
Birds Eye View Object Detection	KITTI Pedestrian	mAP	0.559	PiFeNet
3D Object Detection	KITTI Pedestrian Moderate	Average Precision	0.4671	PiFeNet
3D Object Detection	KITTI Pedestrian Hard	Average Precision	0.4271	PiFeNet
3D Object Detection	KITTI Pedestrian	mAP	0.486	PiFeNet
3D Object Detection	KITTI Pedestrian Easy	Average Precision	0.5639	PiFeNet
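
The mAP rows in the table appear to be the arithmetic mean of the Easy, Moderate, and Hard Average Precision values. That relationship is inferred from the numbers themselves and is not stated on this page. A minimal check, with the AP values copied from the table and a hypothetical helper `mean_ap`:

```python
# AP values copied from the Results table (PiFeNet, KITTI Pedestrian).
ap_3d = {"Easy": 0.5639, "Moderate": 0.4671, "Hard": 0.4271}
ap_bev = {"Easy": 0.6325, "Moderate": 0.5392, "Hard": 0.5053}


def mean_ap(ap_by_difficulty):
    """Average the per-difficulty AP values (assumed definition of the mAP rows)."""
    return sum(ap_by_difficulty.values()) / len(ap_by_difficulty)


map_3d = mean_ap(ap_3d)    # about 0.486, matching the 3D mAP row
map_bev = mean_ap(ap_bev)  # about 0.559, matching the BEV mAP row
print(f"3D mAP:  {map_3d:.3f}")
print(f"BEV mAP: {map_bev:.3f}")
```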
