Charles R. Qi, Wei Liu, Chenxia Wu, Hao Su, Leonidas J. Guibas
In this work, we study 3D object detection from RGB-D data in both indoor and outdoor scenes. While previous methods focus on images or 3D voxels, often obscuring natural 3D patterns and invariances of 3D data, we directly operate on raw point clouds by popping up RGB-D scans. However, a key challenge of this approach is how to efficiently localize objects in point clouds of large-scale scenes (region proposal). Instead of relying solely on 3D proposals, our method leverages both mature 2D object detectors and advanced 3D deep learning for object localization, achieving efficiency as well as high recall even for small objects. Because it learns directly on raw point clouds, our method is also able to precisely estimate 3D bounding boxes even under strong occlusion or with very sparse points. Evaluated on the KITTI and SUN RGB-D 3D detection benchmarks, our method outperforms the state of the art by remarkable margins while having real-time capability.
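The idea of combining a 2D detection with a "popped up" RGB-D scan can be illustrated with a minimal sketch: depth pixels that fall inside a 2D box are back-projected into 3D camera coordinates, giving the points of a frustum. This is only an illustration under stated assumptions, not the authors' implementation. It assumes a pinhole camera model, and the function name `frustum_points`, the box layout `(u_min, v_min, u_max, v_max)`, and the intrinsics `fx`, `fy`, `cx`, `cy` are hypothetical names chosen for this example.

```python
from typing import List, Sequence, Tuple

Point = Tuple[float, float, float]


def frustum_points(
    depth: Sequence[Sequence[float]],
    box: Tuple[int, int, int, int],
    fx: float,
    fy: float,
    cx: float,
    cy: float,
) -> List[Point]:
    """Back-project depth pixels inside a 2D box into 3D camera coordinates.

    Assumes a pinhole camera; `box` is (u_min, v_min, u_max, v_max) in pixels,
    with the max bounds exclusive. All names here are illustrative.
    """
    u_min, v_min, u_max, v_max = box
    points: List[Point] = []
    for v in range(v_min, v_max):
        for u in range(u_min, u_max):
            z = depth[v][u]
            if z <= 0:  # treat non-positive depth as a missing reading
                continue
            x = (u - cx) * z / fx  # horizontal offset from the optical axis
            y = (v - cy) * z / fy  # vertical offset from the optical axis
            points.append((x, y, z))
    return points


# Toy 4x4 depth map (metres) with one missing reading inside the box.
depth_map = [
    [2.0, 2.0, 2.0, 2.0],
    [2.0, 2.0, 2.0, 2.0],
    [2.0, 2.0, 0.0, 2.0],
    [2.0, 2.0, 2.0, 2.0],
]
pts = frustum_points(depth_map, box=(1, 1, 3, 3), fx=2.0, fy=2.0, cx=1.0, cy=1.0)
```

In this toy case the box covers four pixels, one of which has no depth, so three 3D points remain. A 3D network would then operate only on these points rather than on the whole scene, which is the source of the efficiency the abstract refers to.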
| Task | Dataset | Metric | Value | Model |
|---|---|---|---|---|
| Object Detection | KITTI Cars Hard | AP | 62.19 | F-PointNet |
| Object Detection | KITTI Cars Hard val | AP | 62.56 | F-PointNet [Qi:2018fd] |
| Object Detection | SUN-RGBD val | Inference Speed (s) | 0.12 | F-PointNet |
| Object Detection | SUN-RGBD val | mAP@0.25 | 54 | F-PointNet |
| Object Detection | KITTI Cyclist Moderate val | AP | 56.49 | F-PointNet++ [Qi:2018fd] |
| Object Detection | KITTI Cyclist Moderate val | AP | 55.95 | F-PointNet [Qi:2018fd] |
| Object Detection | KITTI Cyclist Easy val | AP | 77.15 | F-PointNet++ [Qi:2018fd] |
| Object Detection | KITTI Cyclist Easy val | AP | 74.54 | F-PointNet [Qi:2018fd] |
| Object Detection | KITTI Cyclist Hard val | AP | 53.37 | F-PointNet++ [Qi:2018fd] |
| Object Detection | KITTI Cyclist Hard val | AP | 52.65 | F-PointNet [Qi:2018fd] |
| Object Detection | KITTI Pedestrian Moderate val | AP | 61.32 | F-PointNet++ [Qi:2018fd] |
| Object Detection | KITTI Pedestrian Moderate val | AP | 55.85 | F-PointNet [Qi:2018fd] |
| Object Detection | KITTI Pedestrian Hard val | AP | 53.59 | F-PointNet++ [Qi:2018fd] |
| Object Detection | KITTI Pedestrian Hard val | AP | 49.28 | F-PointNet [Qi:2018fd] |
| Object Detection | KITTI Cars Moderate val | AP | 69.28 | F-PointNet [Qi:2018fd] |
| Object Detection | SUN-RGBD | mAP@0.25 | 54 | Frustum PointNets |
| Object Detection | KITTI Pedestrian Easy val | AP | 70 | F-PointNet++ [Qi:2018fd] |
| Object Detection | KITTI Pedestrian Easy val | AP | 65.08 | F-PointNet [Qi:2018fd] |
| Object Detection | KITTI Cars Easy val | AP | 83.26 | F-PointNet [Qi:2018fd] |
| Object Detection | SUN RGB-D | AP 0.5 | 56.8 | Frustum Pointnet (RGB) |