Tianwei Yin, Xingyi Zhou, Philipp Krähenbühl
Three-dimensional objects are commonly represented as 3D boxes in a point cloud. This representation mimics the well-studied image-based 2D bounding-box detection but comes with additional challenges. Objects in a 3D world do not follow any particular orientation, and box-based detectors have difficulties enumerating all orientations or fitting an axis-aligned bounding box to rotated objects. In this paper, we instead propose to represent, detect, and track 3D objects as points. Our framework, CenterPoint, first detects centers of objects using a keypoint detector and regresses to other attributes, including 3D size, 3D orientation, and velocity. In a second stage, it refines these estimates using additional point features on the object. In CenterPoint, 3D object tracking simplifies to greedy closest-point matching. The resulting detection and tracking algorithm is simple, efficient, and effective. CenterPoint achieved state-of-the-art performance on the nuScenes benchmark for both 3D detection and tracking, with 65.5 NDS and 63.8 AMOTA for a single model. On the Waymo Open Dataset, CenterPoint outperforms all previous single-model methods by a large margin and ranks first among all LiDAR-only submissions. The code and pretrained models are available at https://github.com/tianweiy/CenterPoint.
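
To make "greedy closest-point matching" concrete, here is a minimal sketch of one way such an association step could look. It is not taken from the released code. The function name `greedy_match`, the `max_dist` threshold, the use of 2D centers, and the optional back-projection of detections by their regressed velocity are all assumptions made for illustration.

```python
import math


def greedy_match(track_centers, det_centers, det_velocities=None,
                 dt=0.0, max_dist=2.0):
    """Greedily assign each detection to the closest unclaimed track.

    Hypothetical helper, not the authors' implementation.
    track_centers:  list of (x, y) centers from the previous frame.
    det_centers:    list of (x, y) centers in the current frame, assumed
                    to be sorted by descending detection confidence.
    det_velocities: optional list of (vx, vy); when given, each detection
                    is moved back by velocity * dt before matching.
    Returns a list of (det_index, track_index); track_index is -1 when no
    unclaimed track lies within max_dist (a new track would start there).
    """
    if det_velocities is not None:
        det_centers = [
            (x - vx * dt, y - vy * dt)
            for (x, y), (vx, vy) in zip(det_centers, det_velocities)
        ]

    claimed = set()   # track indices already taken by an earlier detection
    matches = []
    for i, (x, y) in enumerate(det_centers):
        best_j, best_d = -1, max_dist
        for j, (tx, ty) in enumerate(track_centers):
            if j in claimed:
                continue
            d = math.hypot(x - tx, y - ty)
            if d <= best_d:
                best_j, best_d = j, d
        if best_j >= 0:
            claimed.add(best_j)
        matches.append((i, best_j))
    return matches


tracks = [(0.0, 0.0), (10.0, 0.0)]
detections = [(9.5, 0.0), (0.4, 0.0), (30.0, 30.0)]
print(greedy_match(tracks, detections))  # [(0, 1), (1, 0), (2, -1)]
```

Because each detection only looks for its nearest free track, the loop needs no global assignment solver. The price is that the result depends on the order in which detections are processed.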
| Task | Dataset | Metric | Value | Model |
|---|---|---|---|---|
| Autonomous Vehicles | nuScenes validation set | AMOTA | 77.3 | Center-based Tracking |
| Multi-Object Tracking | nuScenes | AMOTA | 0.64 | CenterPoint-Single |
| Object Tracking | nuScenes | AMOTA | 0.64 | CenterPoint-Single |
| Object Detection | nuScenes LiDAR only | NDS | 67.3 | CenterPoint |
| Object Detection | nuScenes LiDAR only | NDS (val) | 66.8 | CenterPoint |
| Object Detection | nuScenes LiDAR only | mAP | 60.3 | CenterPoint |
| Object Detection | nuScenes LiDAR only | mAP (val) | 59.6 | CenterPoint |
| Object Detection | waymo all_ns | APH/L2 | 71.93 | CenterPoint |
| Object Detection | nuScenes | NDS | 0.71 | CenterPoint |
| Object Detection | nuScenes | mAAE | 0.14 | CenterPoint |
| Object Detection | nuScenes | mAOE | 0.35 | CenterPoint |
| Object Detection | nuScenes | mAP | 0.67 | CenterPoint |
| Object Detection | nuScenes | mASE | 0.24 | CenterPoint |
| Object Detection | nuScenes | mATE | 0.25 | CenterPoint |
| Object Detection | nuScenes | mAVE | 0.25 | CenterPoint |
| Object Detection | ONCE | mAP | 60.1 | CenterPoint |
| Object Detection | Waymo Open Dataset | mAPH/L2 | 65.8 | CenterPoint |
| Object Detection | waymo cyclist | APH/L2 | 71.28 | CenterPoint |
| Object Detection | waymo pedestrian | APH/L2 | 71.52 | CenterPoint |
| Object Detection | nuScenes-C | mean Corruption Error (mCE) | 100 | CenterPoint-PP |
| Autonomous Driving | nuScenes validation set | AMOTA | 77.3 | Center-based Tracking |
| 3D Object Detection | nuScenes LiDAR only | NDS | 67.3 | CenterPoint |
| 3D Object Detection | nuScenes LiDAR only | NDS (val) | 66.8 | CenterPoint |
| 3D Object Detection | nuScenes LiDAR only | mAP | 60.3 | CenterPoint |
| 3D Object Detection | nuScenes LiDAR only | mAP (val) | 59.6 | CenterPoint |
| 3D Object Detection | waymo all_ns | APH/L2 | 71.93 | CenterPoint |
| 3D Object Detection | nuScenes | NDS | 0.71 | CenterPoint |
| 3D Object Detection | nuScenes | mAAE | 0.14 | CenterPoint |
| 3D Object Detection | nuScenes | mAOE | 0.35 | CenterPoint |
| 3D Object Detection | nuScenes | mAP | 0.67 | CenterPoint |
| 3D Object Detection | nuScenes | mASE | 0.24 | CenterPoint |
| 3D Object Detection | nuScenes | mATE | 0.25 | CenterPoint |
| 3D Object Detection | nuScenes | mAVE | 0.25 | CenterPoint |
| 3D Object Detection | ONCE | mAP | 60.1 | CenterPoint |
| 3D Object Detection | Waymo Open Dataset | mAPH/L2 | 65.8 | CenterPoint |
| 3D Object Detection | waymo cyclist | APH/L2 | 71.28 | CenterPoint |
| 3D Object Detection | waymo pedestrian | APH/L2 | 71.52 | CenterPoint |
| 3D Object Detection | nuScenes-C | mean Corruption Error (mCE) | 100 | CenterPoint-PP |
| 3D Multi-Object Tracking | nuScenes | AMOTA | 0.64 | CenterPoint-Single |
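
The nuScenes rows report NDS alongside mAP and five true-positive error metrics (mATE, mASE, mAOE, mAVE, mAAE). As a consistency check, the sketch below recomputes NDS from the table values, assuming the standard nuScenes definition: NDS = (5 * mAP + sum of (1 - min(1, error)) over the five error metrics) / 10. The function name `nds` is a hypothetical helper, not part of any official toolkit.

```python
def nds(m_ap, tp_errors):
    """nuScenes detection score from mAP and the five mean TP errors."""
    # Each error is clipped to 1 and converted to a score in [0, 1].
    tp_scores = [1.0 - min(1.0, err) for err in tp_errors]
    # mAP carries weight 5; each of the five TP scores carries weight 1.
    return (5.0 * m_ap + sum(tp_scores)) / 10.0


# Values copied from the nuScenes rows of the table above.
errors = {"mATE": 0.25, "mASE": 0.24, "mAOE": 0.35, "mAVE": 0.25, "mAAE": 0.14}
score = nds(0.67, errors.values())
print(round(score, 2))  # 0.71
```

The result agrees with the NDS of 0.71 listed for the same leaderboard entry, which suggests that those rows are reported on a 0 to 1 scale while the "nuScenes LiDAR only" rows use percentages.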