Xingyi Zhou, Jiacheng Zhuo, Philipp Krähenbühl
With the advent of deep learning, object detection drifted from a bottom-up to a top-down recognition problem. State-of-the-art algorithms enumerate a near-exhaustive list of object locations and classify each as object or not. In this paper, we show that bottom-up approaches still perform competitively. We detect four extreme points (top-most, left-most, bottom-most, right-most) and one center point of each object using a standard keypoint estimation network. We group the five keypoints into a bounding box if they are geometrically aligned. Object detection is then a purely appearance-based keypoint estimation problem, without region classification or implicit feature learning. The proposed method performs on par with state-of-the-art region-based detection methods, with a bounding box AP of 43.2% on COCO test-dev. In addition, our estimated extreme points directly span a coarse octagonal mask, with a COCO Mask AP of 18.9%, much better than the Mask AP of vanilla bounding boxes. Extreme point guided segmentation further improves this to 34.6% Mask AP.
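The grouping step described in the abstract can be illustrated with a minimal sketch. It assumes the keypoint network has already produced peak lists of `(x, y, score)` for each extreme-point type and a lookup for the center heatmap. The function names, the thresholds `peak_thresh` and `center_thresh`, the averaged box score, and the octagon fraction `frac` are illustrative assumptions, not values taken from the abstract.

```python
from itertools import product


def group_extreme_points(tops, lefts, bottoms, rights, center_score,
                         peak_thresh=0.1, center_thresh=0.1):
    """Brute-force grouping of extreme-point peaks into boxes (a sketch).

    tops/lefts/bottoms/rights: lists of (x, y, score) peaks.
    center_score: callable (x, y) -> center-heatmap score (hypothetical).
    Thresholds are assumed placeholder values.
    Image coordinates are assumed: y grows downward.
    """
    boxes = []
    for t, l, b, r in product(tops, lefts, bottoms, rights):
        # Skip combinations that contain a weak extreme-point peak.
        if min(t[2], l[2], b[2], r[2]) < peak_thresh:
            continue
        # Geometric alignment: every point must lie inside the box
        # spanned by the other extreme points.
        if not (t[1] <= l[1] <= b[1] and t[1] <= r[1] <= b[1]):
            continue
        if not (l[0] <= t[0] <= r[0] and l[0] <= b[0] <= r[0]):
            continue
        # The geometric center of the candidate box must also respond
        # in the center heatmap.
        cx = (l[0] + r[0]) / 2
        cy = (t[1] + b[1]) / 2
        c = center_score(cx, cy)
        if c < center_thresh:
            continue
        # Assumed scoring: mean of the five keypoint scores.
        score = (t[2] + l[2] + b[2] + r[2] + c) / 5
        boxes.append((l[0], t[1], r[0], b[1], score))
    return boxes


def octagon_from_extremes(top, left, bottom, right, frac=0.25):
    """Coarse octagonal mask spanned by four extreme points (a sketch).

    Each extreme point is widened along its box edge into a segment of
    total length frac * edge length (frac is an assumed parameter),
    clamped at the box corners. Returns 8 (x, y) vertices, clockwise
    in image coordinates.
    """
    x0, y0, x1, y1 = left[0], top[1], right[0], bottom[1]
    dx = frac * (x1 - x0) / 2
    dy = frac * (y1 - y0) / 2

    def clamp_x(x):
        return min(max(x, x0), x1)

    def clamp_y(y):
        return min(max(y, y0), y1)

    return [
        (clamp_x(top[0] - dx), y0), (clamp_x(top[0] + dx), y0),
        (x1, clamp_y(right[1] - dy)), (x1, clamp_y(right[1] + dy)),
        (clamp_x(bottom[0] + dx), y1), (clamp_x(bottom[0] - dx), y1),
        (x0, clamp_y(left[1] + dy)), (x0, clamp_y(left[1] - dy)),
    ]
```

The enumeration is exhaustive over all four peak lists, so its cost grows with the fourth power of the number of peaks per heatmap; this is only practical when each heatmap yields a small number of peaks. A real pipeline would also run this per class and per scale, which the sketch omits.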
| Task | Dataset | Metric | Value | Model |
|---|---|---|---|---|
| Object Detection | COCO test-dev | AP50 | 60.5 | ExtremeNet (Hourglass-104, multi-scale) |
| Object Detection | COCO test-dev | AP75 | 47.0 | ExtremeNet (Hourglass-104, multi-scale) |
| Object Detection | COCO test-dev | APL | 57.6 | ExtremeNet (Hourglass-104, multi-scale) |
| Object Detection | COCO test-dev | APM | 46.9 | ExtremeNet (Hourglass-104, multi-scale) |
| Object Detection | COCO test-dev | APS | 24.1 | ExtremeNet (Hourglass-104, multi-scale) |
| Object Detection | COCO test-dev | box AP | 43.7 | ExtremeNet (Hourglass-104, multi-scale) |
| Object Detection | COCO test-dev | AP50 | 55.5 | ExtremeNet (Hourglass-104, single-scale) |
| Object Detection | COCO test-dev | AP75 | 43.2 | ExtremeNet (Hourglass-104, single-scale) |
| Object Detection | COCO test-dev | APL | 53.1 | ExtremeNet (Hourglass-104, single-scale) |
| Object Detection | COCO test-dev | APM | 43.2 | ExtremeNet (Hourglass-104, single-scale) |
| Object Detection | COCO test-dev | APS | 20.4 | ExtremeNet (Hourglass-104, single-scale) |
| Object Detection | COCO test-dev | box AP | 40.2 | ExtremeNet (Hourglass-104, single-scale) |
| Object Detection | COCO minival | AP50 | 59.6 | ExtremeNet (Hourglass-104, multi-scale) |
| Object Detection | COCO minival | AP75 | 46.8 | ExtremeNet (Hourglass-104, multi-scale) |
| Object Detection | COCO minival | APL | 59.4 | ExtremeNet (Hourglass-104, multi-scale) |
| Object Detection | COCO minival | APM | 46.6 | ExtremeNet (Hourglass-104, multi-scale) |
| Object Detection | COCO minival | APS | 25.7 | ExtremeNet (Hourglass-104, multi-scale) |
| Object Detection | COCO minival | box AP | 43.3 | ExtremeNet (Hourglass-104, multi-scale) |
| Object Detection | COCO minival | AP50 | 55.1 | ExtremeNet (Hourglass-104, single-scale) |
| Object Detection | COCO minival | AP75 | 43.7 | ExtremeNet (Hourglass-104, single-scale) |
| Object Detection | COCO minival | APL | 56.1 | ExtremeNet (Hourglass-104, single-scale) |
| Object Detection | COCO minival | APM | 44.0 | ExtremeNet (Hourglass-104, single-scale) |
| Object Detection | COCO minival | APS | 21.6 | ExtremeNet (Hourglass-104, single-scale) |
| Object Detection | COCO minival | box AP | 40.3 | ExtremeNet (Hourglass-104, single-scale) |