Xin Lu, Buyu Li, Yuxin Yue, Quanquan Li, Junjie Yan
This paper proposes a novel object detection framework named Grid R-CNN, which adopts a grid-guided localization mechanism for accurate object detection. Unlike traditional regression-based methods, Grid R-CNN captures spatial information explicitly and benefits from the position-sensitive property of fully convolutional architectures. Instead of using only two independent points, we design a multi-point supervision formulation that encodes more clues, reducing the impact of inaccurate predictions at specific points. To take full advantage of the correlation between points in a grid, we propose a two-stage information fusion strategy that fuses the feature maps of neighboring grid points. The grid-guided localization approach can be easily extended to different state-of-the-art detection frameworks. Grid R-CNN yields high-quality object localization: experiments show a 4.1% AP gain at IoU=0.8 and a 10.0% AP gain at IoU=0.9 on the COCO benchmark compared to Faster R-CNN with a ResNet-50 backbone and FPN architecture.
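
To make the grid-guided idea concrete, the sketch below shows one way a set of per-point heatmaps could be decoded into a bounding box. It is an illustration under stated assumptions, not the authors' implementation: the function names `peak` and `grid_heatmaps_to_box`, the 3x3 row-major grid layout, the pixel-centre mapping back to image coordinates, and the confidence-weighted averaging of the points on each edge are all assumptions made here, since the abstract does not specify them.

```python
from typing import List, Sequence, Tuple

Heatmap = Sequence[Sequence[float]]
Box = Tuple[float, float, float, float]


def peak(heatmap: Heatmap) -> Tuple[int, int, float]:
    """Return (row, col, value) of the largest entry in a 2D heatmap."""
    best = (0, 0, heatmap[0][0])
    for r, row in enumerate(heatmap):
        for c, value in enumerate(row):
            if value > best[2]:
                best = (r, c, value)
    return best


def grid_heatmaps_to_box(heatmaps: Sequence[Heatmap], roi: Box,
                         grid_size: int = 3) -> Box:
    """Decode grid_size*grid_size heatmaps (row-major order) into a box.

    Assumption: every heatmap covers the region `roi` = (x1, y1, x2, y2)
    in image coordinates, and heatmap k belongs to grid point
    (k // grid_size, k % grid_size).
    """
    if len(heatmaps) != grid_size * grid_size:
        raise ValueError("expected one heatmap per grid point")
    x1, y1, x2, y2 = roi
    height, width = len(heatmaps[0]), len(heatmaps[0][0])

    # Locate each grid point at its heatmap peak and map the pixel
    # centre back to image coordinates.
    points: List[Tuple[float, float, float]] = []
    for heatmap in heatmaps:
        r, c, value = peak(heatmap)
        x = x1 + (c + 0.5) / width * (x2 - x1)
        y = y1 + (r + 0.5) / height * (y2 - y1)
        points.append((x, y, value))

    def edge(indices: Sequence[int], axis: int) -> float:
        # Confidence-weighted mean of one coordinate over the points on
        # an edge; falls back to a plain mean if all confidences are 0.
        total = sum(points[i][2] for i in indices)
        if total <= 0:
            return sum(points[i][axis] for i in indices) / len(indices)
        return sum(points[i][axis] * points[i][2] for i in indices) / total

    g = grid_size
    left = edge([r * g for r in range(g)], 0)            # first column
    right = edge([r * g + g - 1 for r in range(g)], 0)   # last column
    top = edge(list(range(g)), 1)                        # first row
    bottom = edge([(g - 1) * g + c for c in range(g)], 1)  # last row
    return left, top, right, bottom
```

Because each edge is estimated from several grid points rather than a single corner, one badly localized point shifts the edge only in proportion to its confidence, which is the intuition behind the multi-point supervision described above.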
| Task | Dataset | Metric | Value | Model |
|---|---|---|---|---|
| Object Detection | COCO test-dev | AP50 | 63 | Grid R-CNN (ResNeXt-101-FPN) |
| Object Detection | COCO test-dev | AP75 | 46.6 | Grid R-CNN (ResNeXt-101-FPN) |
| Object Detection | COCO test-dev | APL | 55.2 | Grid R-CNN (ResNeXt-101-FPN) |
| Object Detection | COCO test-dev | APM | 46.5 | Grid R-CNN (ResNeXt-101-FPN) |
| Object Detection | COCO test-dev | APS | 25.1 | Grid R-CNN (ResNeXt-101-FPN) |
| Object Detection | COCO test-dev | box mAP | 43.2 | Grid R-CNN (ResNeXt-101-FPN) |
| Object Detection | COCO minival | AP50 | 60.3 | Grid R-CNN (ResNet-101-FPN) |
| Object Detection | COCO minival | AP75 | 44.4 | Grid R-CNN (ResNet-101-FPN) |
| Object Detection | COCO minival | APL | 54.1 | Grid R-CNN (ResNet-101-FPN) |
| Object Detection | COCO minival | APM | 45.8 | Grid R-CNN (ResNet-101-FPN) |
| Object Detection | COCO minival | APS | 23.4 | Grid R-CNN (ResNet-101-FPN) |
| Object Detection | COCO minival | box AP | 41.3 | Grid R-CNN (ResNet-101-FPN) |
| Object Detection | COCO minival | AP50 | 58.3 | Grid R-CNN (ResNet-50-FPN) |
| Object Detection | COCO minival | AP75 | 42.4 | Grid R-CNN (ResNet-50-FPN) |
| Object Detection | COCO minival | APL | 51.5 | Grid R-CNN (ResNet-50-FPN) |
| Object Detection | COCO minival | APM | 43.8 | Grid R-CNN (ResNet-50-FPN) |
| Object Detection | COCO minival | APS | 22.6 | Grid R-CNN (ResNet-50-FPN) |
| Object Detection | COCO minival | box AP | 39.6 | Grid R-CNN (ResNet-50-FPN) |
| 2D Object Detection | SARDet-100K | box mAP | 48.8 | Grid RCNN |