Yanghao Li, Yuntao Chen, Naiyan Wang, Zhao-Xiang Zhang
Scale variation is one of the key challenges in object detection. In this work, we first present a controlled experiment to investigate the effect of receptive fields on scale variation in object detection. Based on the findings from these exploratory experiments, we propose a novel Trident Network (TridentNet) that aims to generate scale-specific feature maps with uniform representational power. We construct a parallel multi-branch architecture in which all branches share the same transformation parameters but have different receptive fields. We then adopt a scale-aware training scheme that specializes each branch by sampling object instances of appropriate scales for training. As a bonus, a fast approximate version of TridentNet achieves significant improvements over the vanilla detector without any additional parameters or computational cost. On the COCO dataset, our TridentNet with a ResNet-101 backbone achieves state-of-the-art single-model results of 48.4 mAP. Code is available at https://git.io/fj5vR.
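The two ideas in the abstract, branches that share weights but differ in receptive field and scale-aware selection of training instances, can be illustrated with a minimal sketch. It is a toy 1D example in plain Python, not the authors' implementation. Several things in it are assumptions made for illustration and are not stated in the text above: using dilation to vary the receptive field, the dilation rates `(1, 2, 3)`, the per-branch scale ranges, and all function names.

```python
from math import sqrt, inf


def dilated_conv1d(signal, kernel, dilation):
    """Zero-padded, same-length 1D correlation with a dilated kernel.

    Assumes an odd kernel length so the kernel has a well-defined centre.
    """
    k = len(kernel)
    centre = k // 2
    out = []
    for i in range(len(signal)):
        acc = 0.0
        for j, w in enumerate(kernel):
            idx = i + (j - centre) * dilation
            if 0 <= idx < len(signal):  # positions outside the signal count as zero
                acc += w * signal[idx]
        out.append(acc)
    return out


def receptive_field(kernel_size, dilation):
    """Span (in input samples) covered by one dilated kernel."""
    return dilation * (kernel_size - 1) + 1


def trident_branches(signal, shared_kernel, dilations=(1, 2, 3)):
    """Run every branch with the SAME kernel; only the dilation differs.

    The dilation rates are illustrative assumptions.
    """
    return [dilated_conv1d(signal, shared_kernel, d) for d in dilations]


def valid_branches(width, height, ranges):
    """Indices of branches whose scale range contains sqrt(width * height).

    `ranges` is a hypothetical list of (low, high) bounds, one per branch.
    """
    scale = sqrt(width * height)
    return [i for i, (lo, hi) in enumerate(ranges) if lo <= scale <= hi]


if __name__ == "__main__":
    impulse = [0, 0, 0, 1, 0, 0, 0]
    kernel = [1, 2, 3]  # one set of weights, reused by all branches
    for d, out in zip((1, 2, 3), trident_branches(impulse, kernel)):
        print(f"dilation={d} receptive_field={receptive_field(len(kernel), d)} out={out}")

    # Hypothetical scale ranges for small, medium, and large objects.
    ranges = [(0, 90), (30, 160), (90, inf)]
    print(valid_branches(20, 20, ranges))    # small box
    print(valid_branches(100, 100, ranges))  # large box
```

Because the kernel is shared, adding branches in this sketch adds no parameters; each branch only spreads the same weights over a wider span of the input. Overlapping scale ranges let an object of intermediate size contribute to more than one branch.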
| Task | Dataset | Metric | Value | Model |
|---|---|---|---|---|
| Object Detection | COCO test-dev | AP50 | 69.7 | TridentNet (ResNet-101-Deformable, Image Pyramid) |
| Object Detection | COCO test-dev | AP75 | 53.5 | TridentNet (ResNet-101-Deformable, Image Pyramid) |
| Object Detection | COCO test-dev | APL | 60.3 | TridentNet (ResNet-101-Deformable, Image Pyramid) |
| Object Detection | COCO test-dev | APM | 51.3 | TridentNet (ResNet-101-Deformable, Image Pyramid) |
| Object Detection | COCO test-dev | APS | 31.8 | TridentNet (ResNet-101-Deformable, Image Pyramid) |
| Object Detection | COCO test-dev | box mAP | 48.4 | TridentNet (ResNet-101-Deformable, Image Pyramid) |
| Object Detection | COCO test-dev | AP50 | 63.6 | TridentNet (ResNet-101) |
| Object Detection | COCO test-dev | AP75 | 46.5 | TridentNet (ResNet-101) |
| Object Detection | COCO test-dev | APL | 56.6 | TridentNet (ResNet-101) |
| Object Detection | COCO test-dev | APM | 46.6 | TridentNet (ResNet-101) |
| Object Detection | COCO test-dev | APS | 23.9 | TridentNet (ResNet-101) |
| Object Detection | COCO test-dev | box mAP | 42.7 | TridentNet (ResNet-101) |
| Object Detection | COCO minival | AP50 | 63.5 | TridentNet (ResNet-101) |
| Object Detection | COCO minival | AP75 | 45.5 | TridentNet (ResNet-101) |
| Object Detection | COCO minival | APL | 56.9 | TridentNet (ResNet-101) |
| Object Detection | COCO minival | APM | 47.0 | TridentNet (ResNet-101) |
| Object Detection | COCO minival | APS | 24.9 | TridentNet (ResNet-101) |
| Object Detection | COCO minival | box mAP | 42.0 | TridentNet (ResNet-101) |