Leonardo Rossi, Akbar Karimi, Andrea Prati
Within the field of instance segmentation, most state-of-the-art deep learning networks now rely on cascade architectures, where multiple object detectors are trained sequentially, re-sampling the ground truth at each step. This offers a solution to the problem of exponentially vanishing positive samples. However, it also increases network complexity in terms of the number of parameters. To address this issue, we propose Recursively Refined R-CNN (R^3-CNN), which avoids duplicates by introducing a loop mechanism instead. At the same time, it achieves a quality boost using a recursive re-sampling technique, where a specific IoU quality is used in each recursion to eventually cover the positive spectrum equally. Our experiments highlight the specific encoding of the loop mechanism in the weights, which requires its use at inference time. The R^3-CNN architecture is able to surpass the recently proposed HTC model while significantly reducing the number of parameters. Experiments on the COCO minival 2017 dataset show a performance boost independent of the baseline model used. The code is available online at https://github.com/IMPLabUniPr/mmdetection/tree/r3_cnn.
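The loop mechanism and per-recursion IoU re-sampling described in the abstract can be illustrated with a small toy sketch. It is not the authors' implementation: the IoU thresholds (0.5, 0.6, 0.7) are an assumption, since the abstract does not state them, and `toy_refine` is a hypothetical stand-in for a learned box-regression head. The point is only that a single refinement step, reused in a loop, keeps positives available at stricter IoU thresholds, where the raw proposals alone would vanish.

```python
from typing import List, Sequence, Tuple

Box = Tuple[float, float, float, float]  # (x1, y1, x2, y2)


def iou(a: Box, b: Box) -> float:
    """Intersection over union of two axis-aligned boxes."""
    ix1, iy1 = max(a[0], b[0]), max(a[1], b[1])
    ix2, iy2 = min(a[2], b[2]), min(a[3], b[3])
    inter = max(0.0, ix2 - ix1) * max(0.0, iy2 - iy1)
    area_a = (a[2] - a[0]) * (a[3] - a[1])
    area_b = (b[2] - b[0]) * (b[3] - b[1])
    union = area_a + area_b - inter
    return inter / union if union > 0 else 0.0


def toy_refine(box: Box, gt: Box, step: float = 0.5) -> Box:
    """Hypothetical stand-in for a learned regression head:
    moves each coordinate a fixed fraction of the way toward the ground truth."""
    return tuple(b + step * (g - b) for b, g in zip(box, gt))  # type: ignore[return-value]


def count_positives(boxes: Sequence[Box], gt: Box, thr: float) -> int:
    """Number of boxes that count as positive samples at IoU threshold `thr`."""
    return sum(iou(b, gt) >= thr for b in boxes)


def recursive_refinement(
    proposals: Sequence[Box],
    gt: Box,
    thresholds: Sequence[float] = (0.5, 0.6, 0.7),  # assumed values
) -> Tuple[List[Box], List[int]]:
    """Apply the SAME refinement step once per loop iteration,
    re-sampling positives with a different IoU threshold each time."""
    boxes = list(proposals)
    positives_per_loop = []
    for thr in thresholds:
        positives_per_loop.append(count_positives(boxes, gt, thr))
        boxes = [toy_refine(b, gt) for b in boxes]  # one shared "head", reused
    return boxes, positives_per_loop


if __name__ == "__main__":
    gt = (0.0, 0.0, 10.0, 10.0)
    proposals = [(2.0, 2.0, 12.0, 12.0), (1.0, 1.0, 11.0, 11.0), (4.0, 4.0, 14.0, 14.0)]
    _, looped = recursive_refinement(proposals, gt)
    raw = [count_positives(proposals, gt, t) for t in (0.5, 0.6, 0.7)]
    print("positives with loop:   ", looped)
    print("positives without loop:", raw)
```

With these toy boxes, the looped version finds 1, 2, and 2 positives at the three thresholds, while the unrefined proposals yield 1, 1, and 0. A cascade would get a similar effect with one separate head per stage; the loop reuses one head, which is where the parameter saving claimed in the abstract would come from.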
| Task | Dataset | Metric | Value | Model |
|---|---|---|---|---|
| Object Detection | COCO minival | AP50 | 64.3 | R3-CNN (ResNet-50-FPN, DCN) |
| Object Detection | COCO minival | AP75 | 48.9 | R3-CNN (ResNet-50-FPN, DCN) |
| Object Detection | COCO minival | APL | 59.6 | R3-CNN (ResNet-50-FPN, DCN) |
| Object Detection | COCO minival | APM | 48.3 | R3-CNN (ResNet-50-FPN, DCN) |
| Object Detection | COCO minival | APS | 26.6 | R3-CNN (ResNet-50-FPN, DCN) |
| Object Detection | COCO minival | box AP | 44.8 | R3-CNN (ResNet-50-FPN, DCN) |
| Object Detection | COCO minival | AP50 | 64.1 | R3-CNN (ResNet-50-FPN, GC-Net) |
| Object Detection | COCO minival | AP75 | 48.4 | R3-CNN (ResNet-50-FPN, GC-Net) |
| Object Detection | COCO minival | APL | 58.9 | R3-CNN (ResNet-50-FPN, GC-Net) |
| Object Detection | COCO minival | APM | 47.1 | R3-CNN (ResNet-50-FPN, GC-Net) |
| Object Detection | COCO minival | APS | 27 | R3-CNN (ResNet-50-FPN, GC-Net) |
| Object Detection | COCO minival | box AP | 44.3 | R3-CNN (ResNet-50-FPN, GC-Net) |
| Object Detection | COCO minival | AP50 | 61 | R3-CNN (ResNet-50-FPN) |
| Object Detection | COCO minival | AP75 | 46.3 | R3-CNN (ResNet-50-FPN) |
| Object Detection | COCO minival | APL | 55.7 | R3-CNN (ResNet-50-FPN) |
| Object Detection | COCO minival | APM | 45.2 | R3-CNN (ResNet-50-FPN) |
| Object Detection | COCO minival | APS | 24.5 | R3-CNN (ResNet-50-FPN) |
| Object Detection | COCO minival | box AP | 42 | R3-CNN (ResNet-50-FPN) |
| Object Detection | COCO minival | AP50 | 61.2 | R3-CNN (ResNet-50-FPN, GRoIE) |
| Object Detection | COCO minival | AP75 | 45.6 | R3-CNN (ResNet-50-FPN, GRoIE) |
| Object Detection | COCO minival | APS | 24.4 | R3-CNN (ResNet-50-FPN, GRoIE) |
| Instance Segmentation | COCO minival | APL | 56 | R3-CNN (ResNet-50-FPN, GC-Net) |
| Instance Segmentation | COCO minival | AP50 | 61.3 | R3-CNN (ResNet-50-FPN, DCN) |
| Instance Segmentation | COCO minival | AP75 | 44 | R3-CNN (ResNet-50-FPN, DCN) |
| Instance Segmentation | COCO minival | APL | 56.1 | R3-CNN (ResNet-50-FPN, DCN) |
| Instance Segmentation | COCO minival | APM | 43.6 | R3-CNN (ResNet-50-FPN, DCN) |
| Instance Segmentation | COCO minival | APS | 22.3 | R3-CNN (ResNet-50-FPN, DCN) |
| Instance Segmentation | COCO minival | mask AP | 40.4 | R3-CNN (ResNet-50-FPN, DCN) |
| Instance Segmentation | COCO minival | AP50 | 61.1 | R3-CNN (ResNet-50-FPN, GC-Net) |
| Instance Segmentation | COCO minival | AP75 | 43.5 | R3-CNN (ResNet-50-FPN, GC-Net) |
| Instance Segmentation | COCO minival | APM | 42.8 | R3-CNN (ResNet-50-FPN, GC-Net) |
| Instance Segmentation | COCO minival | APS | 22.6 | R3-CNN (ResNet-50-FPN, GC-Net) |
| Instance Segmentation | COCO minival | mask AP | 40.2 | R3-CNN (ResNet-50-FPN, GC-Net) |
| Instance Segmentation | COCO minival | AP50 | 58.8 | R3-CNN (ResNet-50-FPN, GRoIE) |
| Instance Segmentation | COCO minival | AP75 | 42.3 | R3-CNN (ResNet-50-FPN, GRoIE) |
| Instance Segmentation | COCO minival | APL | 54.3 | R3-CNN (ResNet-50-FPN, GRoIE) |
| Instance Segmentation | COCO minival | APM | 42.1 | R3-CNN (ResNet-50-FPN, GRoIE) |
| Instance Segmentation | COCO minival | APS | 20.7 | R3-CNN (ResNet-50-FPN, GRoIE) |
| Instance Segmentation | COCO minival | mask AP | 39.1 | R3-CNN (ResNet-50-FPN, GRoIE) |
| Instance Segmentation | COCO minival | AP50 | 58 | R3-CNN (ResNet-50-FPN) |
| Instance Segmentation | COCO minival | AP75 | 41.4 | R3-CNN (ResNet-50-FPN) |
| Instance Segmentation | COCO minival | APL | 52.8 | R3-CNN (ResNet-50-FPN) |
| Instance Segmentation | COCO minival | APM | 41 | R3-CNN (ResNet-50-FPN) |
| Instance Segmentation | COCO minival | APS | 20.4 | R3-CNN (ResNet-50-FPN) |
| Instance Segmentation | COCO minival | mask AP | 38.2 | R3-CNN (ResNet-50-FPN) |