Shiyi Lan, Zhiding Yu, Christopher Choy, Subhashree Radhakrishnan, Guilin Liu, Yuke Zhu, Larry S. Davis, Anima Anandkumar
We introduce DiscoBox, a novel framework that jointly learns instance segmentation and semantic correspondence using bounding box supervision. Specifically, we propose a self-ensembling framework in which instance segmentation and semantic correspondence are jointly guided by a structured teacher in addition to the bounding box supervision. The teacher is a structured energy model incorporating a pairwise potential and a cross-image potential to model the pairwise pixel relationships both within and across the boxes. Minimizing the teacher energy simultaneously yields refined object masks and dense correspondences between intra-class objects, which are taken as pseudo-labels to supervise the task network and provide positive/negative correspondence pairs for dense contrastive learning. We show a symbiotic relationship in which the two tasks mutually benefit from each other. Our best model achieves 37.9% AP on COCO instance segmentation, surpassing prior weakly supervised methods and remaining competitive with supervised methods. We also obtain state-of-the-art weakly supervised results on PASCAL VOC12 and PF-PASCAL with real-time inference.
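To make the role of positive/negative correspondence pairs in dense contrastive learning more concrete, the following is a minimal sketch of a generic InfoNCE-style loss for a single anchor pixel feature. It is an illustration under stated assumptions, not the loss used in DiscoBox: the function name `info_nce`, the cosine-similarity scoring, and the `temperature` value are all hypothetical choices made for this example.

```python
import math

def dot(u, v):
    """Dot product of two equal-length feature vectors."""
    return sum(a * b for a, b in zip(u, v))

def normalize(v):
    """Scale a feature vector to unit L2 norm."""
    norm = math.sqrt(dot(v, v))
    return [a / norm for a in v]

def info_nce(anchor, positive, negatives, temperature=0.1):
    """Generic InfoNCE loss for one anchor feature (illustrative only).

    anchor:    feature of a pixel in one object
    positive:  feature of its matched pixel in another object (assumed given)
    negatives: features of non-matching pixels (assumed given)
    """
    a = normalize(anchor)
    # Cosine similarities scaled by temperature; the positive pair is index 0.
    logits = [dot(a, normalize(positive)) / temperature]
    logits += [dot(a, normalize(n)) / temperature for n in negatives]
    # Numerically stable log-sum-exp over all candidates.
    m = max(logits)
    log_sum = m + math.log(sum(math.exp(l - m) for l in logits))
    # Negative log-probability of selecting the positive pair.
    return log_sum - logits[0]

# The loss is lower when the anchor agrees with its positive than with a negative.
good = info_nce([1.0, 0.0], [1.0, 0.0], [[0.0, 1.0]])
bad = info_nce([1.0, 0.0], [0.0, 1.0], [[1.0, 0.0]])
```

In this sketch the pairs are supplied by hand; in the setting described above they would instead come from the correspondences obtained by minimizing the teacher energy.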
| Task | Dataset | Metric | Value | Model |
|---|---|---|---|---|
| Weakly-supervised instance segmentation | COCO 2017 val | AP | 31.4 | DiscoBox (ResNet-50) |
| Weakly-supervised instance segmentation | COCO 2017 val | AP@50 | 52.6 | DiscoBox (ResNet-50) |
| Weakly-supervised instance segmentation | COCO 2017 val | AP@75 | 32.2 | DiscoBox (ResNet-50) |
| Weakly-supervised instance segmentation | COCO 2017 val | AP@L | 50.1 | DiscoBox (ResNet-50) |
| Weakly-supervised instance segmentation | COCO 2017 val | AP@M | 33.8 | DiscoBox (ResNet-50) |
| Weakly-supervised instance segmentation | COCO 2017 val | AP@S | 11.5 | DiscoBox (ResNet-50) |
| Weakly-supervised instance segmentation | COCO test-dev | AP | 37.9 | DiscoBox (ResNeXt-101-DCN-FPN) |
| Weakly-supervised instance segmentation | COCO test-dev | AP@50 | 61.4 | DiscoBox (ResNeXt-101-DCN-FPN) |
| Weakly-supervised instance segmentation | COCO test-dev | AP@75 | 40.0 | DiscoBox (ResNeXt-101-DCN-FPN) |
| Weakly-supervised instance segmentation | COCO test-dev | AP@L | 53.9 | DiscoBox (ResNeXt-101-DCN-FPN) |
| Weakly-supervised instance segmentation | COCO test-dev | AP@M | 41.1 | DiscoBox (ResNeXt-101-DCN-FPN) |
| Weakly-supervised instance segmentation | COCO test-dev | AP@S | 18.0 | DiscoBox (ResNeXt-101-DCN-FPN) |
| Weakly-supervised instance segmentation | COCO test-dev | AP | 35.8 | DiscoBox (ResNet-101-DCN-FPN) |
| Weakly-supervised instance segmentation | COCO test-dev | AP@50 | 59.8 | DiscoBox (ResNet-101-DCN-FPN) |
| Weakly-supervised instance segmentation | COCO test-dev | AP@75 | 36.4 | DiscoBox (ResNet-101-DCN-FPN) |
| Weakly-supervised instance segmentation | COCO test-dev | AP@L | 52.1 | DiscoBox (ResNet-101-DCN-FPN) |
| Weakly-supervised instance segmentation | COCO test-dev | AP@M | 38.7 | DiscoBox (ResNet-101-DCN-FPN) |
| Weakly-supervised instance segmentation | COCO test-dev | AP@S | 16.9 | DiscoBox (ResNet-101-DCN-FPN) |
| Weakly-supervised instance segmentation | COCO test-dev | AP | 32.0 | DiscoBox (ResNet-50-FPN) |
| Weakly-supervised instance segmentation | COCO test-dev | AP@50 | 53.6 | DiscoBox (ResNet-50-FPN) |
| Weakly-supervised instance segmentation | COCO test-dev | AP@75 | 32.6 | DiscoBox (ResNet-50-FPN) |
| Weakly-supervised instance segmentation | COCO test-dev | AP@L | 48.4 | DiscoBox (ResNet-50-FPN) |
| Weakly-supervised instance segmentation | COCO test-dev | AP@M | 33.7 | DiscoBox (ResNet-50-FPN) |
| Weakly-supervised instance segmentation | COCO test-dev | AP@S | 11.7 | DiscoBox (ResNet-50-FPN) |
| Instance Segmentation | COCO test-dev | mask AP | 37.9 | DiscoBox |
| Instance Segmentation | PASCAL VOC 2012 val | AP_25 | 75.2 | DiscoBox |
| Instance Segmentation | PASCAL VOC 2012 val | AP_50 | 63.6 | DiscoBox |
| Instance Segmentation | PASCAL VOC 2012 val | AP_70 | 45.5 | DiscoBox |
| Instance Segmentation | PASCAL VOC 2012 val | AP_75 | 37.5 | DiscoBox |