Jie Li, Allan Raventos, Arjun Bhargava, Takaaki Tagawa, Adrien Gaidon
We propose an end-to-end learning approach for panoptic segmentation, a novel task unifying instance (things) and semantic (stuff) segmentation. Our model, TASCNet, uses feature maps from a shared backbone network to predict both things and stuff segmentations in a single feed-forward pass. We explicitly constrain these two output distributions through a global things-versus-stuff binary mask to enforce cross-task consistency. Our proposed unified network is competitive with the state of the art on several benchmarks for panoptic segmentation as well as on the individual semantic and instance segmentation tasks.
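The cross-task consistency idea can be illustrated with a small sketch. It is not the authors' implementation: the function names, the max-over-instances union, and the mean squared residual are all assumptions chosen for illustration. It derives one things-versus-stuff mask from the semantic output and one from the instance output, then measures how much they disagree.

```python
import numpy as np


def things_mask_from_semantic(sem_probs, thing_class_ids):
    """Per-pixel probability of 'thing', from semantic class probabilities.

    sem_probs: array of shape (C, H, W), probabilities summing to 1 over C.
    thing_class_ids: list of class indices treated as things (assumption).
    """
    return sem_probs[thing_class_ids].sum(axis=0)


def things_mask_from_instances(instance_masks):
    """Per-pixel 'thing' score, as the union (max) over soft instance masks.

    instance_masks: array of shape (N, H, W) with values in [0, 1].
    """
    if instance_masks.shape[0] == 0:
        # No detected instances: every pixel is treated as stuff.
        return np.zeros(instance_masks.shape[1:], dtype=float)
    return instance_masks.max(axis=0)


def consistency_residual(sem_probs, thing_class_ids, instance_masks):
    """Mean squared disagreement between the two binary masks.

    Hypothetical stand-in for a consistency penalty; zero means the
    semantic and instance outputs agree on which pixels are things.
    """
    a = things_mask_from_semantic(sem_probs, thing_class_ids)
    b = things_mask_from_instances(instance_masks)
    return float(np.mean((a - b) ** 2))
```

A residual of zero means both heads agree everywhere on which pixels belong to things. Any pixel claimed as a thing by only one of the two heads raises the value.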
| Task | Dataset | Metric | Value | Model |
|---|---|---|---|---|
| Semantic Segmentation | Cityscapes val | AP | 39 | TASCNet (ResNet-50, multi-scale) |
| Semantic Segmentation | Cityscapes val | PQ | 60.4 | TASCNet (ResNet-50, multi-scale) |
| Semantic Segmentation | Cityscapes val | PQst | 63.3 | TASCNet (ResNet-50, multi-scale) |
| Semantic Segmentation | Cityscapes val | PQth | 56.1 | TASCNet (ResNet-50, multi-scale) |
| Semantic Segmentation | Cityscapes val | mIoU | 78 | TASCNet (ResNet-50, multi-scale) |
| Semantic Segmentation | Cityscapes val | AP | 37.6 | TASCNet (ResNet-50) |
| Semantic Segmentation | Cityscapes val | PQ | 59.2 | TASCNet (ResNet-50) |
| Semantic Segmentation | Cityscapes val | PQst | 61.5 | TASCNet (ResNet-50) |
| Semantic Segmentation | Cityscapes val | PQth | 56 | TASCNet (ResNet-50) |
| Semantic Segmentation | Cityscapes val | mIoU | 77.8 | TASCNet (ResNet-50) |
| Semantic Segmentation | COCO test-dev | PQ | 40.7 | TASCNet |
| Semantic Segmentation | COCO test-dev | PQst | 31 | TASCNet |
| Semantic Segmentation | COCO test-dev | PQth | 47 | TASCNet |
| Panoptic Segmentation | Cityscapes val | AP | 39 | TASCNet (ResNet-50, multi-scale) |
| Panoptic Segmentation | Cityscapes val | PQ | 60.4 | TASCNet (ResNet-50, multi-scale) |
| Panoptic Segmentation | Cityscapes val | PQst | 63.3 | TASCNet (ResNet-50, multi-scale) |
| Panoptic Segmentation | Cityscapes val | PQth | 56.1 | TASCNet (ResNet-50, multi-scale) |
| Panoptic Segmentation | Cityscapes val | mIoU | 78 | TASCNet (ResNet-50, multi-scale) |
| Panoptic Segmentation | Cityscapes val | AP | 37.6 | TASCNet (ResNet-50) |
| Panoptic Segmentation | Cityscapes val | PQ | 59.2 | TASCNet (ResNet-50) |
| Panoptic Segmentation | Cityscapes val | PQst | 61.5 | TASCNet (ResNet-50) |
| Panoptic Segmentation | Cityscapes val | PQth | 56 | TASCNet (ResNet-50) |
| Panoptic Segmentation | Cityscapes val | mIoU | 77.8 | TASCNet (ResNet-50) |
| Panoptic Segmentation | COCO test-dev | PQ | 40.7 | TASCNet |
| Panoptic Segmentation | COCO test-dev | PQst | 31 | TASCNet |
| Panoptic Segmentation | COCO test-dev | PQth | 47 | TASCNet |
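For readers unfamiliar with the PQ columns, the following is a minimal sketch of the standard panoptic quality definition for a single class: matched segments (IoU above 0.5) count as true positives, and PQ is the sum of their IoUs divided by TP + ½FP + ½FN. Representing segments as sets of pixel indices and the function names are assumptions for illustration. The sketch ignores void regions and is not the official evaluation code.

```python
def iou(a, b):
    """Intersection over union of two segments given as sets of pixel indices."""
    union = len(a | b)
    return len(a & b) / union if union else 0.0


def pq_single_class(pred_segments, gt_segments):
    """Panoptic quality for one class.

    pred_segments, gt_segments: lists of sets of pixel indices. Segments
    within each list are assumed not to overlap, so a match with IoU > 0.5
    is unique and a simple scan is enough.
    """
    matched_pred = set()
    iou_sum = 0.0
    tp = 0
    for gt in gt_segments:
        for pi, pred in enumerate(pred_segments):
            if pi in matched_pred:
                continue
            value = iou(gt, pred)
            if value > 0.5:
                matched_pred.add(pi)
                iou_sum += value
                tp += 1
                break
    fp = len(pred_segments) - tp  # predictions with no ground-truth match
    fn = len(gt_segments) - tp    # ground-truth segments left unmatched
    denom = tp + 0.5 * fp + 0.5 * fn
    return iou_sum / denom if denom else 0.0


def mean_pq(per_class_segments):
    """Average PQ over classes; input maps class id -> (pred_segments, gt_segments)."""
    scores = [pq_single_class(p, g) for p, g in per_class_segments.values()]
    return sum(scores) / len(scores) if scores else 0.0
```

Under this definition, PQth and PQst are the same average restricted to the things classes and the stuff classes respectively.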