Metric: AP (higher is better)
| # | Model | AP | Extra Data | Paper | Date | Code |
|---|---|---|---|---|---|---|
| 1 | ViT-P (OneFormer, InternImage-H) | 50.6 | No | The Missing Point in Vision Transformers for Uni... | 2025-05-26 | Code |
| 2 | OneFormer (ConvNeXt-L, single-scale, 512x1024, Mapillary Vistas-pretrained) | 48.7 | Yes | OneFormer: One Transformer to Rule Universal Ima... | 2022-11-10 | Code |
| 3 | Panoptic-DeepLab (SWideRNet [1, 1, 4.5], Mapillary Vistas, multi-scale) | 46.8 | Yes | Scaling Wide Residual Networks for Panoptic Segm... | 2020-11-23 | - |
| 4 | OneFormer (ConvNeXt-XL, single-scale) | 46.7 | No | OneFormer: One Transformer to Rule Universal Ima... | 2022-11-10 | Code |
| 5 | OneFormer (ConvNeXt-L, single-scale) | 46.5 | No | OneFormer: One Transformer to Rule Universal Ima... | 2022-11-10 | Code |
| 6 | AFF-Base (single-scale, point-based Mask2Former) | 46.2 | No | AutoFocusFormer: Image Segmentation off the Grid | 2023-04-24 | Code |
| 7 | OneFormer (DiNAT-L, single-scale) | 45.6 | No | OneFormer: One Transformer to Rule Universal Ima... | 2022-11-10 | Code |
| 8 | OneFormer (Swin-L, single-scale) | 45.6 | No | OneFormer: One Transformer to Rule Universal Ima... | 2022-11-10 | Code |
| 9 | DiNAT-L (Mask2Former) | 44.5 | No | Dilated Neighborhood Attention Transformer | 2022-09-29 | Code |
| 10 | Axial-DeepLab-XL (Mapillary Vistas, multi-scale) | 44.2 | Yes | Axial-DeepLab: Stand-Alone Axial-Attention for P... | 2020-03-17 | Code |
| 11 | AFF-Small (single-scale, point-based Mask2Former) | 44.2 | No | AutoFocusFormer: Image Segmentation off the Grid | 2023-04-24 | Code |
| 12 | kMaX-DeepLab (single-scale) | 44.0 | No | kMaX-DeepLab: k-means Mask Transformer | 2022-07-08 | Code |
| 13 | Mask2Former (Swin-L) | 43.6 | No | Masked-attention Mask Transformer for Universal ... | 2021-12-02 | Code |
| 14 | EfficientPS | 43.5 | Yes | EfficientPS: Efficient Panoptic Segmentation | 2020-04-05 | Code |
| 15 | Panoptic-DeepLab (SWideRNet [1, 1, 4.5], Mapillary Vistas, single-scale) | 42.8 | Yes | Scaling Wide Residual Networks for Panoptic Segm... | 2020-11-23 | - |
| 16 | EfficientPS (Cityscapes-fine) | 39.1 | No | EfficientPS: Efficient Panoptic Segmentation | 2020-04-05 | Code |
| 17 | UPSNet (ResNet-101, multi-scale) | 39.0 | Yes | UPSNet: A Unified Panoptic Segmentation Network | 2019-01-12 | Code |
| 18 | TASCNet (ResNet-50, multi-scale) | 39.0 | Yes | Learning to Fuse Things and Stuff | 2018-12-04 | - |
| 19 | Panoptic-DeepLab (X71) | 38.5 | Yes | Panoptic-DeepLab: A Simple, Strong, and Fast Bas... | 2019-11-22 | Code |
| 20 | UPSNet (ResNet-101) | 37.8 | Yes | UPSNet: A Unified Panoptic Segmentation Network | 2019-01-12 | Code |
| 21 | TASCNet (ResNet-50) | 37.6 | Yes | Learning to Fuse Things and Stuff | 2018-12-04 | - |
| 22 | MRCNN + PSPNet (ResNet-101) | 36.4 | Yes | Panoptic Segmentation | 2018-01-03 | Code |
| 23 | AdaptIS (ResNeXt-101) | 36.3 | No | AdaptIS: Adaptive Instance Selection Network | 2019-09-17 | - |
| 24 | AUNet (ResNet-101-FPN) | 34.4 | No | Attention-guided Unified Network for Panoptic Se... | 2018-12-10 | - |
| 25 | COPS (ResNet-50) | 34.1 | No | Combinatorial Optimization for Panoptic Segmenta... | 2021-06-06 | Code |
| 26 | AdaptIS (ResNet-101) | 33.9 | No | AdaptIS: Adaptive Instance Selection Network | 2019-09-17 | - |
| 27 | UPSNet (ResNet-50) | 33.3 | No | UPSNet: A Unified Panoptic Segmentation Network | 2019-01-12 | Code |
| 28 | Panoptic FPN (ResNet-101) | 33.0 | No | Panoptic Feature Pyramid Networks | 2019-01-08 | Code |
| 29 | AdaptIS (ResNet-50) | 32.3 | No | AdaptIS: Adaptive Instance Selection Network | 2019-09-17 | - |
| 30 | Dynamically Instantiated Network (ResNet-101) | 28.6 | No | Weakly- and Semi-Supervised Panoptic Segmentation | 2018-08-10 | Code |