Metric: PQst (panoptic quality on stuff classes; higher is better)
| # | Model | PQst | Extra Data | Paper | Date | Code |
|---|---|---|---|---|---|---|
| 1 | OneFormer (InternImage-H, single-scale) | 49.2 | No | OneFormer: One Transformer to Rule Universal Ima... | 2022-11-10 | Code |
| 2 | DiNAT-L (single-scale, Mask2Former) | 48.8 | No | Dilated Neighborhood Attention Transformer | 2022-09-29 | Code |
| 3 | kMaX-DeepLab (single-scale, pseudo-labels) | 48.8 | Yes | kMaX-DeepLab: k-means Mask Transformer | 2022-07-08 | Code |
| 4 | kMaX-DeepLab (single-scale, drop query with 256 queries) | 48.6 | No | kMaX-DeepLab: k-means Mask Transformer | 2022-07-08 | Code |
| 5 | kMaX-DeepLab (single-scale) | 48.6 | No | kMaX-DeepLab: k-means Mask Transformer | 2022-07-08 | Code |
| 6 | ViT-Adapter-L (single-scale, BEiTv2 pretrain, Mask2Former) | 48.4 | No | Vision Transformer Adapter for Dense Predictions | 2022-05-17 | Code |
| 7 | OneFormer (DiNAT-L, single-scale) | 48.4 | No | OneFormer: One Transformer to Rule Universal Ima... | 2022-11-10 | Code |
| 8 | Visual Attention Network (VAN-B6 + Mask2Former) | 48.2 | No | Visual Attention Network | 2022-02-20 | Code |
| 9 | Mask2Former (single-scale) | 48.1 | No | Masked-attention Mask Transformer for Universal ... | 2021-12-02 | Code |
| 10 | OneFormer (Swin-L, single-scale) | 48 | No | OneFormer: One Transformer to Rule Universal Ima... | 2022-11-10 | Code |
| 11 | Panoptic SegFormer (single-scale) | 46.9 | No | Panoptic SegFormer: Delving Deeper into Panoptic... | 2021-09-08 | Code |
| 12 | CMT-DeepLab (single-scale) | 46.6 | No | CMT-DeepLab: Clustering Mask Transformers for Pa... | 2022-06-17 | Code |
| 13 | MaskFormer (single-scale) | 44 | No | Per-Pixel Classification is Not All You Need for... | 2021-07-13 | Code |
| 14 | Panoptic SegFormer (ResNet-101) | 43.2 | No | Panoptic SegFormer: Delving Deeper into Panoptic... | 2021-09-08 | Code |
| 15 | MaX-DeepLab-L (single-scale) | 42.2 | No | MaX-DeepLab: End-to-End Panoptic Segmentation wi... | 2020-12-01 | Code |
| 16 | PanopticFPN+ResNeSt (single-scale) | 37 | No | ResNeSt: Split-Attention Networks | 2020-04-19 | Code |
| 17 | DETR-R101 (ResNet-101) | 37 | No | End-to-End Object Detection with Transformers | 2020-05-26 | Code |
| 18 | Axial-DeepLab-L (multi-scale) | 36.8 | No | Axial-DeepLab: Stand-Alone Axial-Attention for P... | 2020-03-17 | Code |
| 19 | Panoptic FCN* (ResNet-50-FPN) | 35.6 | No | Fully Convolutional Networks for Panoptic Segmen... | 2020-12-01 | Code |
| 20 | Axial-DeepLab-L (single-scale) | 35.6 | No | Axial-DeepLab: Stand-Alone Axial-Attention for P... | 2020-03-17 | Code |
| 21 | PanopticFPN++ | 33.6 | No | End-to-End Object Detection with Transformers | 2020-05-26 | Code |