Panoptic Segmentation on Cityscapes val
Metric: PQ (higher is better)
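PQ is read here as Panoptic Quality, the metric introduced in the "Panoptic Segmentation" paper listed in the results; the page itself does not expand the abbreviation, so that reading is an assumption. Under that assumption, a minimal sketch of the per-class computation is given below. The function name `panoptic_quality` and its parameters are hypothetical, and the matching of predicted to ground-truth segments (a pair counts as a true positive when its IoU exceeds 0.5) is assumed to have been done by the caller.

```python
def panoptic_quality(matched_ious, num_false_positives, num_false_negatives):
    """Panoptic Quality for a single class.

    matched_ious: IoU of each matched (predicted, ground-truth) segment pair,
        i.e. the true positives; each value is assumed to be > 0.5.
    num_false_positives: predicted segments with no matching ground truth.
    num_false_negatives: ground-truth segments with no matching prediction.
    """
    num_true_positives = len(matched_ious)
    denominator = (
        num_true_positives
        + 0.5 * num_false_positives
        + 0.5 * num_false_negatives
    )
    if denominator == 0:
        # No segments of this class in prediction or ground truth.
        return 0.0
    return sum(matched_ious) / denominator


# Hypothetical numbers for illustration only, not taken from any listed paper.
pq = panoptic_quality([0.9, 0.8, 0.7], num_false_positives=1, num_false_negatives=1)
print(round(100 * pq, 1))  # prints 60.0
```

The reported PQ is typically this quantity averaged over classes and scaled to a percentage, which is why the scores in the results fall in the 50 to 71 range.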
Results
| # | Model | PQ | Extra Data | SOTA | Paper | Date | Code |
|---|---|---|---|---|---|---|---|
| 1 | ViT-P (OneFormer, InternImage-H) | 70.8 | No | SOTA | The Missing Point in Vision Transformers for Universal Image Segmentation | 2025-05-26 | Code |
| 2 | OneFormer (ConvNeXt-L, single-scale, 512x1024, Mapillary Vistas-pretrained) | 70.1 | Yes | SOTA | OneFormer: One Transformer to Rule Universal Image Segmentation | 2022-11-10 | Code |
| 3 | Panoptic-DeepLab (SWideRNet [1, 1, 4.5], Mapillary Vistas, multi-scale) | 69.6 | Yes | SOTA | Scaling Wide Residual Networks for Panoptic Segmentation | 2020-11-23 | - |
| 4 | OneFormer (ConvNeXt-L, single-scale) | 68.51 | No | | OneFormer: One Transformer to Rule Universal Image Segmentation | 2022-11-10 | Code |
| 5 | Axial-DeepLab-XL (Mapillary Vistas, multi-scale) | 68.5 | Yes | SOTA | Axial-DeepLab: Stand-Alone Axial-Attention for Panoptic Segmentation | 2020-03-17 | Code |
| 6 | Panoptic-DeepLab (SWideRNet [1, 1, 4.5], Mapillary Vistas, single-scale) | 68.5 | Yes | | Scaling Wide Residual Networks for Panoptic Segmentation | 2020-11-23 | - |
| 7 | OneFormer (ConvNeXt-XL, single-scale) | 68.4 | No | | OneFormer: One Transformer to Rule Universal Image Segmentation | 2022-11-10 | Code |
| 8 | kMaX-DeepLab (single-scale) | 68.4 | No | | kMaX-DeepLab: k-means Mask Transformer | 2022-07-08 | Code |
| 9 | AFF-Base (single-scale, point-based Mask2Former) | 67.7 | No | | AutoFocusFormer: Image Segmentation off the Grid | 2023-04-24 | Code |
| 10 | OneFormer (DiNAT-L, single-scale) | 67.6 | No | | OneFormer: One Transformer to Rule Universal Image Segmentation | 2022-11-10 | Code |
| 11 | EfficientPS | 67.5 | Yes | | EfficientPS: Efficient Panoptic Segmentation | 2020-04-05 | Code |
| 12 | DiNAT-L (Mask2Former) | 67.2 | No | | Dilated Neighborhood Attention Transformer | 2022-09-29 | Code |
| 13 | OneFormer (Swin-L, single-scale) | 67.2 | No | | OneFormer: One Transformer to Rule Universal Image Segmentation | 2022-11-10 | Code |
| 14 | AFF-Small (single-scale, point-based Mask2Former) | 66.9 | No | | AutoFocusFormer: Image Segmentation off the Grid | 2023-04-24 | Code |
| 15 | Mask2Former (Swin-L) | 66.6 | No | | Masked-attention Mask Transformer for Universal Image Segmentation | 2021-12-02 | Code |
| 16 | EfficientPS (Cityscapes-fine) | 64.9 | No | | EfficientPS: Efficient Panoptic Segmentation | 2020-04-05 | Code |
| 17 | CMT-DeepLab (MaX-S, single-scale, IN-1K) | 64.6 | No | | CMT-DeepLab: Clustering Mask Transformers for Panoptic Segmentation | 2022-06-17 | Code |
| 18 | Panoptic-DeepLab (X71) | 64.1 | Yes | SOTA | Panoptic-DeepLab: A Simple, Strong, and Fast Baseline for Bottom-Up Panoptic Segmentation | 2019-11-22 | Code |
| 19 | Mask2Former + Intra-Batch Supervision (ResNet-50) | 62.4 | No | | Intra-Batch Supervision for Panoptic Segmentation on High-Resolution Images | 2023-04-17 | Code |
| 20 | COPS (ResNet-50) | 62.1 | No | | Combinatorial Optimization for Panoptic Segmentation: A Fully Differentiable Approach | 2021-06-06 | Code |
| 21 | AdaptIS (ResNeXt-101) | 62 | No | SOTA | AdaptIS: Adaptive Instance Selection Network | 2019-09-17 | - |
| 22 | UPSNet (ResNet-101, multiscale) | 61.8 | Yes | SOTA | UPSNet: A Unified Panoptic Segmentation Network | 2019-01-12 | Code |
| 23 | Panoptic FCN* (ResNet-FPN) | 61.4 | No | | Fully Convolutional Networks for Panoptic Segmentation | 2020-12-01 | Code |
| 24 | MRCNN + PSPNet (ResNet-101) | 61.2 | Yes | SOTA | Panoptic Segmentation | 2018-01-03 | Code |
| 25 | AdaptIS (ResNet-101) | 60.6 | No | | AdaptIS: Adaptive Instance Selection Network | 2019-09-17 | - |
| 26 | UPSNet (ResNet-101) | 60.5 | Yes | | UPSNet: A Unified Panoptic Segmentation Network | 2019-01-12 | Code |
| 27 | TASCNet (ResNet-50, multi-scale) | 60.4 | Yes | | Learning to Fuse Things and Stuff | 2018-12-04 | - |
| 28 | UPSNet (ResNet-50) | 59.3 | No | | UPSNet: A Unified Panoptic Segmentation Network | 2019-01-12 | Code |
| 29 | TASCNet (ResNet-50) | 59.2 | Yes | | Learning to Fuse Things and Stuff | 2018-12-04 | - |
| 30 | AUNet (ResNet-101-FPN) | 59 | No | | Attention-guided Unified Network for Panoptic Segmentation | 2018-12-10 | - |
| 31 | AdaptIS (ResNet-50) | 59 | No | | AdaptIS: Adaptive Instance Selection Network | 2019-09-17 | - |
| 32 | Panoptic FPN (ResNet-101) | 58.1 | No | | Panoptic Feature Pyramid Networks | 2019-01-08 | Code |
| 33 | DeeperLab (Xception-71) | 56.5 | No | | DeeperLab: Single-Shot Image Parser | 2019-02-13 | - |
| 34 | Dynamically Instantiated Network (ResNet-101) | 53.8 | No | | Weakly- and Semi-Supervised Panoptic Segmentation | 2018-08-10 | Code |