Panoptic Segmentation on Cityscapes val

Metric: PQ (higher is better)
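
For readers unfamiliar with the metric, PQ (panoptic quality) for a single class is commonly defined as the sum of IoUs over matched segment pairs, divided by |TP| + 0.5 |FP| + 0.5 |FN|, where a predicted and a ground-truth segment count as a match only if their IoU exceeds 0.5. The per-class values are then averaged, and leaderboards such as this one report the result multiplied by 100. The sketch below illustrates that arithmetic only; the function names `class_pq` and `panoptic_quality` and all the example numbers are hypothetical, and it is not the evaluation code used by any entry in the table.

```python
from statistics import mean


def class_pq(matched_ious, num_fp, num_fn):
    """PQ for one class (hypothetical helper, not an official API).

    matched_ious: IoU of each matched (true-positive) segment pair.
    num_fp: unmatched predicted segments; num_fn: unmatched ground-truth segments.
    """
    if any(iou <= 0.5 for iou in matched_ious):
        raise ValueError("a match requires IoU > 0.5")
    denom = len(matched_ious) + 0.5 * num_fp + 0.5 * num_fn
    if denom == 0:
        return None  # class absent from both prediction and ground truth
    return sum(matched_ious) / denom


def panoptic_quality(per_class_stats):
    """Average the per-class PQ over classes that actually occur."""
    scores = [class_pq(*stats) for stats in per_class_stats.values()]
    return mean(s for s in scores if s is not None)


# Made-up numbers for illustration only.
example = {
    "car": ([0.75, 0.875, 0.625], 1, 2),  # 3 matches, 1 FP, 2 FN
    "road": ([1.0], 0, 0),                # 1 perfect match
}
print(100 * panoptic_quality(example))  # 75.0
```

In this example the "car" class scores 2.25 / 4.5 = 0.5 and "road" scores 1.0, so the average is 0.75, which would appear as 75.0 in the PQ column.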

Results

#	Model	PQ	Extra Data	SOTA	Paper	Date	Code
1	ViT-P (OneFormer, InternImage-H)	70.8	No	Yes	The Missing Point in Vision Transformers for Universal Image Segmentation	2025-05-26	Code
2	OneFormer (ConvNeXt-L, single-scale, 512x1024, Mapillary Vistas-pretrained)	70.1	Yes	Yes	OneFormer: One Transformer to Rule Universal Image Segmentation	2022-11-10	Code
3	Panoptic-DeepLab (SWideRNet [1, 1, 4.5], Mapillary Vistas, multi-scale)	69.6	Yes	Yes	Scaling Wide Residual Networks for Panoptic Segmentation	2020-11-23	-
4	OneFormer (ConvNeXt-L, single-scale)	68.51	No	No	OneFormer: One Transformer to Rule Universal Image Segmentation	2022-11-10	Code
5	Axial-DeepLab-XL (Mapillary Vistas, multi-scale)	68.5	Yes	Yes	Axial-DeepLab: Stand-Alone Axial-Attention for Panoptic Segmentation	2020-03-17	Code
6	Panoptic-DeepLab (SWideRNet [1, 1, 4.5], Mapillary Vistas, single-scale)	68.5	Yes	No	Scaling Wide Residual Networks for Panoptic Segmentation	2020-11-23	-
7	OneFormer (ConvNeXt-XL, single-scale)	68.4	No	No	OneFormer: One Transformer to Rule Universal Image Segmentation	2022-11-10	Code
8	kMaX-DeepLab (single-scale)	68.4	No	No	kMaX-DeepLab: k-means Mask Transformer	2022-07-08	Code
9	AFF-Base (single-scale, point-based Mask2Former)	67.7	No	No	AutoFocusFormer: Image Segmentation off the Grid	2023-04-24	Code
10	OneFormer (DiNAT-L, single-scale)	67.6	No	No	OneFormer: One Transformer to Rule Universal Image Segmentation	2022-11-10	Code
11	EfficientPS	67.5	Yes	No	EfficientPS: Efficient Panoptic Segmentation	2020-04-05	Code
12	DiNAT-L (Mask2Former)	67.2	No	No	Dilated Neighborhood Attention Transformer	2022-09-29	Code
13	OneFormer (Swin-L, single-scale)	67.2	No	No	OneFormer: One Transformer to Rule Universal Image Segmentation	2022-11-10	Code
14	AFF-Small (single-scale, point-based Mask2Former)	66.9	No	No	AutoFocusFormer: Image Segmentation off the Grid	2023-04-24	Code
15	Mask2Former (Swin-L)	66.6	No	No	Masked-attention Mask Transformer for Universal Image Segmentation	2021-12-02	Code
16	EfficientPS (Cityscapes-fine)	64.9	No	No	EfficientPS: Efficient Panoptic Segmentation	2020-04-05	Code
17	CMT-DeepLab (MaX-S, single-scale, IN-1K)	64.6	No	No	CMT-DeepLab: Clustering Mask Transformers for Panoptic Segmentation	2022-06-17	Code
18	Panoptic-DeepLab (X71)	64.1	Yes	Yes	Panoptic-DeepLab: A Simple, Strong, and Fast Baseline for Bottom-Up Panoptic Segmentation	2019-11-22	Code
19	Mask2Former + Intra-Batch Supervision (ResNet-50)	62.4	No	No	Intra-Batch Supervision for Panoptic Segmentation on High-Resolution Images	2023-04-17	Code
20	COPS (ResNet-50)	62.1	No	No	Combinatorial Optimization for Panoptic Segmentation: A Fully Differentiable Approach	2021-06-06	Code
21	AdaptIS (ResNeXt-101)	62	No	Yes	AdaptIS: Adaptive Instance Selection Network	2019-09-17	-
22	UPSNet (ResNet-101, multiscale)	61.8	Yes	Yes	UPSNet: A Unified Panoptic Segmentation Network	2019-01-12	Code
23	Panoptic FCN* (ResNet-FPN)	61.4	No	No	Fully Convolutional Networks for Panoptic Segmentation	2020-12-01	Code
24	MRCNN + PSPNet (ResNet-101)	61.2	Yes	Yes	Panoptic Segmentation	2018-01-03	Code
25	AdaptIS (ResNet-101)	60.6	No	No	AdaptIS: Adaptive Instance Selection Network	2019-09-17	-
26	UPSNet (ResNet-101)	60.5	Yes	No	UPSNet: A Unified Panoptic Segmentation Network	2019-01-12	Code
27	TASCNet (ResNet-50, multi-scale)	60.4	Yes	No	Learning to Fuse Things and Stuff	2018-12-04	-
28	UPSNet (ResNet-50)	59.3	No	No	UPSNet: A Unified Panoptic Segmentation Network	2019-01-12	Code
29	TASCNet (ResNet-50)	59.2	Yes	No	Learning to Fuse Things and Stuff	2018-12-04	-
30	AUNet (ResNet-101-FPN)	59	No	No	Attention-guided Unified Network for Panoptic Segmentation	2018-12-10	-
31	AdaptIS (ResNet-50)	59	No	No	AdaptIS: Adaptive Instance Selection Network	2019-09-17	-
32	Panoptic FPN (ResNet-101)	58.1	No	No	Panoptic Feature Pyramid Networks	2019-01-08	Code
33	DeeperLab (Xception-71)	56.5	No	No	DeeperLab: Single-Shot Image Parser	2019-02-13	-
34	Dynamically Instantiated Network (ResNet-101)	53.8	No	No	Weakly- and Semi-Supervised Panoptic Segmentation	2018-08-10	Code