Panoptic Segmentation on COCO minival

Metric: PQth (Panoptic Quality over "thing" classes; higher is better)
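
For readers unfamiliar with the metric, here is a minimal sketch of how a PQth-style score could be computed. It assumes predicted and ground-truth segments have already been matched (a match is a pair with IoU above 0.5) and that per-class counts are available. The function names and the input layout are hypothetical, chosen for illustration; this is not the official COCO evaluation code.

```python
from typing import Dict, Iterable, Sequence, Tuple

# Hypothetical per-class record: (IoUs of matched segments, #false positives, #false negatives)
ClassStats = Tuple[Sequence[float], int, int]


def panoptic_quality(matched_ious: Sequence[float], num_fp: int, num_fn: int) -> float:
    """PQ for one class: sum of matched IoUs / (TP + 0.5 * FP + 0.5 * FN)."""
    num_tp = len(matched_ious)
    denom = num_tp + 0.5 * num_fp + 0.5 * num_fn
    if denom == 0:
        # Assumption for this sketch: a class with no segments at all scores 0.
        return 0.0
    return sum(matched_ious) / denom


def pq_things(per_class: Dict[str, ClassStats], thing_classes: Iterable[str]) -> float:
    """Average per-class PQ over the 'thing' classes only (stuff classes are ignored)."""
    scores = [panoptic_quality(*per_class[c]) for c in thing_classes if c in per_class]
    return sum(scores) / len(scores) if scores else 0.0


# Toy numbers, invented purely for illustration.
stats = {
    "person": ([0.8, 0.6], 1, 1),  # 2 matches, 1 false positive, 1 false negative
    "car": ([0.9], 0, 0),          # 1 match, no errors
    "sky": ([0.7], 0, 0),          # a stuff class, excluded from PQth
}
score = pq_things(stats, ["person", "car"])
```

The scores in the table below appear to be on a 0-100 scale, so a value from this sketch would be multiplied by 100 before comparison.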

Results

#	Model	PQth	Extra Data	SOTA	Paper	Date
1	OneFormer (InternImage-H, single-scale)	67.1	No	Yes	OneFormer: One Transformer to Rule Universal Image Segmentation	2022-11-10
2	ViT-Adapter-L (single-scale, BEiTv2 pretrain, Mask2Former)	65	No	Yes	Vision Transformer Adapter for Dense Predictions	2022-05-17
3	DiNAT-L (single-scale, Mask2Former)	64.9	No	No	Dilated Neighborhood Attention Transformer	2022-09-29
4	Visual Attention Network (VAN-B6 + Mask2Former)	64.8	No	Yes	Visual Attention Network	2022-02-20
5	OneFormer (Swin-L, single-scale)	64.4	No	No	OneFormer: One Transformer to Rule Universal Image Segmentation	2022-11-10
6	kMaX-DeepLab (single-scale, pseudo-labels)	64.3	Yes	No	kMaX-DeepLab: k-means Mask Transformer	2022-07-08
7	OneFormer (DiNAT-L, single-scale)	64.3	No	No	OneFormer: One Transformer to Rule Universal Image Segmentation	2022-11-10
8	kMaX-DeepLab (single-scale, drop query with 256 queries)	64.2	No	No	kMaX-DeepLab: k-means Mask Transformer	2022-07-08
9	Mask2Former (single-scale)	64.2	No	Yes	Masked-attention Mask Transformer for Universal Image Segmentation	2021-12-02
10	kMaX-DeepLab (single-scale)	64	No	No	kMaX-DeepLab: k-means Mask Transformer	2022-07-08
11	Panoptic SegFormer (single-scale)	61.7	No	Yes	Panoptic SegFormer: Delving Deeper into Panoptic Segmentation with Transformers	2021-09-08
12	CMT-DeepLab (single-scale)	61	No	No	CMT-DeepLab: Clustering Mask Transformers for Panoptic Segmentation	2022-06-17
13	MaskFormer (single-scale)	58.5	No	No	Per-Pixel Classification is Not All You Need for Semantic Segmentation	2021-07-13
14	Panoptic FCN* (Swin-L, single-scale)	58.5	No	Yes	Fully Convolutional Networks for Panoptic Segmentation	2020-12-01
15	MaX-DeepLab-L (single-scale)	57	No	No	MaX-DeepLab: End-to-End Panoptic Segmentation with Mask Transformers	2020-12-01
16	Panoptic SegFormer (ResNet-101)	55.5	No	No	Panoptic SegFormer: Delving Deeper into Panoptic Segmentation with Transformers	2021-09-08
17	PanopticFPN+ResNeSt (single-scale)	55.1	No	Yes	ResNeSt: Split-Attention Networks	2020-04-19
18	PanopticFPN++	51	No	No	End-to-End Object Detection with Transformers	2020-05-26
19	DETR-R101 (ResNet-101)	50.5	No	No	End-to-End Object Detection with Transformers	2020-05-26
20	Panoptic FCN* (ResNet-50-FPN)	50	No	No	Fully Convolutional Networks for Panoptic Segmentation	2020-12-01
21	Axial-DeepLab-L (multi-scale)	48.6	No	Yes	Axial-DeepLab: Stand-Alone Axial-Attention for Panoptic Segmentation	2020-03-17
22	Axial-DeepLab-L (single-scale)	48.5	No	No	Axial-DeepLab: Stand-Alone Axial-Attention for Panoptic Segmentation	2020-03-17