Panoptic Segmentation on ADE20K val
Metric: mIoU (higher is better)
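For readers unfamiliar with the metric, the following is a minimal sketch of how mean intersection-over-union is commonly computed from per-pixel class labels. It is not the evaluation code behind this leaderboard: the function name `mean_iou`, the `ignore_label` parameter, and the toy labels are illustrative assumptions, and the final scaling assumes the scores listed here are percentages.

```python
from collections import Counter


def mean_iou(pred, target, num_classes, ignore_label=None):
    """Mean IoU over flat sequences of per-pixel class labels (hypothetical helper)."""
    intersection = Counter()
    pred_count = Counter()
    target_count = Counter()
    for p, t in zip(pred, target):
        if t == ignore_label:
            continue  # pixels with the ignore label do not count
        pred_count[p] += 1
        target_count[t] += 1
        if p == t:
            intersection[p] += 1

    ious = []
    for c in range(num_classes):
        union = pred_count[c] + target_count[c] - intersection[c]
        if union > 0:  # skip classes absent from both prediction and target
            ious.append(intersection[c] / union)
    return sum(ious) / len(ious) if ious else 0.0


# Toy example with three classes and six pixels (made-up labels).
pred = [0, 0, 1, 1, 2, 2]
target = [0, 1, 1, 1, 2, 0]
score = mean_iou(pred, target, num_classes=3)
print(f"mIoU = {100 * score:.1f}")  # per-class IoUs are 1/3, 2/3, 1/2
```

Averaging per class rather than per pixel means that rare classes weigh as much as frequent ones, which is why a model can have high pixel accuracy and still a modest mIoU.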
Results
| # | Model | mIoU | Extra Data | Paper | Date |
|---|-------|------|------------|-------|------|
| 1 | OneFormer (InternImage-H, emb_dim=256, single-scale, 896x896) | 60.4 | No | OneFormer: One Transformer to Rule Universal Image Segmentation | 2022-11-10 |
| 2 | X-Decoder (Davit-d5, Deform, single-scale, 1280x1280) | 59.1 | Yes | Generalized Decoding for Pixel, Image, and Language | 2022-12-21 |
| 3 | OneFormer (DiNAT-L, single-scale, 1280x1280, COCO-Pretrain) | 58.9 | Yes | OneFormer: One Transformer to Rule Universal Image Segmentation | 2022-11-10 |
| 4 | OneFormer (DiNAT-L, single-scale, 1280x1280) | 58.3 | No | OneFormer: One Transformer to Rule Universal Image Segmentation | 2022-11-10 |
| 5 | OneFormer (DiNAT-L, single-scale, 640x640) | 58.3 | No | OneFormer: One Transformer to Rule Universal Image Segmentation | 2022-11-10 |
| 6 | X-Decoder (L) | 58.1 | Yes | Generalized Decoding for Pixel, Image, and Language | 2022-12-21 |
| 7 | OneFormer (ConvNeXt-XL, single-scale, 640x640) | 57.4 | No | OneFormer: One Transformer to Rule Universal Image Segmentation | 2022-11-10 |
| 8 | OneFormer (Swin-L, single-scale, 1280x1280) | 57 | No | OneFormer: One Transformer to Rule Universal Image Segmentation | 2022-11-10 |
| 9 | OneFormer (Swin-L, single-scale, 640x640) | 57 | No | OneFormer: One Transformer to Rule Universal Image Segmentation | 2022-11-10 |
| 10 | OneFormer (ConvNeXt-L, single-scale, 640x640) | 56.6 | No | OneFormer: One Transformer to Rule Universal Image Segmentation | 2022-11-10 |
| 11 | DiNAT-L (Mask2Former, 640x640) | 56.3 | No | Dilated Neighborhood Attention Transformer | 2022-09-29 |
| 12 | Mask2Former (Swin-L + FAPN, 640x640) | 55.4 | No | Masked-attention Mask Transformer for Universal Image Segmentation | 2021-12-02 |
| 13 | kMaX-DeepLab (ConvNeXt-L, single-scale, 1281x1281) | 55.2 | No | kMaX-DeepLab: k-means Mask Transformer | 2022-07-08 |
| 14 | kMaX-DeepLab (ConvNeXt-L, single-scale, 641x641) | 54.8 | No | kMaX-DeepLab: k-means Mask Transformer | 2022-07-08 |
| 15 | Mask2Former (Swin-L) | 54.5 | No | Masked-attention Mask Transformer for Universal Image Segmentation | 2021-12-02 |
| 16 | Panoptic-DeepLab (SwideRNet) | 50 | No | Masked-attention Mask Transformer for Universal Image Segmentation | 2021-12-02 |
| 17 | Mask2Former (ResNet-50, 640x640) | 46.1 | No | Masked-attention Mask Transformer for Universal Image Segmentation | 2021-12-02 |
| 18 | kMaX-DeepLab (ResNet50, single-scale, 1281x1281) | 45.3 | No | kMaX-DeepLab: k-means Mask Transformer | 2022-07-08 |
| 19 | kMaX-DeepLab (ResNet50, single-scale, 641x641) | 45 | No | kMaX-DeepLab: k-means Mask Transformer | 2022-07-08 |

Entries 1, 11, and 12 carry the SOTA badge. Every entry has a code link.