Semantic Segmentation on COCO minival

Metric: mIoU (higher is better)
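
For readers unfamiliar with the metric, here is a minimal sketch of how mean Intersection over Union can be computed from flattened per-pixel class labels. It is an illustration only, not the evaluation script used by any listed submission. The function name `mean_iou`, the `ignore_index=255` convention, and the choice to skip classes that appear in neither the prediction nor the ground truth are all assumptions. The scores listed here appear to be on a 0-100 scale, so the fraction returned below would be multiplied by 100.

```python
def mean_iou(pred, target, num_classes, ignore_index=255):
    """Mean IoU over the classes present in pred or target.

    pred and target are equal-length sequences of integer class ids.
    Pixels whose target equals ignore_index (an assumed convention) are skipped.
    """
    intersection = [0] * num_classes
    pred_count = [0] * num_classes
    target_count = [0] * num_classes

    for p, t in zip(pred, target):
        if t == ignore_index:
            continue
        pred_count[p] += 1
        target_count[t] += 1
        if p == t:
            intersection[p] += 1

    ious = []
    for c in range(num_classes):
        # |A ∪ B| = |A| + |B| - |A ∩ B|
        union = pred_count[c] + target_count[c] - intersection[c]
        if union > 0:  # skip classes absent from both pred and target
            ious.append(intersection[c] / union)

    return sum(ious) / len(ious) if ious else float("nan")


# Toy example: six pixels, three classes.
pred = [0, 0, 1, 1, 2, 2]
target = [0, 1, 1, 1, 2, 0]
score = mean_iou(pred, target, num_classes=3)
print(f"mIoU = {100 * score:.1f}")
```

In the toy example the per-class IoUs are 1/3, 2/3, and 1/2. Benchmark protocols typically accumulate the intersection and union counts over the whole evaluation set before dividing, rather than averaging per-image scores.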

Results

#	Model	mIoU	Extra Data	SOTA	Paper	Date	Code
1	UMG-CLIP-E/14	69.7	Yes	Yes	UMG-CLIP: A Unified Multi-Granularity Vision Generalist for Open-World Understanding	2024-01-12	Code
2	UMG-CLIP-L/14	68.9	Yes	No	UMG-CLIP: A Unified Multi-Granularity Vision Generalist for Open-World Understanding	2024-01-12	Code
3	OneFormer (InternImage-H, single-scale)	68.8	No	Yes	OneFormer: One Transformer to Rule Universal Image Segmentation	2022-11-10	Code
4	DiNAT-L (single-scale, Mask2Former)	68.3	No	Yes	Dilated Neighborhood Attention Transformer	2022-09-29	Code
5	OneFormer (DiNAT-L, single-scale)	68.1	No	No	OneFormer: One Transformer to Rule Universal Image Segmentation	2022-11-10	Code
6	OneFormer (Swin-L, single-scale)	67.4	No	No	OneFormer: One Transformer to Rule Universal Image Segmentation	2022-11-10	Code
7	HIPIE (ViT-H, single-scale)	66.8	Yes	No	Hierarchical Open-vocabulary Universal Image Segmentation	2023-07-03	Code