Unsupervised Semantic Segmentation on COCO-Stuff-171
Metric: mIoU (higher is better)
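For reference, mean intersection-over-union averages the per-class ratio TP / (TP + FP + FN) over the classes. The sketch below is a minimal pure-Python illustration on flattened label lists, not the evaluation code behind any entry on this leaderboard. The function name `mean_iou`, the `ignore_index` parameter, and the choice to skip classes absent from both prediction and ground truth are assumptions made for the example. The page does not state each paper's exact protocol (for instance, how predicted segments are mapped to the 171 classes), and the scores listed here appear to be on a 0-100 scale, whereas the function returns a fraction.

```python
def mean_iou(pred, target, num_classes, ignore_index=None):
    """Mean IoU over the classes that occur in pred or target.

    pred and target are equal-length sequences of integer class ids,
    e.g. flattened segmentation maps. Hypothetical helper for illustration.
    """
    # conf[t][p] counts pixels whose ground truth is t and prediction is p.
    conf = [[0] * num_classes for _ in range(num_classes)]
    for p, t in zip(pred, target):
        if t == ignore_index:
            continue  # skip unlabeled pixels
        conf[t][p] += 1

    ious = []
    for c in range(num_classes):
        tp = conf[c][c]
        fp = sum(conf[t][c] for t in range(num_classes)) - tp
        fn = sum(conf[c]) - tp
        union = tp + fp + fn
        if union > 0:  # assumption: classes absent everywhere are skipped
            ious.append(tp / union)
    return sum(ious) / len(ious) if ious else 0.0


if __name__ == "__main__":
    pred = [0, 0, 1, 1, 2, 2]
    target = [0, 1, 1, 1, 2, 0]
    # Per-class IoU is 1/3, 2/3 and 1/2, so the mean is 0.5.
    print(mean_iou(pred, target, num_classes=3))
```

Multiplying the result by 100 gives a value comparable in scale to the mIoU column.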
Results
| # | Model | mIoU | Extra Data | SOTA tag | Paper | Date | Code |
|---|-------|------|------------|----------|-------|------|------|
| 1 | CorrCLIP | 34 | No | SOTA | CorrCLIP: Reconstructing Correlations in CLIP with Off-the-Shelf Foundation Models for Open-Vocabulary Semantic Segmentation | 2024-11-15 | Yes |
| 2 | TextRegion | 31.2 | No | | TextRegion: Text-Aligned Region Tokens from Frozen Image-Text Models | 2025-05-29 | Yes |
| 3 | Trident | 28.6 | No | SOTA | Harnessing Vision Foundation Models for High-Performance, Training-Free Open Vocabulary Segmentation | 2024-11-14 | Yes |
| 4 | ProxyCLIP | 26.8 | No | SOTA | ProxyCLIP: Proxy Attention Improves CLIP for Open-Vocabulary Segmentation | 2024-08-09 | Yes |
| 5 | TagAlign | 25.3 | No | SOTA | TagAlign: Improving Vision-Language Alignment with Multi-Tag Classification | 2023-12-21 | Yes |
| 6 | TTD (TCL) | 23.7 | No | | TTD: Text-Tag Self-Distillation Enhancing Image-Text Alignment in CLIP to Alleviate Single Tag Bias | 2024-03-30 | Yes |
| 7 | COSMOS ViT-B/16 | 23.2 | No | | COSMOS: Cross-Modality Self-Distillation for Vision Language Pre-training | 2024-12-02 | Yes |
| 8 | TCL | 22.4 | No | SOTA | Learning to Generate Text-grounded Mask for Open-world Semantic Segmentation from Only Image-Text Pairs | 2022-12-01 | Yes |
| 9 | TTD (MaskCLIP) | 19.4 | No | | TTD: Text-Tag Self-Distillation Enhancing Image-Text Alignment in CLIP to Alleviate Single Tag Bias | 2024-03-30 | Yes |
| 10 | MaskCLIP | 16.4 | No | SOTA | Extract Free Dense Labels from CLIP | 2021-12-02 | Yes |
| 11 | CAUSE-TR (ViT-S/8) | 15.2 | No | | Causal Unsupervised Semantic Segmentation | 2023-10-11 | Yes |
| 12 | ReCo | 14.8 | No | | ReCo: Retrieve and Co-segment for Zero-shot Transfer | 2022-06-14 | Yes |
| 13 | TransFGU (ViT-S/8) | 11.93 | Yes | | TransFGU: A Top-down Approach to Fine-Grained Unsupervised Semantic Segmentation | 2021-12-02 | Yes |
| 14 | GroupViT | 11.1 | No | | GroupViT: Semantic Segmentation Emerges from Text Supervision | 2022-02-22 | Yes |
| 15 | PiCIE (ResNet-50) | 5.6 | No | SOTA | PiCIE: Unsupervised Semantic Segmentation using Invariance and Equivariance in Clustering | 2021-03-30 | Yes |
| 16 | IIC (ResNet-50) | 2.2 | No | SOTA | Invariant Information Clustering for Unsupervised Image Classification and Segmentation | 2018-07-17 | Yes |