Metric: mIoU (mean Intersection over Union, in %; higher is better)
| # | Model | mIoU | Extra Data | Paper | Date | Code |
|---|---|---|---|---|---|---|
| 1 | CorrCLIP | 49.4 | No | CorrCLIP: Reconstructing Correlations in CLIP wi... | 2024-11-15 | Code |
| 2 | Trident | 42.2 | No | Harnessing Vision Foundation Models for High-Per... | 2024-11-14 | Code |
| 3 | ProxyCLIP | 39.2 | No | ProxyCLIP: Proxy Attention Improves CLIP for Ope... | 2024-08-09 | Code |
| 4 | TTD (TCL) | 37.4 | No | TTD: Text-Tag Self-Distillation Enhancing Image-... | 2024-03-30 | Code |
| 5 | CLS-SEG | 35.3 | No | TagCLIP: A Local-to-Global Framework to Enhance ... | 2023-12-20 | Code |
| 6 | TagAlign | 33.3 | No | TagAlign: Improving Vision-Language Alignment wi... | 2023-12-21 | Code |
| 7 | TCL | 31.6 | No | Learning to Generate Text-grounded Mask for Open... | 2022-12-01 | Code |
| 8 | COSMOS ViT-B/16 | 31.3 | No | COSMOS: Cross-Modality Self-Distillation for Vis... | 2024-12-02 | Code |
| 9 | GroupViT (RedCaps) | 27.5 | No | GroupViT: Semantic Segmentation Emerges from Tex... | 2022-02-22 | Code |
| 10 | TTD (MaskCLIP) | 26.5 | No | TTD: Text-Tag Self-Distillation Enhancing Image-... | 2024-03-30 | Code |
| 11 | MaskCLIP | 20.6 | No | Extract Free Dense Labels from CLIP | 2021-12-02 | Code |
| 12 | ReCo | 15.7 | No | ReCo: Retrieve and Co-segment for Zero-shot Tran... | 2022-06-14 | Code |
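
For readers unfamiliar with the metric, here is a minimal sketch of how mIoU is commonly computed from predicted and ground-truth label maps. It is an illustration under stated assumptions, not the evaluation protocol behind the scores above. The function name `mean_iou`, the `ignore_index=255` convention, and the choice to average only over classes that occur in either map are all assumptions. Individual benchmarks differ in how they treat background and absent classes.

```python
import numpy as np

def mean_iou(pred, target, num_classes, ignore_index=255):
    """Return mIoU in percent for integer label maps (hypothetical helper)."""
    pred = np.asarray(pred).ravel()
    target = np.asarray(target).ravel()

    # Drop pixels whose ground truth is marked "ignore" (assumed convention).
    valid = target != ignore_index
    pred, target = pred[valid], target[valid]

    # Confusion matrix: rows are ground-truth classes, columns are predictions.
    conf = np.bincount(
        target * num_classes + pred, minlength=num_classes ** 2
    ).reshape(num_classes, num_classes)

    intersection = np.diag(conf)
    union = conf.sum(axis=0) + conf.sum(axis=1) - intersection

    # Average only over classes that appear in the prediction or ground truth.
    present = union > 0
    if not present.any():
        return float("nan")
    return float((intersection[present] / union[present]).mean() * 100.0)
```

For example, `mean_iou([0, 0, 1, 1], [0, 1, 1, 1], num_classes=2)` averages a per-class IoU of 1/2 and 2/3. Published numbers are usually accumulated over a whole dataset rather than averaged per image, so a sketch like this is only suitable for sanity checks.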