Metric: mIoU (higher is better)
| # | Model | mIoU | Extra Data | Paper | Date | Code |
|---|---|---|---|---|---|---|
| 1 | CorrCLIP | 76.7 | No | CorrCLIP: Reconstructing Correlations in CLIP wi... | 2024-11-15 | Code |
| 2 | TextRegion | 73.1 | No | TextRegion: Text-Aligned Region Tokens from Froz... | 2025-05-29 | Code |
| 3 | Trident | 70.8 | No | Harnessing Vision Foundation Models for High-Per... | 2024-11-14 | Code |
| 4 | CLS-SEG | 68.7 | No | TagCLIP: A Local-to-Global Framework to Enhance ... | 2023-12-20 | Code |
| 5 | ProxyCLIP | 65 | No | ProxyCLIP: Proxy Attention Improves CLIP for Ope... | 2024-08-09 | Code |
| 6 | TTD (TCL) | 61.1 | No | TTD: Text-Tag Self-Distillation Enhancing Image-... | 2024-03-30 | Code |
| 7 | TCL | 55 | No | Learning to Generate Text-grounded Mask for Open... | 2022-12-01 | Code |
| 8 | TagAlign | 53.9 | No | TagAlign: Improving Vision-Language Alignment wi... | 2023-12-21 | Code |
| 9 | SegCLIP | 52.6 | No | SegCLIP: Patch Aggregation with Learnable Center... | 2022-11-27 | Code |
| 10 | TTD (MaskCLIP) | 43.1 | No | TTD: Text-Tag Self-Distillation Enhancing Image-... | 2024-03-30 | Code |
| 11 | MaskCLIP | 29.3 | No | Extract Free Dense Labels from CLIP | 2021-12-02 | Code |
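
For readers unfamiliar with the metric, the following is a minimal sketch of how mean Intersection-over-Union (mIoU) is commonly computed from per-pixel class labels. The leaderboard does not state its evaluation protocol, so the function name `mean_iou`, the `ignore_index=255` convention, the choice to skip classes absent from both prediction and ground truth, and the scaling to a percentage are all assumptions made for illustration, not the benchmark's official procedure.

```python
from typing import Sequence


def mean_iou(
    pred: Sequence[int],
    target: Sequence[int],
    num_classes: int,
    ignore_index: int = 255,  # assumed "void" label; not specified by the leaderboard
) -> float:
    """Return mIoU as a percentage from flattened per-pixel class labels."""
    if len(pred) != len(target):
        raise ValueError("pred and target must have the same length")

    intersection = [0] * num_classes  # pixels where prediction and label agree
    pred_count = [0] * num_classes    # pixels predicted as each class
    target_count = [0] * num_classes  # pixels labelled as each class

    for p, t in zip(pred, target):
        if t == ignore_index:
            continue  # ignored pixels contribute to neither count
        pred_count[p] += 1
        target_count[t] += 1
        if p == t:
            intersection[p] += 1

    ious = []
    for c in range(num_classes):
        union = pred_count[c] + target_count[c] - intersection[c]
        if union > 0:  # skip classes absent from both prediction and labels
            ious.append(intersection[c] / union)

    return 100.0 * sum(ious) / len(ious) if ious else 0.0


# Toy example: six pixels, three classes, last pixel ignored.
score = mean_iou([0, 0, 1, 1, 2, 2], [0, 1, 1, 1, 2, 255], num_classes=3)
print(f"{score:.1f}")
```

In this toy example the per-class IoUs are 1/2, 2/3, and 1, so the printed score is 72.2. Published evaluations typically accumulate the intersection and union counts over the whole dataset before dividing, rather than averaging per-image scores.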