Metric: mIoU (mean Intersection over Union; higher is better)
| Rank | Model | mIoU | Extra Data | Paper | Date | Code |
|---|---|---|---|---|---|---|
| 1 | UMG-CLIP-E/14 | 69.7 | Yes | UMG-CLIP: A Unified Multi-Granularity Vision Gen... | 2024-01-12 | Code |
| 2 | UMG-CLIP-L/14 | 68.9 | Yes | UMG-CLIP: A Unified Multi-Granularity Vision Gen... | 2024-01-12 | Code |
| 3 | OneFormer (InternImage-H, single-scale) | 68.8 | No | OneFormer: One Transformer to Rule Universal Ima... | 2022-11-10 | Code |
| 4 | DiNAT-L (single-scale, Mask2Former) | 68.3 | No | Dilated Neighborhood Attention Transformer | 2022-09-29 | Code |
| 5 | OneFormer (DiNAT-L, single-scale) | 68.1 | No | OneFormer: One Transformer to Rule Universal Ima... | 2022-11-10 | Code |
| 6 | OneFormer (Swin-L, single-scale) | 67.4 | No | OneFormer: One Transformer to Rule Universal Ima... | 2022-11-10 | Code |
| 7 | HIPIE (ViT-H, single-scale) | 66.8 | Yes | Hierarchical Open-vocabulary Universal Image Seg... | 2023-07-03 | Code |
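For readers unfamiliar with the metric, the following is a minimal sketch of how mIoU is commonly computed: per-class intersection over union, averaged over the classes that occur. It is not taken from any of the listed papers. The function name `mean_iou`, the flat label lists, the `ignore_index` parameter, and the scaling to a percentage (to match the scale of the scores in the table) are all assumptions made for illustration. Benchmark toolkits usually accumulate the counts over the whole dataset before dividing, which this per-call sketch does not do.

```python
def mean_iou(pred, target, num_classes, ignore_index=None):
    """Mean IoU (in percent) over classes that occur in pred or target.

    pred and target are flat sequences of integer class labels of equal
    length. This is an illustrative helper, not a benchmark's official code.
    """
    intersection = [0] * num_classes
    union = [0] * num_classes
    for p, t in zip(pred, target):
        if t == ignore_index:
            continue  # skip pixels marked as unlabeled
        if p == t:
            # Correct pixel: counts once toward both intersection and union.
            intersection[t] += 1
            union[t] += 1
        else:
            # Wrong pixel: enlarges the union of the predicted class
            # and of the ground-truth class, but no intersection.
            union[p] += 1
            union[t] += 1
    # Average only over classes that actually appear.
    ious = [i / u for i, u in zip(intersection, union) if u > 0]
    return 100.0 * sum(ious) / len(ious) if ious else float("nan")


# Toy example with two classes and four pixels.
score = mean_iou([0, 0, 1, 1], [0, 1, 1, 1], num_classes=2)
print(round(score, 1))
```

In the toy example, class 0 has IoU 1/2 and class 1 has IoU 2/3, so the mean is about 58.3.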