Metric: mIoU (higher is better)
| # | Model | mIoU | Extra Data | Paper | Date | Code |
|---|---|---|---|---|---|---|
| 1 | HyperSeg | 77.2 | Yes | HyperSeg: Towards Universal Visual Segmentation ... | 2024-11-26 | Code |
| 2 | ViT-P (OneFormer, InternImage-H) | 69.1 | No | The Missing Point in Vision Transformers for Uni... | 2025-05-26 | Code |
| 3 | OneFormer (InternImage-H, emb_dim=1024, single-scale) | 68.8 | No | OneFormer: One Transformer to Rule Universal Ima... | 2022-11-10 | Code |
| 4 | ViT-P (OneFormer, DiNAT-L) | 68.8 | No | The Missing Point in Vision Transformers for Uni... | 2025-05-26 | Code |
| 5 | OneFormer (DiNAT-L, single-scale) | 68.1 | No | OneFormer: One Transformer to Rule Universal Ima... | 2022-11-10 | Code |
| 6 | OneFormer (Swin-L, single-scale) | 67.4 | No | OneFormer: One Transformer to Rule Universal Ima... | 2022-11-10 | Code |
| 7 | Mask2Former (Swin-L, single-scale) | 67.4 | No | Masked-attention Mask Transformer for Universal ... | 2021-12-02 | Code |
| 8 | MaskFormer (Swin-L, single-scale) | 64.8 | No | Masked-attention Mask Transformer for Universal ... | 2021-12-02 | Code |
| 9 | SegCLIP | 26.5 | No | SegCLIP: Patch Aggregation with Learnable Center... | 2022-11-27 | Code |
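
For readers unfamiliar with the metric, the following is a minimal sketch of how mean intersection-over-union (mIoU) is commonly computed from per-pixel class labels. It is not taken from any of the listed papers: the function name `mean_iou`, the flat-list input format, and the choice to skip classes absent from both prediction and ground truth are illustrative assumptions. Benchmark implementations usually accumulate a confusion matrix over the whole dataset rather than scoring one image at a time, and the scores in the table appear to be reported on a 0-100 scale.

```python
def mean_iou(pred, target, num_classes):
    """Mean IoU over classes, given flat sequences of integer class labels.

    Hypothetical helper for illustration; classes that appear in neither
    `pred` nor `target` are skipped (an assumption, conventions vary).
    """
    if len(pred) != len(target):
        raise ValueError("pred and target must have the same length")

    ious = []
    for c in range(num_classes):
        # Pixels labelled c in both prediction and ground truth.
        intersection = sum(1 for p, t in zip(pred, target) if p == c and t == c)
        # Pixels labelled c in either prediction or ground truth.
        union = sum(1 for p, t in zip(pred, target) if p == c or t == c)
        if union > 0:
            ious.append(intersection / union)

    # Average the per-class IoU values; return 0.0 if no class was present.
    return sum(ious) / len(ious) if ious else 0.0


# Toy example with three classes and six pixels.
pred = [0, 0, 1, 1, 2, 2]
target = [0, 1, 1, 1, 2, 0]
score = mean_iou(pred, target, num_classes=3)
print(f"mIoU: {100 * score:.1f}")
```

In the toy example the per-class IoU values are 1/3, 2/3, and 1/2, so the mean is 0.5, which corresponds to 50.0 on the scale used in the table.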