Metric: PQ (Panoptic Quality; higher is better)
| # | Model | PQ | Extra Data | Paper | Date | Code |
|---|---|---|---|---|---|---|
| 1 | UMG-CLIP-E/14 | 31.6 | No | UMG-CLIP: A Unified Multi-Granularity Vision Gen... | 2024-01-12 | Code |
| 2 | PosSAM | 29.2 | No | PosSAM: Panoptic Open-vocabulary Segment Anything | 2024-03-14 | Code |
| 3 | UMG-CLIP-L/14 | 29.1 | No | UMG-CLIP: A Unified Multi-Granularity Vision Gen... | 2024-01-12 | Code |
| 4 | MAFT+ | 27.1 | No | Collaborative Vision-Text Representation Optimiz... | 2024-08-01 | Code |
| 5 | FC-CLIP | 26.8 | No | Convolutions Die Hard: Open-Vocabulary Segmentat... | 2023-08-04 | Code |
| 6 | CLIPSelf | 23.7 | No | CLIPSelf: Vision Transformer Distills Itself for... | 2023-10-02 | Code |
| 7 | ODISE (Caption) | 23.4 | No | Open-Vocabulary Panoptic Segmentation with Text-... | 2023-03-08 | Code |
| 8 | ODISE (Label) | 22.6 | No | Open-Vocabulary Panoptic Segmentation with Text-... | 2023-03-08 | Code |
| 9 | FreeSeg | 16.3 | No | FreeSeg: Unified, Universal and Open-Vocabulary ... | 2023-03-30 | - |
| 10 | MaskCLIP | 15.1 | No | Extract Free Dense Labels from CLIP | 2021-12-02 | Code |