Metric: cIoU (cumulative IoU; higher is better)
| # | Model | cIoU | Extra Data | Paper | Date | Code |
|---|---|---|---|---|---|---|
| 1 | DeRIS-L | 72 | No | DeRIS: Decoupling Perception and Cognition for E... | 2025-07-02 | Code |
| 2 | GSVA-Llama2-13B | 66.38 | Yes | GSVA: Generalized Segmentation via Multimodal La... | 2023-12-15 | Code |
| 3 | MABP | 65.69 | No | Bring Adaptive Binding Prototypes to Generalized... | 2024-05-24 | Code |
| 4 | HDC | 65.42 | No | CoHD: A Counting-Aware Hierarchical Decoding Fra... | 2024-05-24 | Code |
| 5 | GSVA-Vicuna-13B-v1.1 | 64.05 | Yes | GSVA: Generalized Segmentation via Multimodal La... | 2023-12-15 | Code |
| 6 | GSVA-Vicuna-7B-v1.1 | 63.29 | Yes | GSVA: Generalized Segmentation via Multimodal La... | 2023-12-15 | Code |
| 7 | ReLA | 62.42 | No | GRES: Generalized Referring Expression Segmentat... | 2023-06-01 | Code |
| 8 | LAVT | 57.64 | No | LAVT: Language-Aware Vision Transformer for Refe... | 2021-12-04 | Code |
| 9 | CRIS | 55.34 | No | CRIS: CLIP-Driven Referring Image Segmentation | 2021-11-30 | Code |
| 10 | VLT | 52.51 | No | Vision-Language Transformer and Query Generation... | 2021-08-12 | Code |
| 11 | LTS | 52.3 | No | Locate then Segment: A Strong Pipeline for Refer... | 2021-03-30 | - |
| 12 | MAttNet | 47.51 | No | MAttNet: Modular Attention Network for Referring... | 2018-01-24 | Code |