Referring Expression Segmentation on RefCOCO+ val
Metric: Overall IoU (higher is better)
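The page does not spell out how the metric is computed. In the referring segmentation literature, overall IoU is commonly taken to be the total intersection area divided by the total union area, accumulated over every test sample (rather than a mean of per-sample IoUs). The sketch below follows that assumed definition; the function name `overall_iou`, the list-of-lists mask layout, the empty-union convention, and the toy masks are illustrative and are not taken from any leaderboard entry.

```python
from typing import Iterable, Sequence, Tuple

Mask = Sequence[Sequence[int]]  # 2D binary mask, 1 = foreground


def overall_iou(pairs: Iterable[Tuple[Mask, Mask]]) -> float:
    """Cumulative intersection over cumulative union, as a percentage."""
    total_inter = 0
    total_union = 0
    for pred, gt in pairs:
        for pred_row, gt_row in zip(pred, gt):
            for p, g in zip(pred_row, gt_row):
                total_inter += 1 if (p and g) else 0
                total_union += 1 if (p or g) else 0
    # Assumed convention: an empty union (no foreground anywhere) scores 0.
    return 100.0 * total_inter / total_union if total_union else 0.0


# Two toy 2x2 (prediction, ground truth) pairs; hypothetical data.
pairs = [
    ([[1, 1], [0, 0]], [[1, 0], [0, 0]]),  # intersection 1, union 2
    ([[0, 0], [1, 1]], [[0, 0], [1, 1]]),  # intersection 2, union 2
]
score = overall_iou(pairs)
print(f"{score:.2f}")  # 3 / 4 of the pooled union is covered -> 75.00
```

Because intersections and unions are pooled before dividing, large objects weigh more than small ones under this definition. Some papers report a per-sample mean IoU instead, so scores from different sources are only comparable when they use the same variant.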
Results
| # | Model | Overall IoU | Extra Data | Paper | Date | Code |
|---|-------|-------------|------------|-------|------|------|
| 1 | MLCD-Seg-7B | 79.4 | Yes | Multi-label Cluster Discrimination for Visual Re... | 2024-07-24 | Code |
| 2 | DeRIS-L | 79.01 | No | DeRIS: Decoupling Perception and Cognition for E... | 2025-07-02 | Code |
| 3 | HyperSeg | 79 | Yes | HyperSeg: Towards Universal Visual Segmentation ... | 2024-11-26 | Code |
| 4 | EVF-SAM | 76.5 | Yes | EVF-SAM: Early Vision-Language Fusion for Text-P... | 2024-06-28 | Code |
| 5 | DETRIS | 75.2 | No | Densely Connected Parameter-Efficient Tuning for... | 2025-01-15 | Code |
| 6 | C3VG | 74.68 | No | Multi-task Visual Grounding with Coarse-to-Fine ... | 2025-01-12 | Code |
| 7 | HIPIE | 73.9 | Yes | Hierarchical Open-vocabulary Universal Image Seg... | 2023-07-03 | Code |
| 8 | UniLSeg-100 | 73.18 | Yes | Universal Segmentation at Arbitrary Granularity ... | 2023-12-04 | Code |
| 9 | UniLSeg-20 | 72.7 | Yes | Universal Segmentation at Arbitrary Granularity ... | 2023-12-04 | Code |
| 10 | SegAgent | 72.49 | No | SegAgent: Exploring Pixel Understanding Capabili... | 2025-03-11 | Code |
| 11 | UNINEXT-H | 72.47 | Yes | Universal Instance Perception as Object Discover... | 2023-03-12 | Code |
| 12 | SafaRi-B | 70.78 | No | SafaRi:Adaptive Sequence Transformer for Weakly ... | 2024-07-02 | - |
| 13 | GROUNDHOG | 70.5 | Yes | GROUNDHOG: Grounding Large Language Models to Ho... | 2024-02-26 | - |
| 14 | MaskRIS (Swin-B, combined DB) | 70.26 | No | MaskRIS: Semantic Distortion-aware Data Augmenta... | 2024-11-28 | Code |
| 15 | GLEE-Pro | 69.6 | Yes | General Object Foundation Model for Images and V... | 2023-12-14 | Code |
| 16 | PolyFormer-L | 69.33 | Yes | PolyFormer: Referring Image Segmentation as Sequ... | 2023-02-14 | Code |
| 17 | PolyFormer-B | 67.64 | Yes | PolyFormer: Referring Image Segmentation as Sequ... | 2023-02-14 | Code |
| 18 | MaskRIS (Swin-B) | 67.54 | No | MaskRIS: Semantic Distortion-aware Data Augmenta... | 2024-11-28 | Code |
| 19 | MagNet | 66.16 | No | Mask Grounding for Referring Image Segmentation | 2023-12-19 | Code |
| 20 | ReLA | 66.04 | No | GRES: Generalized Referring Expression Segmentat... | 2023-06-01 | Code |
| 21 | VLT | 63.53 | No | VLT: Vision-Language Transformer and Query Gener... | 2022-10-28 | Code |
| 22 | CRIS | 62.27 | No | CRIS: CLIP-Driven Referring Image Segmentation | 2021-11-30 | Code |
| 23 | MaIL | 62.23 | No | MaIL: A Unified Mask-Image-Language Trimodal Net... | 2021-11-21 | - |
| 24 | LAVT | 62.14 | No | LAVT: Language-Aware Vision Transformer for Refe... | 2021-12-04 | Code |
| 25 | VLT | 55.5 | No | Vision-Language Transformer and Query Generation... | 2021-08-12 | Code |
| 26 | SHNet | 52.75 | No | Comprehensive Multi-Modal Interactions for Refer... | 2021-04-21 | Code |
| 27 | CPMC | 49.56 | No | Referring Image Segmentation via Cross-Modal Pro... | 2020-10-01 | Code |
| 28 | BRINet | 48.57 | No | - | - | - |
| 29 | STEP (5-fold) | 48.18 | No | - | - | - |
| 30 | MattNet | 46.67 | No | MAttNet: Modular Attention Network for Referring... | 2018-01-24 | Code |
| 31 | RefVOS with BERT + MLM loss | 44.71 | No | RefVOS: A Closer Look at Referring Expressions f... | 2020-10-01 | Code |
| 32 | CMSA | 43.76 | No | Cross-Modal Self-Attention Network for Referring... | 2019-04-09 | Code |