Referring Expression Segmentation on RefCOCOg-val
Metric: Overall IoU (higher is better)
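The page names the metric but does not define it. As a minimal sketch, assuming the definition commonly used in referring-segmentation work (intersection pixels summed over all samples, divided by union pixels summed over all samples, reported in percent), overall IoU could be computed as below. The function name `overall_iou` and the toy masks are hypothetical and are not taken from any listed model's evaluation code.

```python
from typing import Iterable, Sequence, Tuple

# A binary mask as rows of 0/1 values (illustrative representation only).
Mask = Sequence[Sequence[int]]


def overall_iou(pairs: Iterable[Tuple[Mask, Mask]]) -> float:
    """Return overall IoU in percent.

    Assumed definition: intersection and union pixel counts are accumulated
    over the whole evaluation set before dividing, so samples with large
    objects weigh more than samples with small objects.
    """
    total_inter = 0
    total_union = 0
    for pred, target in pairs:
        for pred_row, target_row in zip(pred, target):
            for p, t in zip(pred_row, target_row):
                total_inter += 1 if (p and t) else 0
                total_union += 1 if (p or t) else 0
    if total_union == 0:
        return 0.0  # no foreground anywhere; avoid division by zero
    return 100.0 * total_inter / total_union


# Toy example: one small, half-correct prediction and one large, perfect one.
pred_a, gt_a = [[1, 1], [0, 0]], [[1, 0], [0, 0]]  # intersection 1, union 2
pred_b, gt_b = [[1, 1], [1, 1]], [[1, 1], [1, 1]]  # intersection 4, union 4
score = overall_iou([(pred_a, gt_a), (pred_b, gt_b)])
print(round(score, 2))  # 5 / 6 of the union is covered, about 83.33
```

Under this assumption the result differs from a per-sample mean IoU, which would give 75.0 for the same two samples, so scores are only comparable when they were computed the same way.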
Results
| # | Model | Overall IoU | Extra Data | SOTA | Paper | Date | Code |
|---|---|---|---|---|---|---|---|
| 1 | MLCD-Seg-7B | 79.9 | Yes | SOTA | Multi-label Cluster Discrimination for Visual Representation Learning | 2024-07-24 | Code |
| 2 | HyperSeg | 79.4 | Yes | | HyperSeg: Towards Universal Visual Segmentation with Large Language Model | 2024-11-26 | Code |
| 3 | UniLSeg-100 | 79.27 | Yes | SOTA | Universal Segmentation at Arbitrary Granularity with Language Instruction | 2023-12-04 | Code |
| 4 | UniLSeg-20 | 78.41 | Yes | | Universal Segmentation at Arbitrary Granularity with Language Instruction | 2023-12-04 | Code |
| 5 | EVF-SAM | 78.2 | Yes | | EVF-SAM: Early Vision-Language Fusion for Text-Prompted Segment Anything Model | 2024-06-28 | Code |
| 6 | SegAgent | 75.11 | No | | SegAgent: Exploring Pixel Understanding Capabilities in MLLMs by Imitating Human Annotator Trajectories | 2025-03-11 | Code |
| 7 | DETRIS | 74.6 | No | | Densely Connected Parameter-Efficient Tuning for Referring Image Segmentation | 2025-01-15 | Code |
| 8 | C3VG | 74.43 | No | | Multi-task Visual Grounding with Coarse-to-Fine Consistency Constraints | 2025-01-12 | Code |
| 9 | GROUNDHOG | 74.1 | Yes | | GROUNDHOG: Grounding Large Language Models to Holistic Segmentation | 2024-02-26 | - |
| 10 | GLEE-Pro | 72.9 | Yes | | General Object Foundation Model for Images and Videos at Scale | 2023-12-14 | Code |
| 11 | SafaRi-B | 70.48 | Yes | | SafaRi:Adaptive Sequence Transformer for Weakly Supervised Referring Expression Segmentation | 2024-07-02 | - |
| 12 | PolyFormer-L | 69.2 | Yes | SOTA | PolyFormer: Referring Image Segmentation as Sequential Polygon Generation | 2023-02-14 | Code |
| 13 | MaskRIS (Swin-B, combined DB) | 69.12 | No | | MaskRIS: Semantic Distortion-aware Data Augmentation for Referring Image Segmentation | 2024-11-28 | Code |
| 14 | PolyFormer-B | 67.76 | Yes | | PolyFormer: Referring Image Segmentation as Sequential Polygon Generation | 2023-02-14 | Code |
| 15 | MaskRIS (Swin-B) | 65.55 | No | | MaskRIS: Semantic Distortion-aware Data Augmentation for Referring Image Segmentation | 2024-11-28 | Code |
| 16 | MagNet | 65.36 | No | | Mask Grounding for Referring Image Segmentation | 2023-12-19 | Code |
| 17 | X-Decoder (Davit-d5) | 64.6 | Yes | SOTA | Generalized Decoding for Pixel, Image, and Language | 2022-12-21 | Code |
| 18 | VLT (Swin-B) | 63.49 | No | SOTA | VLT: Vision-Language Transformer and Query Generation for Referring Segmentation | 2022-10-28 | Code |
| 19 | LAVT | 61.24 | No | SOTA | LAVT: Language-Aware Vision Transformer for Referring Image Segmentation | 2021-12-04 | Code |
| 20 | VLT (Darknet53) | 52.99 | No | SOTA | Vision-Language Transformer and Query Generation for Referring Segmentation | 2021-08-12 | Code |
| 21 | SHNet | 49.9 | No | SOTA | Comprehensive Multi-Modal Interactions for Referring Image Segmentation | 2021-04-21 | Code |