Referring Expression Segmentation on A2D Sentences
Metric: IoU mean (higher is better)
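The page does not define the metric. As a minimal sketch, assuming "IoU mean" is the per-sample intersection over union between predicted and ground-truth binary masks, averaged over all evaluated samples, it could be computed as below. The names `mask_iou` and `mean_iou` are hypothetical helpers for illustration, and scoring two empty masks as 1.0 is an assumed convention; the official evaluation code of each paper may differ in such details.

```python
from typing import Sequence

# A binary mask as rows of 0/1 values (illustrative representation).
Mask = Sequence[Sequence[int]]


def mask_iou(pred: Mask, target: Mask) -> float:
    """Intersection over union of two binary masks of equal shape."""
    intersection = 0
    union = 0
    for pred_row, target_row in zip(pred, target):
        for p, t in zip(pred_row, target_row):
            intersection += 1 if (p and t) else 0
            union += 1 if (p or t) else 0
    # Assumed convention: two empty masks count as a perfect match.
    return intersection / union if union else 1.0


def mean_iou(preds: Sequence[Mask], targets: Sequence[Mask]) -> float:
    """Average the per-sample IoU over all (prediction, ground truth) pairs."""
    if len(preds) != len(targets) or not preds:
        raise ValueError("need at least one pair and equally many predictions and targets")
    return sum(mask_iou(p, t) for p, t in zip(preds, targets)) / len(preds)
```

For example, one sample with IoU 0.5 and one with IoU 1.0 give a mean of 0.75, which is on the same 0 to 1 scale as the scores in the table.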
Results
| # | Model | IoU mean | Extra Data | SOTA | Paper | Date | Code |
|---|---|---|---|---|---|---|---|
| 1 | SOC (Video-Swin-B) | 0.725 | Yes | SOTA | SOC: Semantic-Assisted Object Cluster for Referring Video Object Segmentation | 2023-05-26 | Yes |
| 2 | SgMg (Video-Swin-B) | 0.72 | Yes | - | Spectrum-guided Multi-granularity Referring Video Object Segmentation | 2023-07-25 | Yes |
| 3 | ReferFormer (Video-Swin-B) | 0.703 | Yes | SOTA | Language as Queries for Referring Video Object Segmentation | 2022-01-03 | Yes |
| 4 | SOC (Video-Swin-T) | 0.669 | No | - | SOC: Semantic-Assisted Object Cluster for Referring Video Object Segmentation | 2023-05-26 | Yes |
| 5 | ClawCraneNet | 0.655 | No | SOTA | ClawCraneNet: Leveraging Object-level Relation for Text-based Video Segmentation | 2021-03-19 | - |
| 6 | MTTR (w=10) | 0.64 | No | - | End-to-End Referring Video Object Segmentation with Multimodal Transformers | 2021-11-29 | Yes |
| 7 | MANET | 0.632 | No | - | Multi-Attention Network for Compressed Video Referring Object Segmentation | 2022-07-26 | Yes |
| 8 | MTTR (w=8) | 0.618 | No | - | End-to-End Referring Video Object Segmentation with Multimodal Transformers | 2021-11-29 | Yes |
| 9 | RefVOS | 0.599 | No | SOTA | RefVOS: A Closer Look at Referring Expressions for Video Object Segmentation | 2020-10-01 | Yes |
| 10 | VLIDE | 0.598 | No | - | Deeply Interleaved Two-Stream Encoder for Referring Video Segmentation | 2022-03-30 | - |
| 11 | Locater | 0.597 | No | - | Local-Global Context Aware Transformer for Language-Guided Video Segmentation | 2022-03-18 | Yes |
| 12 | CMPC-V (I3D) | 0.573 | No | - | Cross-Modal Progressive Comprehension for Referring Segmentation | 2021-05-15 | Yes |
| 13 | Hui et al. | 0.561 | No | - | Collaborative Spatial-Temporal Modeling for Language-Queried Video Actor Segmentation | 2021-05-14 | - |
| 14 | mmmmtbvs | 0.558 | No | - | Modeling Motion with Multi-Modal Features for Text-Based Video Segmentation | 2022-04-06 | Yes |
| 15 | AAMN | 0.552 | No | - | Actor and Action Modular Network for Text-based Video Segmentation | 2020-11-02 | - |
| 16 | CMDy | 0.531 | No | - | - | - | - |
| 17 | PRPE | 0.529 | No | - | - | - | - |
| 18 | HINet | 0.529 | No | - | - | - | - |
| 19 | CMPC-V (R2D) | 0.515 | No | - | Cross-Modal Progressive Comprehension for Referring Segmentation | 2021-05-15 | Yes |
| 20 | RefVOS | 0.497 | No | - | - | - | - |
| 21 | ACGA | 0.49 | No | - | - | - | Yes |
| 22 | VT-Capsule | 0.46 | No | - | - | - | - |
| 23 | CMSA+CFSA | 0.432 | No | - | Referring Segmentation in Images and Videos with Cross-Modal Self-Attention Network | 2021-02-09 | - |
| 24 | Gavrilyuk et al. (Optical flow) | 0.426 | No | SOTA | Actor and Action Video Segmentation from a Sentence | 2018-03-20 | Yes |
| 25 | Gavrilyuk et al. | 0.421 | No | - | Actor and Action Video Segmentation from a Sentence | 2018-03-20 | Yes |
| 26 | Li et al. | 0.354 | No | - | - | - | - |
| 27 | Hu et al. | 0.35 | No | SOTA | Segmentation from Natural Language Expressions | 2016-03-20 | Yes |