Referring Expression Segmentation on A2D Sentences
Metric: Precision@0.5 (higher is better)
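The page does not define the metric. As a rough sketch, Precision@0.5 is commonly computed as the fraction of test samples whose predicted mask has an intersection-over-union (IoU) with the ground-truth mask above 0.5. The function names `mask_iou` and `precision_at`, the nested-list mask representation, and the handling of two empty masks are assumptions made for illustration, not the benchmark's official evaluation code.

```python
def mask_iou(pred, gt):
    """IoU between two binary masks given as equally shaped nested lists of 0/1."""
    inter = 0
    union = 0
    for pred_row, gt_row in zip(pred, gt):
        for p, g in zip(pred_row, gt_row):
            if p and g:
                inter += 1
            if p or g:
                union += 1
    # Assumption: two empty masks are treated as a perfect match.
    return inter / union if union else 1.0


def precision_at(preds, gts, threshold=0.5):
    """Fraction of samples whose IoU is strictly above `threshold`."""
    if len(preds) != len(gts) or not preds:
        raise ValueError("preds and gts must be non-empty and of equal length")
    hits = sum(1 for p, g in zip(preds, gts) if mask_iou(p, g) > threshold)
    return hits / len(preds)


# Toy 2x2 masks (hypothetical data, not taken from A2D Sentences).
gt = [[1, 1], [0, 0]]
pred_a = [[1, 1], [0, 0]]  # IoU = 1.0
pred_b = [[1, 0], [0, 0]]  # IoU = 0.5, not strictly above the threshold
pred_c = [[1, 1], [1, 0]]  # IoU = 2/3
pred_d = [[0, 0], [1, 1]]  # IoU = 0.0

score = precision_at([pred_a, pred_b, pred_c, pred_d], [gt] * 4)
print(score)  # 0.5
```

Under this reading, a leaderboard value of 0.851 would mean that about 85% of test samples exceed the 0.5 IoU threshold. Whether the comparison is strict and how per-frame results are aggregated may differ between evaluation scripts.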
Results
| # | Model | Precision@0.5 | Extra Data | SOTA tag | Paper | Date | Code |
|---|---|---|---|---|---|---|---|
| 1 | SOC (Video-Swin-B) | 0.851 | Yes | Yes | SOC: Semantic-Assisted Object Cluster for Referring Video Object Segmentation | 2023-05-26 | Yes |
| 2 | SgMg (Video-Swin-B) | 0.843 | Yes | No | Spectrum-guided Multi-granularity Referring Video Object Segmentation | 2023-07-25 | Yes |
| 3 | ReferFormer (Video-Swin-B) | 0.831 | Yes | Yes | Language as Queries for Referring Video Object Segmentation | 2022-01-03 | Yes |
| 4 | SOC (Video-Swin-T) | 0.79 | No | No | SOC: Semantic-Assisted Object Cluster for Referring Video Object Segmentation | 2023-05-26 | Yes |
| 5 | MTTR (w=10) | 0.754 | No | Yes | End-to-End Referring Video Object Segmentation with Multimodal Transformers | 2021-11-29 | Yes |
| 6 | MANET | 0.734 | No | No | Multi-Attention Network for Compressed Video Referring Object Segmentation | 2022-07-26 | Yes |
| 7 | MTTR (w=8) | 0.721 | No | No | End-to-End Referring Video Object Segmentation with Multimodal Transformers | 2021-11-29 | Yes |
| 8 | Locater | 0.709 | No | No | Local-Global Context Aware Transformer for Language-Guided Video Segmentation | 2022-03-18 | Yes |
| 9 | ClawCraneNet | 0.704 | No | Yes | ClawCraneNet: Leveraging Object-level Relation for Text-based Video Segmentation | 2021-03-19 | No |
| 10 | VLIDE | 0.702 | No | No | Deeply Interleaved Two-Stream Encoder for Referring Video Segmentation | 2022-03-30 | No |
| 11 | AAMN | 0.681 | No | Yes | Actor and Action Modular Network for Text-based Video Segmentation | 2020-11-02 | No |
| 12 | CMPC-V (I3D) | 0.655 | No | No | Cross-Modal Progressive Comprehension for Referring Segmentation | 2021-05-15 | Yes |
| 13 | Hui et al. | 0.654 | No | No | Collaborative Spatial-Temporal Modeling for Language-Queried Video Actor Segmentation | 2021-05-14 | No |
| 14 | mmmmtbvs | 0.645 | No | No | Modeling Motion with Multi-Modal Features for Text-Based Video Segmentation | 2022-04-06 | Yes |
| 15 | PRPE | 0.634 | No | No | No paper | - | No |
| 16 | HINet | 0.611 | No | No | No paper | - | No |
| 17 | CMDy | 0.607 | No | No | No paper | - | No |
| 18 | CMPC-V (R2D) | 0.59 | No | No | Cross-Modal Progressive Comprehension for Referring Segmentation | 2021-05-15 | Yes |
| 19 | RefVOS | 0.578 | No | No | No paper | - | No |
| 20 | ACGA | 0.557 | No | No | No paper | - | Yes |
| 21 | VT-Capsule | 0.526 | No | No | No paper | - | No |
| 22 | Gavrilyuk et al. (Optical flow) | 0.5 | No | Yes | Actor and Action Video Segmentation from a Sentence | 2018-03-20 | Yes |
| 23 | RefVOS | 0.495 | No | No | RefVOS: A Closer Look at Referring Expressions for Video Object Segmentation | 2020-10-01 | Yes |
| 24 | CMSA+CFSA | 0.487 | No | No | Referring Segmentation in Images and Videos with Cross-Modal Self-Attention Network | 2021-02-09 | No |
| 25 | Gavrilyuk et al. | 0.475 | No | No | Actor and Action Video Segmentation from a Sentence | 2018-03-20 | Yes |
| 26 | Li et al. | 0.387 | No | No | No paper | - | No |
| 27 | Hu et al. | 0.348 | No | Yes | Segmentation from Natural Language Expressions | 2016-03-20 | Yes |