Video on Refer-YouTube-VOS
Metric: F (higher is better)
Results
| # | Model | F | Extra Data | SOTA | Paper | Date | Code |
|---|-------|---|------------|------|-------|------|------|
| 1 | FindTrack | 75.7 | Yes | SOTA | Find First, Track Next: Decoupling Identification and Propagation in Referring Video Object Segmentation | 2025-03-05 | Yes |
| 2 | GLEE-Pro | 72.9 | Yes | SOTA | General Object Foundation Model for Images and Videos at Scale | 2023-12-14 | Yes |
| 3 | GLEE-Plus | 69.7 | Yes | | General Object Foundation Model for Images and Videos at Scale | 2023-12-14 | Yes |
| 4 | HTR | 68.9 | Yes | | Temporally Consistent Referring Video Object Segmentation with Hybrid Memory | 2024-03-28 | Yes |
| 5 | SOC | 67.9 | Yes | SOTA | SOC: Semantic-Assisted Object Cluster for Referring Video Object Segmentation | 2023-05-26 | Yes |
| 6 | VATEX | 67.5 | No | | Vision-Aware Text Features in Referring Image Segmentation: From Object Understanding to Context Understanding | 2024-04-12 | Yes |
| 7 | SgMg | 67.4 | Yes | | Spectrum-guided Multi-granularity Referring Video Object Segmentation | 2023-07-25 | Yes |
| 8 | VLT | 65.6 | Yes | SOTA | VLT: Vision-Language Transformer and Query Generation for Referring Segmentation | 2022-10-28 | Yes |
| 9 | HTML-SwinL | 65.3 | Yes | | - | - | - |
| 10 | HTML-Video-SwinB | 65.2 | Yes | | - | - | - |
| 11 | ReferFormer (Large) | 64.6 | Yes | SOTA | Language as Queries for Referring Video Object Segmentation | 2022-01-03 | Yes |
| 12 | HTML-Video-SwinT | 63 | Yes | | - | - | - |
| 13 | HTML-Video-SwinS | 62.9 | Yes | | - | - | - |
| 14 | R2VOS (Swin-T) | 61.5 | No | | Towards Robust Referring Video Object Segmentation with Cyclic Relational Consensus | 2022-07-04 | Yes |
| 15 | HTML-ResNet101 | 59.8 | Yes | | - | - | - |
| 16 | HTML-ResNet50 | 59 | Yes | | - | - | - |
| 17 | CMSA | 38.1 | Yes | SOTA | Cross-Modal Self-Attention Network for Referring Image Segmentation | 2019-04-09 | Yes |