Video on Refer-YouTube-VOS
Metric: J (higher is better)
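The page does not define J. In DAVIS-style video object segmentation benchmarks, J usually denotes region similarity, i.e. the Jaccard index (intersection over union) between predicted and ground-truth masks. Assuming that is the meaning here, a minimal sketch of the computation could look like the following. The function names `jaccard` and `mean_j`, the nested-list mask representation, and the empty-mask convention are illustrative assumptions, not the benchmark's official evaluation code, which may aggregate over objects and expressions differently.

```python
from typing import Sequence

# Assumed representation: a 2D binary mask, where 1 marks an object pixel.
Mask = Sequence[Sequence[int]]


def jaccard(pred: Mask, gt: Mask) -> float:
    """Intersection over union of two binary masks of equal shape."""
    inter = 0
    union = 0
    for pred_row, gt_row in zip(pred, gt):
        for p, g in zip(pred_row, gt_row):
            inter += 1 if (p and g) else 0
            union += 1 if (p or g) else 0
    # Assumed convention: two empty masks count as a perfect match.
    return 1.0 if union == 0 else inter / union


def mean_j(preds: Sequence[Mask], gts: Sequence[Mask]) -> float:
    """Average per-frame Jaccard over one video, on a 0-100 scale."""
    scores = [jaccard(p, g) for p, g in zip(preds, gts)]
    return 100.0 * sum(scores) / len(scores)


if __name__ == "__main__":
    # Frame 1: prediction covers one extra pixel; frame 2: exact match.
    preds = [[[1, 1], [0, 0]], [[0, 1], [0, 1]]]
    gts = [[[1, 0], [0, 0]], [[0, 1], [0, 1]]]
    print(mean_j(preds, gts))
```

Scaling to 0-100 matches how the scores on this page are reported, so higher values mean closer agreement between predicted and reference masks.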
Results
| # | Model | J | SOTA | Extra Data | Paper | Date | Code |
|---|-------|---|------|------------|-------|------|------|
| 1 | FindTrack | 71.8 | SOTA | Yes | Find First, Track Next: Decoupling Identification and Propagation in Referring Video Object Segmentation | 2025-03-05 | Code |
| 2 | GLEE-Pro | 68.2 | SOTA | Yes | General Object Foundation Model for Images and Videos at Scale | 2023-12-14 | Code |
| 3 | GLEE-Plus | 65.6 | - | Yes | General Object Foundation Model for Images and Videos at Scale | 2023-12-14 | Code |
| 4 | HTR | 65.3 | - | Yes | Temporally Consistent Referring Video Object Segmentation with Hybrid Memory | 2024-03-28 | Code |
| 5 | SOC | 64.1 | SOTA | Yes | SOC: Semantic-Assisted Object Cluster for Referring Video Object Segmentation | 2023-05-26 | Code |
| 6 | SgMg | 63.9 | - | Yes | Spectrum-guided Multi-granularity Referring Video Object Segmentation | 2023-07-25 | Code |
| 7 | VATEX | 63.3 | - | No | Vision-Aware Text Features in Referring Image Segmentation: From Object Understanding to Context Understanding | 2024-04-12 | Code |
| 8 | VLT | 61.9 | SOTA | Yes | VLT: Vision-Language Transformer and Query Generation for Referring Segmentation | 2022-10-28 | Code |
| 9 | HTML-SwinL | 61.5 | - | Yes | No paper | - | - |
| 10 | HTML-Video-SwinB | 61.5 | - | Yes | No paper | - | - |
| 11 | ReferFormer (Large) | 61.3 | SOTA | Yes | Language as Queries for Referring Video Object Segmentation | 2022-01-03 | Code |
| 12 | HTML-Video-SwinS | 59.9 | - | Yes | No paper | - | - |
| 13 | HTML-Video-SwinT | 59.5 | - | Yes | No paper | - | - |
| 14 | R2VOS (Swin-T) | 58.9 | - | No | Towards Robust Referring Video Object Segmentation with Cyclic Relational Consensus | 2022-07-04 | Code |
| 15 | HTML-ResNet101 | 57.3 | - | Yes | No paper | - | - |
| 16 | HTML-ResNet50 | 56.5 | - | Yes | No paper | - | - |
| 17 | CMSA | 34.8 | SOTA | Yes | Cross-Modal Self-Attention Network for Referring Image Segmentation | 2019-04-09 | Code |