Video Object Segmentation on Refer-YouTube-VOS
Metric: J (higher is better)
Results
| #  | Model               | J    | Extra Data | Paper                                               | Date       | Code |
|----|---------------------|------|------------|-----------------------------------------------------|------------|------|
| 1  | FindTrack           | 71.8 | Yes        | Find First, Track Next: Decoupling Identificatio... | 2025-03-05 | Code |
| 2  | GLEE-Pro            | 68.2 | Yes        | General Object Foundation Model for Images and V... | 2023-12-14 | Code |
| 3  | GLEE-Plus           | 65.6 | Yes        | General Object Foundation Model for Images and V... | 2023-12-14 | Code |
| 4  | HTR                 | 65.3 | Yes        | Temporally Consistent Referring Video Object Seg... | 2024-03-28 | Code |
| 5  | SOC                 | 64.1 | Yes        | SOC: Semantic-Assisted Object Cluster for Referr... | 2023-05-26 | Code |
| 6  | SgMg                | 63.9 | Yes        | Spectrum-guided Multi-granularity Referring Vide... | 2023-07-25 | Code |
| 7  | VATEX               | 63.3 | No         | Vision-Aware Text Features in Referring Image Se... | 2024-04-12 | Code |
| 8  | VLT                 | 61.9 | Yes        | VLT: Vision-Language Transformer and Query Gener... | 2022-10-28 | Code |
| 9  | HTML-SwinL          | 61.5 | Yes        | -                                                   | -          | -    |
| 10 | HTML-Video-SwinB    | 61.5 | Yes        | -                                                   | -          | -    |
| 11 | ReferFormer (Large) | 61.3 | Yes        | Language as Queries for Referring Video Object S... | 2022-01-03 | Code |
| 12 | HTML-Video-SwinS    | 59.9 | Yes        | -                                                   | -          | -    |
| 13 | HTML-Video-SwinT    | 59.5 | Yes        | -                                                   | -          | -    |
| 14 | R2VOS (Swin-T)      | 58.9 | No         | Towards Robust Referring Video Object Segmentati... | 2022-07-04 | Code |
| 15 | HTML-ResNet101      | 57.3 | Yes        | -                                                   | -          | -    |
| 16 | HTML-ResNet50       | 56.5 | Yes        | -                                                   | -          | -    |
| 17 | CMSA                | 34.8 | Yes        | Cross-Modal Self-Attention Network for Referring... | 2019-04-09 | Code |