Video Object Segmentation on MeViS
Metric: J&F (higher is better)
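For readers unfamiliar with the metric: J&F is conventionally the arithmetic mean of region similarity J (mask intersection-over-union) and contour accuracy F (a boundary F-measure). The sketch below is illustrative only and is not the benchmark's official evaluation code. The helper names `region_similarity` and `j_and_f` are hypothetical, the boundary F-measure is not implemented (an F value is simply passed in), and treating two empty masks as a perfect match is an assumed convention.

```python
def region_similarity(pred, gt):
    """J: intersection-over-union of two binary masks (nested lists of 0/1)."""
    inter = 0
    union = 0
    for pred_row, gt_row in zip(pred, gt):
        for p, g in zip(pred_row, gt_row):
            if p and g:
                inter += 1
            if p or g:
                union += 1
    # Assumed convention: two empty masks count as a perfect match.
    return 1.0 if union == 0 else inter / union


def j_and_f(j_score, f_score):
    """J&F: arithmetic mean of region similarity J and contour accuracy F."""
    return (j_score + f_score) / 2.0


# Toy 2x3 masks: 2 overlapping pixels, 4 pixels in the union.
pred = [[1, 1, 0],
        [0, 1, 0]]
gt = [[1, 0, 0],
      [0, 1, 1]]

j = region_similarity(pred, gt)   # 2 / 4 = 0.5
f = 0.7                           # placeholder F value, not computed here
score = j_and_f(j, f)
```

Leaderboard values such as 53.7 appear to be on a 0 to 100 scale, so a per-dataset average computed this way would be multiplied by 100 before comparison.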
Results
| # | Model | J&F | Extra Data | Paper | Date | Code | SOTA |
|---|-------|-----|------------|-------|------|------|------|
| 1 | MPG-SAM 2 | 53.7 | No | MPG-SAM 2: Adapting SAM 2 with Mask Priors and Global Context for Referring Video Object Segmentation | 2025-01-23 | Yes | SOTA |
| 2 | FindTrack | 53.2 | No | Find First, Track Next: Decoupling Identification and Propagation in Referring Video Object Segmentation | 2025-03-05 | Yes | |
| 3 | GLUS | 51.3 | No | GLUS: Global-Local Reasoning Unified into A Single Large Language Model for Video Segmentation | 2025-04-10 | Yes | |
| 4 | VRS-HQ (Chat-UniVi-13B) | 50.9 | No | The Devil is in Temporal Token: High Quality Video Reasoning Segmentation | 2025-01-15 | Yes | SOTA |
| 5 | ReferDINO (Swin-B) | 49.3 | No | ReferDINO: Referring Video Object Segmentation with Visual Grounding Foundations | 2025-01-24 | - | |
| 6 | SAMWISE | 48.3 | No | SAMWISE: Infusing Wisdom in SAM2 for Text-Driven Video Segmentation | 2024-11-26 | Yes | SOTA |
| 7 | DsHmp + MTCM | 47.6 | No | Multi-Context Temporal Consistent Modeling for Referring Video Object Segmentation | 2025-01-09 | Yes | |
| 8 | DsHmp | 46.4 | No | Decoupling Static and Hierarchical Motion Perception for Referring Video Segmentation | 2024-04-04 | Yes | SOTA |
| 9 | HTR | 42.7 | No | Temporally Consistent Referring Video Object Segmentation with Hybrid Memory | 2024-03-28 | Yes | SOTA |
| 10 | LMPM | 37.2 | No | MeViS: A Large-scale Benchmark for Video Segmentation with Motion Expressions | 2023-08-16 | Yes | SOTA |
| 11 | VLT+TC | 35.5 | No | VLT: Vision-Language Transformer and Query Generation for Referring Segmentation | 2022-10-28 | Yes | SOTA |
| 12 | InternVideo2.5 | 32 | No | InternVideo2.5: Empowering Video MLLMs with Long and Rich Context Modeling | 2025-01-21 | Yes | |
| 13 | ReferFormer | 31 | No | Language as Queries for Referring Video Object Segmentation | 2022-01-03 | Yes | SOTA |
| 14 | MTTR | 30 | No | End-to-End Referring Video Object Segmentation with Multimodal Transformers | 2021-11-29 | Yes | SOTA |
| 15 | LBDT | 29.3 | No | Language-Bridged Spatial-Temporal Interaction for Referring Video Object Segmentation | 2022-06-08 | Yes | |
| 16 | URVOS | 27.8 | No | - | - | Yes | |