Video Instance Segmentation on OVIS validation
Metric: mask AP (higher is better)
Results
| Rank | Model | mask AP | Extra training data | Paper | Date | Code |
|---|---|---|---|---|---|---|
| 1 | DVIS-DAQ(VIT-L, Offline) | 57.1 | Yes | DVIS-DAQ: Improving Video Segmentation via Dynam... | 2024-03-29 | Code |
| 2 | CAVIS(VIT-L, Offline) | 57.1 | Yes | Context-Aware Video Instance Segmentation | 2024-07-03 | Code |
| 3 | DVIS++(VIT-L,Offline) | 53.4 | Yes | DVIS++: Improved Decoupled Framework for Univers... | 2023-12-20 | Code |
| 4 | GLEE-Pro | 50.4 | Yes | General Object Foundation Model for Images and V... | 2023-12-14 | Code |
| 5 | DVIS(Swin-L, Offline) | 49.9 | No | DVIS: Decoupled Video Instance Segmentation Fram... | 2023-06-06 | Code |
| 6 | DVIS++(VIT-L, Online) | 49.6 | Yes | DVIS++: Improved Decoupled Framework for Univers... | 2023-12-20 | Code |
| 7 | UNINEXT (ViT-H, Online) | 49 | Yes | Universal Instance Perception as Object Discover... | 2023-03-12 | Code |
| 8 | DVIS(Swin-L, Online) | 47.1 | No | DVIS: Decoupled Video Instance Segmentation Fram... | 2023-06-06 | Code |
| 9 | CTVIS (Swin-L) | 46.9 | Yes | CTVIS: Consistent Training for Online Video Inst... | 2023-07-24 | Code |
| 10 | RefineVIS (Swin-L, offline) | 46 | Yes | RefineVIS: Video Instance Segmentation with Temp... | 2023-06-07 | - |
| 11 | GRAtt-VIS (Swin-L) | 45.7 | Yes | GRAtt-VIS: Gated Residual Attention for Auto Rec... | 2023-05-26 | Code |
| 12 | GenVIS (Swin-L) | 45.4 | Yes | A Generalized Framework for Video Instance Segme... | 2022-11-16 | Code |
| 13 | NOVIS (Swin-L) | 43.5 | Yes | NOVIS: A Case for End-to-End Near-Online Video I... | 2023-08-29 | - |
| 14 | TarViS (Swin-L) | 43.2 | Yes | TarViS: A Unified Approach for Target-based Vide... | 2023-01-06 | Code |
| 15 | MDQE(SwinL) | 42.6 | No | MDQE: Mining Discriminative Query Embeddings to ... | 2023-03-25 | Code |
| 16 | IDOL (Swin-L) | 42.6 | No | In Defense of Online Models for Video Instance S... | 2022-07-21 | Code |
| 17 | ROVIS (Swin-L) | 42.6 | No | Robust Online Video Instance Segmentation with T... | 2022-11-16 | Code |
| 18 | UniVS(Swin-L) | 41.7 | Yes | UniVS: Unified and Universal Video Segmentation ... | 2024-02-28 | Code |
| 19 | DVIS++(R50, Offline) | 41.2 | Yes | DVIS++: Improved Decoupled Framework for Univers... | 2023-12-20 | Code |
| 20 | BoxVIS(Swin-L & Box-sup) | 40.6 | No | BoxVIS: Video Instance Segmentation with Box Ann... | 2023-03-26 | Code |
| 21 | MinVIS (Swin-L) | 39.4 | No | MinVIS: A Minimal Video Instance Segmentation Fr... | 2022-08-03 | Code |
| 22 | DVIS++(R50, Online) | 37.2 | Yes | DVIS++: Improved Decoupled Framework for Univers... | 2023-12-20 | Code |
| 23 | GRAtt-VIS (ResNet-50) | 36.2 | Yes | GRAtt-VIS: Gated Residual Attention for Auto Rec... | 2023-05-26 | Code |
| 24 | CTVIS (ResNet-50) | 35.5 | Yes | CTVIS: Consistent Training for Online Video Inst... | 2023-07-24 | Code |
| 25 | DeVIS (Swin-L) | 35.5 | No | DeVIS: Making Deformable Transformers Work for V... | 2022-07-22 | Code |
| 26 | UNINEXT (ResNet-50, Online) | 34 | Yes | Universal Instance Perception as Object Discover... | 2023-03-12 | Code |
| 27 | TarViS (Swin-T) | 34 | Yes | TarViS: A Unified Approach for Target-based Vide... | 2023-01-06 | Code |
| 28 | NOVIS (ResNet-50) | 32.7 | Yes | NOVIS: A Case for End-to-End Near-Online Video I... | 2023-08-29 | - |
| 29 | TarViS (ResNet-50) | 31.1 | Yes | TarViS: A Unified Approach for Target-based Vide... | 2023-01-06 | Code |
| 30 | IDOL (ResNet-50) | 30.2 | No | In Defense of Online Models for Video Instance S... | 2022-07-21 | Code |
| 31 | Tube-Link(ResNet-50) | 29.5 | No | Tube-Link: A Flexible Cross Tube Framework for U... | 2023-03-22 | Code |
| 32 | VITA (Swin-L) | 27.7 | Yes | VITA: Video Instance Segmentation via Object Tok... | 2022-06-09 | Code |
| 33 | DeVIS (ResNet-50) | 23.7 | No | DeVIS: Making Deformable Transformers Work for V... | 2022-07-22 | Code |
| 34 | InstanceFormer (Swin-L) | 22.8 | Yes | InstanceFormer: An Online Video Instance Segment... | 2022-08-22 | Code |
| 35 | InstanceFormer(ResNet-50) | 20 | Yes | InstanceFormer: An Online Video Instance Segment... | 2022-08-22 | Code |
| 36 | CrossVIS (ResNet-50, calibration) | 18.1 | No | Crossover Learning for Fast Online Video Instanc... | 2021-04-13 | Code |
| 37 | TeViT (ResNet-50) | 17.4 | No | Temporally Efficient Vision Transformer for Vide... | 2022-04-18 | Code |
| 38 | STMask(R101-DCN-FPN) | 17.3 | No | Spatial Feature Calibration and Temporal Fusion ... | 2021-04-06 | Code |
| 39 | Mask2Former-VIS | 16.6 | No | Mask2Former for Video Instance Segmentation | 2021-12-20 | Code |
| 40 | STC (ResNet-50) | 15.5 | No | STC: Spatio-Temporal Contrastive Learning for Vi... | 2022-02-08 | - |
| 41 | CMaskTrack R-CNN (ResNet-50) | 15.4 | No | Occluded Video Instance Segmentation: A Benchmark | 2021-02-02 | Code |
| 42 | D2Conv3D (ResNet-50) | 15.2 | No | - | - | Code |
| 43 | CrossVIS (ResNet-50) | 14.9 | No | Crossover Learning for Fast Online Video Instanc... | 2021-04-13 | Code |
| 44 | CSipMask (ResNet-50) | 14.3 | No | Occluded Video Instance Segmentation: A Benchmark | 2021-02-02 | Code |