Video Instance Segmentation on OVIS validation
Metric: AP75 (higher is better)
Results
| # | Model | AP75 | Extra Data | Paper | Date | Code |
|---|---|---|---|---|---|---|
| 1 | CAVIS(VIT-L, Offline) | 63.5 | Yes | Context-Aware Video Instance Segmentation | 2024-07-03 | Code |
| 2 | DVIS-DAQ(VIT-L, Offline) | 62.9 | Yes | DVIS-DAQ: Improving Video Segmentation via Dynam... | 2024-03-29 | Code |
| 3 | DVIS++(VIT-L,Offline) | 58.5 | Yes | DVIS++: Improved Decoupled Framework for Univers... | 2023-12-20 | Code |
| 4 | GLEE-Pro | 55.5 | Yes | General Object Foundation Model for Images and V... | 2023-12-14 | Code |
| 5 | DVIS++(VIT-L, Online) | 55 | Yes | DVIS++: Improved Decoupled Framework for Univers... | 2023-12-20 | Code |
| 6 | DVIS(Swin-L, Offline) | 53 | No | DVIS: Decoupled Video Instance Segmentation Fram... | 2023-06-06 | Code |
| 7 | UNINEXT (ViT-H, Online) | 52.2 | Yes | Universal Instance Perception as Object Discover... | 2023-03-12 | Code |
| 8 | DVIS(Swin-L, Online) | 49.2 | No | DVIS: Decoupled Video Instance Segmentation Fram... | 2023-06-06 | Code |
| 9 | RefineVIS (Swin-L, offline) | 48.4 | Yes | RefineVIS: Video Instance Segmentation with Temp... | 2023-06-07 | - |
| 10 | GRAtt-VIS (Swin-L) | 47.8 | Yes | GRAtt-VIS: Gated Residual Attention for Auto Rec... | 2023-05-26 | Code |
| 11 | GenVIS (Swin-L) | 47.8 | Yes | A Generalized Framework for Video Instance Segme... | 2022-11-16 | Code |
| 12 | CTVIS (Swin-L) | 47.5 | Yes | CTVIS: Consistent Training for Online Video Inst... | 2023-07-24 | Code |
| 13 | IDOL (Swin-L) | 45.2 | No | In Defense of Online Models for Video Instance S... | 2022-07-21 | Code |
| 14 | TarViS (Swin-L) | 44.6 | Yes | TarViS: A Unified Approach for Target-based Vide... | 2023-01-06 | Code |
| 15 | MDQE(SwinL) | 44.3 | No | MDQE: Mining Discriminative Query Embeddings to ... | 2023-03-25 | Code |
| 16 | NOVIS (Swin-L) | 43.8 | Yes | NOVIS: A Case for End-to-End Near-Online Video I... | 2023-08-29 | - |
| 17 | ROVIS (Swin-L) | 42.6 | No | Robust Online Video Instance Segmentation with T... | 2022-11-16 | Code |
| 18 | MinVIS (Swin-L) | 41.3 | No | MinVIS: A Minimal Video Instance Segmentation Fr... | 2022-08-03 | Code |
| 19 | DVIS++(R50, Offline) | 40.9 | Yes | DVIS++: Improved Decoupled Framework for Univers... | 2023-12-20 | Code |
| 20 | BoxVIS(Swin-L & Box-sup) | 39.9 | No | BoxVIS: Video Instance Segmentation with Box Ann... | 2023-03-26 | Code |
| 21 | DeVIS (Swin-L) | 38.3 | No | DeVIS: Making Deformable Transformers Work for V... | 2022-07-22 | Code |
| 22 | DVIS++(R50, Online) | 37.3 | Yes | DVIS++: Improved Decoupled Framework for Univers... | 2023-12-20 | Code |
| 23 | GRAtt-VIS (ResNet-50) | 36.8 | Yes | GRAtt-VIS: Gated Residual Attention for Auto Rec... | 2023-05-26 | Code |
| 24 | UNINEXT (ResNet-50, Online) | 35.6 | Yes | Universal Instance Perception as Object Discover... | 2023-03-12 | Code |
| 25 | CTVIS (ResNet-50) | 34.9 | Yes | CTVIS: Consistent Training for Online Video Inst... | 2023-07-24 | Code |
| 26 | TarViS (Swin-T) | 34.4 | Yes | TarViS: A Unified Approach for Target-based Vide... | 2023-01-06 | Code |
| 27 | NOVIS (ResNet-50) | 32.6 | Yes | NOVIS: A Case for End-to-End Near-Online Video I... | 2023-08-29 | - |
| 28 | TarViS (ResNet-50) | 30.4 | Yes | TarViS: A Unified Approach for Target-based Vide... | 2023-01-06 | Code |
| 29 | Tube-Link(ResNet-50) | 30.2 | No | Tube-Link: A Flexible Cross Tube Framework for U... | 2023-03-22 | Code |
| 30 | IDOL (ResNet-50) | 30 | No | In Defense of Online Models for Video Instance S... | 2022-07-21 | Code |
| 31 | VITA (Swin-L) | 24.9 | Yes | VITA: Video Instance Segmentation via Object Tok... | 2022-06-09 | Code |
| 32 | InstanceFormer (Swin-L) | 21.61 | Yes | InstanceFormer: An Online Video Instance Segment... | 2022-08-22 | Code |
| 33 | DeVIS (ResNet-50) | 20.8 | No | DeVIS: Making Deformable Transformers Work for V... | 2022-07-22 | Code |
| 34 | InstanceFormer(ResNet-50) | 18.1 | Yes | InstanceFormer: An Online Video Instance Segment... | 2022-08-22 | Code |
| 35 | CrossVIS (ResNet-50, calibration) | 16.9 | No | Crossover Learning for Fast Online Video Instanc... | 2021-04-13 | Code |
| 36 | STMask(R101-DCN-FPN) | 15.2 | No | Spatial Feature Calibration and Temporal Fusion ... | 2021-04-06 | Code |
| 37 | TeViT (ResNet-50) | 15 | No | Temporally Efficient Vision Transformer for Vide... | 2022-04-18 | Code |
| 38 | Mask2Former-VIS | 14.1 | No | Mask2Former for Video Instance Segmentation | 2021-12-20 | Code |
| 39 | D2Conv3D (ResNet-50) | 13.7 | No | - | - | Code |
| 40 | STC (ResNet-50) | 13.4 | No | STC: Spatio-Temporal Contrastive Learning for Vi... | 2022-02-08 | - |
| 41 | CMaskTrack R-CNN (ResNet-50) | 13.1 | No | Occluded Video Instance Segmentation: A Benchmark | 2021-02-02 | Code |
| 42 | CSipMask (ResNet-50) | 12.5 | No | Occluded Video Instance Segmentation: A Benchmark | 2021-02-02 | Code |
| 43 | CrossVIS (ResNet-50) | 12.1 | No | Crossover Learning for Fast Online Video Instanc... | 2021-04-13 | Code |