Video Instance Segmentation on YouTube-VIS validation
Metric: mask AP (higher is better)
Results
| # | Model | mask AP | Extra Data | Paper | Date | Code |
|---|---|---|---|---|---|---|
| 1 | CAVIS(ViT-L, Online) | 68.9 | Yes | Context-Aware Video Instance Segmentation | 2024-07-03 | Code |
| 2 | DVIS++(ViT-L, Online) | 67.7 | Yes | DVIS++: Improved Decoupled Framework for Univers... | 2023-12-20 | Code |
| 3 | DVIS | 64.9 | Yes | DVIS: Decoupled Video Instance Segmentation Fram... | 2023-06-06 | Code |
| 4 | Tube-Link | 64.6 | No | Tube-Link: A Flexible Cross Tube Framework for U... | 2023-03-22 | Code |
| 5 | MinVIS (Swin-L) | 61.6 | No | MinVIS: A Minimal Video Instance Segmentation Fr... | 2022-08-03 | Code |
| 6 | Mask2Former (Swin-L) | 60.4 | No | Mask2Former for Video Instance Segmentation | 2021-12-20 | Code |
| 7 | UniVS(Swin-L) | 60 | Yes | UniVS: Unified and Universal Video Segmentation ... | 2024-02-28 | Code |
| 8 | MDQE(Swin-L) | 59.9 | No | MDQE: Mining Discriminative Query Embeddings to ... | 2023-03-25 | Code |
| 9 | SeqFormer (Swin-L) | 59.3 | Yes | SeqFormer: Sequential Transformer for Video Inst... | 2021-12-15 | Code |
| 10 | DeVIS (Swin-L) | 57.1 | No | DeVIS: Making Deformable Transformers Work for V... | 2022-07-22 | Code |
| 11 | InstanceFormer(Swin-L) | 56.3 | Yes | InstanceFormer: An Online Video Instance Segment... | 2022-08-22 | Code |
| 12 | TCIS (Swin-S) | 54.3 | No | 1st Place Solution for YouTubeVOS Challenge 2021... | 2021-06-12 | - |
| 13 | Video K-Net (Swin-Base) | 54.1 | No | Video K-Net: A Simple, Strong, and Unified Basel... | 2022-04-10 | Code |
| 14 | NOVIS (ResNet-50) | 52.8 | Yes | NOVIS: A Case for End-to-End Near-Online Video I... | 2023-08-29 | - |
| 15 | IDOL (ResNet-50) | 49.5 | No | In Defense of Online Models for Video Instance S... | 2022-07-21 | Code |
| 16 | Mask2Former (ResNet-101) | 49.2 | No | Mask2Former for Video Instance Segmentation | 2021-12-20 | Code |
| 17 | SeqFormer (ResNet-101) | 49 | Yes | SeqFormer: Sequential Transformer for Video Inst... | 2021-12-15 | Code |
| 18 | MSN | 48.8 | No | MSN: Efficient Online Mask Selection Network for... | 2021-06-19 | Code |
| 19 | SeqFormer (ResNet-50) | 47.4 | Yes | SeqFormer: Sequential Transformer for Video Inst... | 2021-12-15 | Code |
| 20 | Mask2Former (ResNet-50) | 46.4 | No | Mask2Former for Video Instance Segmentation | 2021-12-20 | Code |
| 21 | InstanceFormer(ResNet-50) | 45.6 | Yes | InstanceFormer: An Online Video Instance Segment... | 2022-08-22 | Code |
| 22 | SeqFormer (ResNet-50) | 45.1 | No | SeqFormer: Sequential Transformer for Video Inst... | 2021-12-15 | Code |
| 23 | DeVIS (ResNet-50) | 44.4 | No | DeVIS: Making Deformable Transformers Work for V... | 2022-07-22 | Code |
| 24 | IFC (ResNet-50) | 42.8 | No | Video Instance Segmentation using Inter-Frame Co... | 2021-06-07 | Code |
| 25 | VisTR(ResNet-101) | 40.1 | No | End-to-End Video Instance Segmentation with Tran... | 2020-11-30 | Code |
| 26 | VSTAM | 39 | No | - | - | Code |
| 27 | STMask(R101-DCN-FPN) | 36.8 | No | Spatial Feature Calibration and Temporal Fusion ... | 2021-04-06 | Code |
| 28 | STC (ResNet-50) | 36.7 | No | STC: Spatio-Temporal Contrastive Learning for Vi... | 2022-02-08 | - |
| 29 | CrossVIS (ResNet-101) | 36.6 | No | Crossover Learning for Fast Online Video Instanc... | 2021-04-13 | Code |
| 30 | VisTR(ResNet-50) | 36.2 | No | End-to-End Video Instance Segmentation with Tran... | 2020-11-30 | Code |
| 31 | PCAN(ResNet-50) | 36.1 | No | Prototypical Cross-Attention Networks for Multip... | 2021-06-22 | Code |
| 32 | ObjProp (ResNet-50) | 36 | No | Object Propagation via Inter-Frame Attentions fo... | 2021-11-15 | Code |
| 33 | CompFeat(ResNet-50) | 35.3 | No | CompFeat: Comprehensive Feature Aggregation for ... | 2020-12-07 | Code |
| 34 | CSipMask | 35.1 | No | Occluded Video Instance Segmentation: A Benchmark | 2021-02-02 | Code |
| 35 | STEm-Seg (ResNet-101) | 34.6 | No | STEm-Seg: Spatio-temporal Embeddings for Instanc... | 2020-03-18 | Code |
| 36 | SipMask (ResNet-50, ms-train, single-scale test) | 33.7 | No | SipMask: Spatial Information Preservation for Fa... | 2020-07-29 | Code |
| 37 | TraDeS | 32.6 | No | Track to Detect and Segment: An Online Multi-Obj... | 2021-03-16 | Code |
| 38 | SipMask (ResNet-50, single-scale test) | 32.5 | No | SipMask: Spatial Information Preservation for Fa... | 2020-07-29 | Code |
| 39 | CMaskTrack R-CNN | 32.1 | No | Occluded Video Instance Segmentation: A Benchmark | 2021-02-02 | Code |
| 40 | STEm-Seg (ResNet-50) | 30.6 | No | STEm-Seg: Spatio-temporal Embeddings for Instanc... | 2020-03-18 | Code |
| 41 | MaskTrack R-CNN (ResNet-50, single-scale training and test) | 30.3 | No | Video Instance Segmentation | 2019-05-12 | Code |
| 42 | UniTrack | 30.1 | No | Do Different Tracking Tasks Require Different Ap... | 2021-07-05 | Code |
| 43 | OSMN | 29.1 | No | Efficient Video Object Segmentation via Network ... | 2018-02-04 | Code |
| 44 | DeepSORT | 27.8 | No | Simple Online and Realtime Tracking with a Deep ... | 2017-03-21 | Code |