Papers With Code 2

A community resource for machine learning research: papers, code, benchmarks, and state-of-the-art results.

Data sourced from the PWC Archive (CC-BY-SA 4.0). Built by the community, for the community.

Video Instance Segmentation on OVIS validation

Metric: mask AP (higher is better)

Results

| # | Model | mask AP | Extra Data | Paper | Date | Code |
|---|-------|---------|------------|-------|------|------|
| 1 | DVIS-DAQ(VIT-L, Offline) | 57.1 | Yes | DVIS-DAQ: Improving Video Segmentation via Dynam... | 2024-03-29 | Code |
| 2 | CAVIS(VIT-L, Offline) | 57.1 | Yes | Context-Aware Video Instance Segmentation | 2024-07-03 | Code |
| 3 | DVIS++(VIT-L,Offline) | 53.4 | Yes | DVIS++: Improved Decoupled Framework for Univers... | 2023-12-20 | Code |
| 4 | GLEE-Pro | 50.4 | Yes | General Object Foundation Model for Images and V... | 2023-12-14 | Code |
| 5 | DVIS(Swin-L, Offline) | 49.9 | No | DVIS: Decoupled Video Instance Segmentation Fram... | 2023-06-06 | Code |
| 6 | DVIS++(VIT-L, Online) | 49.6 | Yes | DVIS++: Improved Decoupled Framework for Univers... | 2023-12-20 | Code |
| 7 | UNINEXT (ViT-H, Online) | 49 | Yes | Universal Instance Perception as Object Discover... | 2023-03-12 | Code |
| 8 | DVIS(Swin-L, Online) | 47.1 | No | DVIS: Decoupled Video Instance Segmentation Fram... | 2023-06-06 | Code |
| 9 | CTVIS (Swin-L) | 46.9 | Yes | CTVIS: Consistent Training for Online Video Inst... | 2023-07-24 | Code |
| 10 | RefineVIS (Swin-L, offline) | 46 | Yes | RefineVIS: Video Instance Segmentation with Temp... | 2023-06-07 | - |
| 11 | GRAtt-VIS (Swin-L) | 45.7 | Yes | GRAtt-VIS: Gated Residual Attention for Auto Rec... | 2023-05-26 | Code |
| 12 | GenVIS (Swin-L) | 45.4 | Yes | A Generalized Framework for Video Instance Segme... | 2022-11-16 | Code |
| 13 | NOVIS (Swin-L) | 43.5 | Yes | NOVIS: A Case for End-to-End Near-Online Video I... | 2023-08-29 | - |
| 14 | TarViS (Swin-L) | 43.2 | Yes | TarViS: A Unified Approach for Target-based Vide... | 2023-01-06 | Code |
| 15 | MDQE(SwinL) | 42.6 | No | MDQE: Mining Discriminative Query Embeddings to ... | 2023-03-25 | Code |
| 16 | IDOL (Swin-L) | 42.6 | No | In Defense of Online Models for Video Instance S... | 2022-07-21 | Code |
| 17 | ROVIS (Swin-L) | 42.6 | No | Robust Online Video Instance Segmentation with T... | 2022-11-16 | Code |
| 18 | UniVS(Swin-L) | 41.7 | Yes | UniVS: Unified and Universal Video Segmentation ... | 2024-02-28 | Code |
| 19 | DVIS++(R50, Offline) | 41.2 | Yes | DVIS++: Improved Decoupled Framework for Univers... | 2023-12-20 | Code |
| 20 | BoxVIS(Swin-L & Box-sup) | 40.6 | No | BoxVIS: Video Instance Segmentation with Box Ann... | 2023-03-26 | Code |
| 21 | MinVIS (Swin-L) | 39.4 | No | MinVIS: A Minimal Video Instance Segmentation Fr... | 2022-08-03 | Code |
| 22 | DVIS++(R50, Online) | 37.2 | Yes | DVIS++: Improved Decoupled Framework for Univers... | 2023-12-20 | Code |
| 23 | GRAtt-VIS (ResNet-50) | 36.2 | Yes | GRAtt-VIS: Gated Residual Attention for Auto Rec... | 2023-05-26 | Code |
| 24 | CTVIS (ResNet-50) | 35.5 | Yes | CTVIS: Consistent Training for Online Video Inst... | 2023-07-24 | Code |
| 25 | DeVIS (Swin-L) | 35.5 | No | DeVIS: Making Deformable Transformers Work for V... | 2022-07-22 | Code |
| 26 | UNINEXT (ResNet-50, Online) | 34 | Yes | Universal Instance Perception as Object Discover... | 2023-03-12 | Code |
| 27 | TarViS (Swin-T) | 34 | Yes | TarViS: A Unified Approach for Target-based Vide... | 2023-01-06 | Code |
| 28 | NOVIS (ResNet-50) | 32.7 | Yes | NOVIS: A Case for End-to-End Near-Online Video I... | 2023-08-29 | - |
| 29 | TarViS (ResNet-50) | 31.1 | Yes | TarViS: A Unified Approach for Target-based Vide... | 2023-01-06 | Code |
| 30 | IDOL (ResNet-50) | 30.2 | No | In Defense of Online Models for Video Instance S... | 2022-07-21 | Code |
| 31 | Tube-Link(ResNet-50) | 29.5 | No | Tube-Link: A Flexible Cross Tube Framework for U... | 2023-03-22 | Code |
| 32 | VITA (Swin-L) | 27.7 | Yes | VITA: Video Instance Segmentation via Object Tok... | 2022-06-09 | Code |
| 33 | DeVIS (ResNet-50) | 23.7 | No | DeVIS: Making Deformable Transformers Work for V... | 2022-07-22 | Code |
| 34 | InstanceFormer (Swin-L) | 22.8 | Yes | InstanceFormer: An Online Video Instance Segment... | 2022-08-22 | Code |
| 35 | InstanceFormer(ResNet-50) | 20 | Yes | InstanceFormer: An Online Video Instance Segment... | 2022-08-22 | Code |
| 36 | CrossVIS (ResNet-50, calibration) | 18.1 | No | Crossover Learning for Fast Online Video Instanc... | 2021-04-13 | Code |
| 37 | TeViT (ResNet-50) | 17.4 | No | Temporally Efficient Vision Transformer for Vide... | 2022-04-18 | Code |
| 38 | STMask(R101-DCN-FPN) | 17.3 | No | Spatial Feature Calibration and Temporal Fusion ... | 2021-04-06 | Code |
| 39 | Mask2Former-VIS | 16.6 | No | Mask2Former for Video Instance Segmentation | 2021-12-20 | Code |
| 40 | STC (ResNet-50) | 15.5 | No | STC: Spatio-Temporal Contrastive Learning for Vi... | 2022-02-08 | - |
| 41 | CMaskTrack R-CNN (ResNet-50) | 15.4 | No | Occluded Video Instance Segmentation: A Benchmark | 2021-02-02 | Code |
| 42 | D2Conv3D (ResNet-50) | 15.2 | No | - | - | Code |
| 43 | CrossVIS (ResNet-50) | 14.9 | No | Crossover Learning for Fast Online Video Instanc... | 2021-04-13 | Code |
| 44 | CSipMask (ResNet-50) | 14.3 | No | Occluded Video Instance Segmentation: A Benchmark | 2021-02-02 | Code |
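
Because the leaderboard mixes entries trained with and without extra data, it can help to compare the two groups separately. The sketch below is illustrative only: the `Entry` type, the `ENTRIES` list, and the `best_without_extra_data` helper are hypothetical names, and the data is a hand-copied subset (the top eight rows) of the results above.

```python
from typing import NamedTuple, Optional, Sequence


class Entry(NamedTuple):
    """One leaderboard row (hypothetical container, not part of the site)."""
    rank: int
    model: str
    mask_ap: float
    extra_data: bool


# Hand-copied subset of the results above (top eight rows only).
ENTRIES = [
    Entry(1, "DVIS-DAQ(VIT-L, Offline)", 57.1, True),
    Entry(2, "CAVIS(VIT-L, Offline)", 57.1, True),
    Entry(3, "DVIS++(VIT-L,Offline)", 53.4, True),
    Entry(4, "GLEE-Pro", 50.4, True),
    Entry(5, "DVIS(Swin-L, Offline)", 49.9, False),
    Entry(6, "DVIS++(VIT-L, Online)", 49.6, True),
    Entry(7, "UNINEXT (ViT-H, Online)", 49.0, True),
    Entry(8, "DVIS(Swin-L, Online)", 47.1, False),
]


def best_without_extra_data(entries: Sequence[Entry]) -> Optional[Entry]:
    """Return the highest mask AP entry marked 'No' under Extra Data, if any."""
    candidates = [e for e in entries if not e.extra_data]
    if not candidates:
        return None
    # mask AP is "higher is better", so take the maximum.
    return max(candidates, key=lambda e: e.mask_ap)


if __name__ == "__main__":
    best = best_without_extra_data(ENTRIES)
    if best is not None:
        print(f"{best.model}: {best.mask_ap}")
```

Extending `ENTRIES` with the remaining rows would allow the same comparison across backbones such as Swin-L and ResNet-50.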