Papers With Code 2

A community resource for machine learning research: papers, code, benchmarks, and state-of-the-art results.

Data sourced from the PWC Archive (CC-BY-SA 4.0). Built by the community, for the community.

Video Instance Segmentation on YouTube-VIS 2021

Metric: mask AP (higher is better)

Results

| Rank | Model | mask AP | Extra Data | Paper | Date | Code |
|---|---|---|---|---|---|---|
| 1 | CAVIS(VIT-L, Offline) | 65.3 | Yes | Context-Aware Video Instance Segmentation | 2024-07-03 | Code |
| 2 | DVIS-DAQ(VIT-L, Offline) | 64.5 | Yes | DVIS-DAQ: Improving Video Segmentation via Dynam... | 2024-03-29 | Code |
| 3 | DVIS++(VIT-L, Offline) | 63.9 | Yes | DVIS++: Improved Decoupled Framework for Univers... | 2023-12-20 | Code |
| 4 | DVIS++(VIT-L, Online) | 62.3 | Yes | DVIS++: Improved Decoupled Framework for Univers... | 2023-12-20 | Code |
| 5 | RefineVIS (Swin-L, online) | 61.4 | Yes | RefineVIS: Video Instance Segmentation with Temp... | 2023-06-07 | - |
| 6 | GRAtt-VIS (Swin-L) | 60.3 | Yes | GRAtt-VIS: Gated Residual Attention for Auto Rec... | 2023-05-26 | Code |
| 7 | TarViS (Swin-L) | 60.2 | Yes | TarViS: A Unified Approach for Target-based Vide... | 2023-01-06 | Code |
| 8 | DVIS(Swin-L) | 60.1 | Yes | DVIS: Decoupled Video Instance Segmentation Fram... | 2023-06-06 | Code |
| 9 | GenVIS (Swin-L) | 60.1 | Yes | A Generalized Framework for Video Instance Segme... | 2022-11-16 | Code |
| 10 | NOVIS (Swin-L) | 59.8 | Yes | NOVIS: A Case for End-to-End Near-Online Video I... | 2023-08-29 | - |
| 11 | Tube-Link(Swin-L) | 58.4 | No | Tube-Link: A Flexible Cross Tube Framework for U... | 2023-03-22 | Code |
| 12 | UniVS(Swin-L) | 57.9 | Yes | UniVS: Unified and Universal Video Segmentation ... | 2024-02-28 | Code |
| 13 | VITA (Swin-L) | 57.5 | Yes | VITA: Video Instance Segmentation via Object Tok... | 2022-06-09 | Code |
| 14 | IDOL (Swin-L) | 56.1 | No | In Defense of Online Models for Video Instance S... | 2022-07-21 | Code |
| 15 | MDQE(Swin-L) | 55.5 | No | MDQE: Mining Discriminative Query Embeddings to ... | 2023-03-25 | Code |
| 16 | MinVIS (Swin-L) | 55.3 | No | MinVIS: A Minimal Video Instance Segmentation Fr... | 2022-08-03 | Code |
| 17 | DeVIS (Swin-L) | 54.4 | No | DeVIS: Making Deformable Transformers Work for V... | 2022-07-22 | Code |
| 18 | BoxVIS(Swin-L & Box-sup) | 53.9 | No | BoxVIS: Video Instance Segmentation with Box Ann... | 2023-03-26 | Code |
| 19 | InstanceFormer (Swin-L) | 51 | Yes | InstanceFormer: An Online Video Instance Segment... | 2022-08-22 | Code |
| 20 | TarViS (Swin-T) | 50.9 | Yes | TarViS: A Unified Approach for Target-based Vide... | 2023-01-06 | Code |
| 21 | GRAtt-VIS (ResNet-50) | 48.9 | Yes | GRAtt-VIS: Gated Residual Attention for Auto Rec... | 2023-05-26 | Code |
| 22 | TarViS (ResNet-50) | 48.3 | Yes | TarViS: A Unified Approach for Target-based Vide... | 2023-01-06 | Code |
| 23 | NOVIS (ResNet-50) | 47.2 | Yes | NOVIS: A Case for End-to-End Near-Online Video I... | 2023-08-29 | - |
| 24 | DeVIS (ResNet-50) | 43.1 | No | DeVIS: Making Deformable Transformers Work for V... | 2022-07-22 | Code |
| 25 | InstanceFormer (ResNet-50) | 40.8 | Yes | InstanceFormer: An Online Video Instance Segment... | 2022-08-22 | Code |
| 26 | STMask(R101-DCN-FPN) | 34.6 | No | Spatial Feature Calibration and Temporal Fusion ... | 2021-04-06 | Code |