Papers With Code 2

A community resource for machine learning research: papers, code, benchmarks, and state-of-the-art results.

Data sourced from the PWC Archive (CC-BY-SA 4.0). Built by the community, for the community.

Video Instance Segmentation on YouTube-VIS validation

Metric: AR1 (average recall with at most one predicted instance per video; higher is better)

Results

| # | Model | AR1 | Extra Data | Paper | Date | Code |
|---|-------|-----|------------|-------|------|------|
| 1 | CAVIS(ViT-L, Online) | 58.3 | Yes | Context-Aware Video Instance Segmentation | 2024-07-03 | Code |
| 2 | DVIS++(ViT-L, Online) | 57.9 | Yes | DVIS++: Improved Decoupled Framework for Univers... | 2023-12-20 | Code |
| 3 | DVIS | 56.5 | Yes | DVIS: Decoupled Video Instance Segmentation Fram... | 2023-06-06 | Code |
| 4 | Tube-Link | 55.9 | No | Tube-Link: A Flexible Cross Tube Framework for U... | 2023-03-22 | Code |
| 5 | MinVIS (Swin-L) | 54.8 | No | MinVIS: A Minimal Video Instance Segmentation Fr... | 2022-08-03 | Code |
| 6 | UniVS(Swin-L) | 54.7 | Yes | UniVS: Unified and Universal Video Segmentation ... | 2024-02-28 | Code |
| 7 | MDQE(Swin-L) | 53.5 | No | MDQE: Mining Discriminative Query Embeddings to ... | 2023-03-25 | Code |
| 8 | SeqFormer (Swin-L) | 51.7 | Yes | SeqFormer: Sequential Transformer for Video Inst... | 2021-12-15 | Code |
| 9 | InstanceFormer(Swin-L) | 50.9 | Yes | InstanceFormer: An Online Video Instance Segment... | 2022-08-22 | Code |
| 10 | DeVIS (Swin-L) | 50.8 | No | DeVIS: Making Deformable Transformers Work for V... | 2022-07-22 | Code |
| 11 | NOVIS (ResNet-50) | 50.3 | Yes | NOVIS: A Case for End-to-End Near-Online Video I... | 2023-08-29 | - |
| 12 | Video K-Net (Swin-Base) | 49.7 | No | Video K-Net: A Simple, Strong, and Unified Basel... | 2022-04-10 | Code |
| 13 | IDOL (ResNet-50) | 47.7 | No | In Defense of Online Models for Video Instance S... | 2022-07-21 | Code |
| 14 | TCIS (Swin-S) | 47 | No | 1st Place Solution for YouTubeVOS Challenge 2021... | 2021-06-12 | - |
| 15 | SeqFormer (ResNet-101) | 46.8 | Yes | SeqFormer: Sequential Transformer for Video Inst... | 2021-12-15 | Code |
| 16 | SeqFormer (ResNet-50) | 45.6 | No | SeqFormer: Sequential Transformer for Video Inst... | 2021-12-15 | Code |
| 17 | SeqFormer (ResNet-50) | 45.5 | Yes | SeqFormer: Sequential Transformer for Video Inst... | 2021-12-15 | Code |
| 18 | IFC (ResNet-50) | 43.8 | No | Video Instance Segmentation using Inter-Frame Co... | 2021-06-07 | Code |
| 19 | DeVIS (ResNet-50) | 42.4 | No | DeVIS: Making Deformable Transformers Work for V... | 2022-07-22 | Code |
| 20 | InstanceFormer(ResNet-50) | 42.1 | Yes | InstanceFormer: An Online Video Instance Segment... | 2022-08-22 | Code |
| 21 | MSN | 40.1 | No | MSN: Efficient Online Mask Selection Network for... | 2021-06-19 | Code |
| 22 | ObjProp (ResNet-50) | 39.1 | No | Object Propagation via Inter-Frame Attentions fo... | 2021-11-15 | Code |
| 23 | VisTR(ResNet-101) | 38.3 | No | End-to-End Video Instance Segmentation with Tran... | 2020-11-30 | Code |
| 24 | VisTR(ResNet-50) | 37.2 | No | End-to-End Video Instance Segmentation with Tran... | 2020-11-30 | Code |
| 25 | STC (ResNet-50) | 36.9 | No | STC: Spatio-Temporal Contrastive Learning for Vi... | 2022-02-08 | - |
| 26 | PCAN(ResNet-50) | 36.3 | No | Prototypical Cross-Attention Networks for Multip... | 2021-06-22 | Code |
| 27 | CrossVIS (ResNet-101) | 36 | No | Crossover Learning for Fast Online Video Instanc... | 2021-04-13 | Code |
| 28 | SipMask (ResNet-50, ms-train, single-scale test) | 35.4 | No | SipMask: Spatial Information Preservation for Fa... | 2020-07-29 | Code |
| 29 | STMask(R101-DCN-FPN) | 34.8 | No | Spatial Feature Calibration and Temporal Fusion ... | 2021-04-06 | Code |
| 30 | STEm-Seg (ResNet-101) | 34.4 | No | STEm-Seg: Spatio-temporal Embeddings for Instanc... | 2020-03-18 | Code |
| 31 | STEm-Seg (ResNet-50) | 34.4 | No | STEm-Seg: Spatio-temporal Embeddings for Instanc... | 2020-03-18 | Code |
| 32 | SipMask (ResNet-50, single-scale test) | 33.5 | No | SipMask: Spatial Information Preservation for Fa... | 2020-07-29 | Code |
| 33 | CompFeat(ResNet-50) | 33.1 | No | CompFeat: Comprehensive Feature Aggregation for ... | 2020-12-07 | Code |
| 34 | MaskTrack R-CNN (ResNet-50, single-scale training and test) | 31 | No | Video Instance Segmentation | 2019-05-12 | Code |
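
Because the leaderboard mixes entries trained with and without extra data, it can help to compare the two groups separately. The following is a minimal sketch of that comparison, not part of the site: the `Entry` type and the `best` helper are hypothetical names introduced for illustration, and `ROWS` holds only the top six rows copied from the results above.

```python
from typing import NamedTuple, Optional, Sequence


class Entry(NamedTuple):
    """One leaderboard row (hypothetical container, for illustration only)."""
    model: str
    ar1: float
    extra_data: bool


# Top six rows copied from the results above; not the full leaderboard.
ROWS = [
    Entry("CAVIS(ViT-L, Online)", 58.3, True),
    Entry("DVIS++(ViT-L, Online)", 57.9, True),
    Entry("DVIS", 56.5, True),
    Entry("Tube-Link", 55.9, False),
    Entry("MinVIS (Swin-L)", 54.8, False),
    Entry("MDQE(Swin-L)", 53.5, False),
]


def best(rows: Sequence[Entry], extra_data: bool) -> Optional[Entry]:
    """Return the highest-AR1 entry with the given Extra Data flag, or None."""
    matching = [r for r in rows if r.extra_data == extra_data]
    # AR1 is "higher is better", so the best entry is the maximum.
    return max(matching, key=lambda r: r.ar1, default=None)


print(best(ROWS, extra_data=True).model)   # CAVIS(ViT-L, Online)
print(best(ROWS, extra_data=False).model)  # Tube-Link
```

On these rows, the best entry without extra data (Tube-Link, 55.9) trails the best entry with extra data (CAVIS, 58.3) by 2.4 AR1 points.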