Papers With Code 2

A community resource for machine learning research: papers, code, benchmarks, and state-of-the-art results.


Data sourced from the PWC Archive (CC-BY-SA 4.0). Built by the community, for the community.


Video Instance Segmentation on YouTube-VIS validation

Metric: AR10 (higher is better)
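
The page does not define AR10. In the COCO-style evaluation commonly used for YouTube-VIS, it is the average recall when at most 10 predicted instances per video are considered, averaged over IoU thresholds from 0.50 to 0.95, with IoU computed over whole mask sequences rather than single frames. The leaderboard values appear to be percentages. The Python sketch below illustrates that idea under simplifying assumptions: one video, one category, and greedy matching. The function names (`video_iou`, `recall_at_k`, `ar_at_k`) are hypothetical, and this is not the official evaluation code.

```python
import numpy as np


def video_iou(pred, gt):
    """Spatio-temporal IoU of two boolean mask sequences shaped (T, H, W)."""
    inter = np.logical_and(pred, gt).sum()
    union = np.logical_or(pred, gt).sum()
    return float(inter / union) if union > 0 else 0.0


def recall_at_k(preds, scores, gts, k=10, iou_thr=0.5):
    """Fraction of ground-truth instances matched by the k top-scoring predictions."""
    if not gts:
        return 0.0
    top = np.argsort(scores)[::-1][:k]  # indices of the k highest scores
    matched = set()
    for i in top:
        best_j, best_iou = -1, iou_thr
        for j, gt in enumerate(gts):
            if j in matched:
                continue  # each ground-truth instance is matched at most once
            iou = video_iou(preds[i], gt)
            if iou >= best_iou:
                best_j, best_iou = j, iou
        if best_j >= 0:
            matched.add(best_j)
    return len(matched) / len(gts)


def ar_at_k(preds, scores, gts, k=10):
    """Recall at k averaged over IoU thresholds 0.50, 0.55, ..., 0.95."""
    thresholds = np.linspace(0.5, 0.95, 10)
    return float(np.mean([recall_at_k(preds, scores, gts, k, t) for t in thresholds]))
```

A full evaluation would additionally compute this per category across all validation videos and average over categories. The simplified version only shows why the cap of 10 predictions matters: a model cannot raise its recall by emitting an unlimited number of candidate instances.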


Results

| # | Model | AR10 | Extra Data | Paper | Date | Code |
|---|-------|------|------------|-------|------|------|
| 1 | DVIS++(ViT-L, Online) | 73.7 | Yes | DVIS++: Improved Decoupled Framework for Univers... | 2023-12-20 | Code |
| 2 | CAVIS(ViT-L, Online) | 73.6 | Yes | Context-Aware Video Instance Segmentation | 2024-07-03 | Code |
| 3 | DVIS | 70.3 | Yes | DVIS: Decoupled Video Instance Segmentation Fram... | 2023-06-06 | Code |
| 4 | Tube-Link | 69.1 | No | Tube-Link: A Flexible Cross Tube Framework for U... | 2023-03-22 | Code |
| 5 | UniVS(Swin-L) | 66.8 | Yes | UniVS: Unified and Universal Video Segmentation ... | 2024-02-28 | Code |
| 6 | MinVIS (Swin-L) | 66.6 | No | MinVIS: A Minimal Video Instance Segmentation Fr... | 2022-08-03 | Code |
| 7 | MDQE(Swin-L) | 65 | No | MDQE: Mining Discriminative Query Embeddings to ... | 2023-03-25 | Code |
| 8 | SeqFormer (Swin-L) | 64.4 | Yes | SeqFormer: Sequential Transformer for Video Inst... | 2021-12-15 | Code |
| 9 | InstanceFormer(Swin-L) | 61.6 | Yes | InstanceFormer: An Online Video Instance Segment... | 2022-08-22 | Code |
| 10 | DeVIS (Swin-L) | 61 | No | DeVIS: Making Deformable Transformers Work for V... | 2022-07-22 | Code |
| 11 | NOVIS (ResNet-50) | 60.6 | Yes | NOVIS: A Case for End-to-End Near-Online Video I... | 2023-08-29 | - |
| 12 | Video K-Net (Swin-Base) | 59.9 | No | Video K-Net: A Simple, Strong, and Unified Basel... | 2022-04-10 | Code |
| 13 | IDOL (ResNet-50) | 58.7 | No | In Defense of Online Models for Video Instance S... | 2022-07-21 | Code |
| 14 | TCIS (Swin-S) | 57.9 | No | 1st Place Solution for YouTubeVOS Challenge 2021... | 2021-06-12 | - |
| 15 | SeqFormer (ResNet-101) | 56.9 | Yes | SeqFormer: Sequential Transformer for Video Inst... | 2021-12-15 | Code |
| 16 | MSN | 55 | No | MSN: Efficient Online Mask Selection Network for... | 2021-06-19 | Code |
| 17 | SeqFormer (ResNet-50) | 54.8 | Yes | SeqFormer: Sequential Transformer for Video Inst... | 2021-12-15 | Code |
| 18 | SeqFormer (ResNet-50) | 54.6 | No | SeqFormer: Sequential Transformer for Video Inst... | 2021-12-15 | Code |
| 19 | InstanceFormer(ResNet-50) | 53.5 | Yes | InstanceFormer: An Online Video Instance Segment... | 2022-08-22 | Code |
| 20 | DeVIS (ResNet-50) | 51.6 | No | DeVIS: Making Deformable Transformers Work for V... | 2022-07-22 | Code |
| 21 | IFC (ResNet-50) | 51.2 | No | Video Instance Segmentation using Inter-Frame Co... | 2021-06-07 | Code |
| 22 | ObjProp (ResNet-50) | 47.7 | No | Object Propagation via Inter-Frame Attentions fo... | 2021-11-15 | Code |
| 23 | VisTR(ResNet-101) | 44.9 | No | End-to-End Video Instance Segmentation with Tran... | 2020-11-30 | Code |
| 24 | STC (ResNet-50) | 44.5 | No | STC: Spatio-Temporal Contrastive Learning for Vi... | 2022-02-08 | - |
| 25 | VisTR(ResNet-50) | 42.4 | No | End-to-End Video Instance Segmentation with Tran... | 2020-11-30 | Code |
| 26 | CrossVIS (ResNet-101) | 42 | No | Crossover Learning for Fast Online Video Instanc... | 2021-04-13 | Code |
| 27 | STMask(R101-DCN-FPN) | 41.8 | No | Spatial Feature Calibration and Temporal Fusion ... | 2021-04-06 | Code |
| 28 | PCAN(ResNet-50) | 41.6 | No | Prototypical Cross-Attention Networks for Multip... | 2021-06-22 | Code |
| 29 | STEm-Seg (ResNet-101) | 41.6 | No | STEm-Seg: Spatio-temporal Embeddings for Instanc... | 2020-03-18 | Code |
| 30 | STEm-Seg (ResNet-50) | 41.6 | No | STEm-Seg: Spatio-temporal Embeddings for Instanc... | 2020-03-18 | Code |
| 31 | CompFeat(ResNet-50) | 40.3 | No | CompFeat: Comprehensive Feature Aggregation for ... | 2020-12-07 | Code |
| 32 | SipMask (ResNet-50, ms-train, single-scale test) | 40.1 | No | SipMask: Spatial Information Preservation for Fa... | 2020-07-29 | Code |
| 33 | SipMask (ResNet-50, single-scale test) | 38.9 | No | SipMask: Spatial Information Preservation for Fa... | 2020-07-29 | Code |
| 34 | MaskTrack R-CNN (ResNet-50, single-scale training and test) | 35.5 | No | Video Instance Segmentation | 2019-05-12 | Code |