Papers With Code 2

A community resource for machine learning research: papers, code, benchmarks, and state-of-the-art results.

Data sourced from the PWC Archive (CC-BY-SA 4.0). Built by the community, for the community.

Video Instance Segmentation on YouTube-VIS validation

Metric: mask AP (higher is better)
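
For readers unfamiliar with the metric: in video instance segmentation, mask AP is generally computed like COCO mask AP, except that the IoU between a predicted and a ground-truth instance is accumulated over all frames of the video rather than over a single image. The sketch below illustrates only that spatio-temporal IoU step. The function name `video_mask_iou` and the set-of-pixels mask representation are assumptions made for illustration and are not taken from the benchmark's evaluation code.

```python
def video_mask_iou(pred_frames, gt_frames):
    """Spatio-temporal IoU between two instance mask sequences.

    Each argument is a list with one entry per frame; every entry is a
    set of (row, col) foreground pixels (hypothetical representation).
    An instance missing in a frame is represented by an empty set.
    """
    if len(pred_frames) != len(gt_frames):
        raise ValueError("both sequences must cover the same frames")

    # Sum intersections and unions over all frames before dividing,
    # so a frame-level miss lowers the score of the whole track.
    intersection = sum(len(p & g) for p, g in zip(pred_frames, gt_frames))
    union = sum(len(p | g) for p, g in zip(pred_frames, gt_frames))
    return intersection / union if union else 0.0


# Toy two-frame example: each mask is a 2x2 block, shifted by one column.
pred = [{(r, c) for r in range(2) for c in range(0, 2)} for _ in range(2)]
gt = [{(r, c) for r in range(2) for c in range(1, 3)} for _ in range(2)]
iou = video_mask_iou(pred, gt)  # 4 shared pixels out of 12 in the union
```

A full AP computation would additionally match predictions to ground truth at several IoU thresholds and average precision over them; that part is omitted here.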

Results

| # | Model | mask AP | Extra Data | Paper | Date | Code |
|---|---|---|---|---|---|---|
| 1 | CAVIS(ViT-L, Online) | 68.9 | Yes | Context-Aware Video Instance Segmentation | 2024-07-03 | Code |
| 2 | DVIS++(ViT-L, Online) | 67.7 | Yes | DVIS++: Improved Decoupled Framework for Univers... | 2023-12-20 | Code |
| 3 | DVIS | 64.9 | Yes | DVIS: Decoupled Video Instance Segmentation Fram... | 2023-06-06 | Code |
| 4 | Tube-Link | 64.6 | No | Tube-Link: A Flexible Cross Tube Framework for U... | 2023-03-22 | Code |
| 5 | MinVIS (Swin-L) | 61.6 | No | MinVIS: A Minimal Video Instance Segmentation Fr... | 2022-08-03 | Code |
| 6 | Mask2Former (Swin-L) | 60.4 | No | Mask2Former for Video Instance Segmentation | 2021-12-20 | Code |
| 7 | UniVS(Swin-L) | 60 | Yes | UniVS: Unified and Universal Video Segmentation ... | 2024-02-28 | Code |
| 8 | MDQE(Swin-L) | 59.9 | No | MDQE: Mining Discriminative Query Embeddings to ... | 2023-03-25 | Code |
| 9 | SeqFormer (Swin-L) | 59.3 | Yes | SeqFormer: Sequential Transformer for Video Inst... | 2021-12-15 | Code |
| 10 | DeVIS (Swin-L) | 57.1 | No | DeVIS: Making Deformable Transformers Work for V... | 2022-07-22 | Code |
| 11 | InstanceFormer(Swin-L) | 56.3 | Yes | InstanceFormer: An Online Video Instance Segment... | 2022-08-22 | Code |
| 12 | TCIS (Swin-S) | 54.3 | No | 1st Place Solution for YouTubeVOS Challenge 2021... | 2021-06-12 | - |
| 13 | Video K-Net (Swin-Base) | 54.1 | No | Video K-Net: A Simple, Strong, and Unified Basel... | 2022-04-10 | Code |
| 14 | NOVIS (ResNet-50) | 52.8 | Yes | NOVIS: A Case for End-to-End Near-Online Video I... | 2023-08-29 | - |
| 15 | IDOL (ResNet-50) | 49.5 | No | In Defense of Online Models for Video Instance S... | 2022-07-21 | Code |
| 16 | Mask2Former (ResNet-101) | 49.2 | No | Mask2Former for Video Instance Segmentation | 2021-12-20 | Code |
| 17 | SeqFormer (ResNet-101) | 49 | Yes | SeqFormer: Sequential Transformer for Video Inst... | 2021-12-15 | Code |
| 18 | MSN | 48.8 | No | MSN: Efficient Online Mask Selection Network for... | 2021-06-19 | Code |
| 19 | SeqFormer (ResNet-50) | 47.4 | Yes | SeqFormer: Sequential Transformer for Video Inst... | 2021-12-15 | Code |
| 20 | Mask2Former (ResNet-50) | 46.4 | No | Mask2Former for Video Instance Segmentation | 2021-12-20 | Code |
| 21 | InstanceFormer(ResNet-50) | 45.6 | Yes | InstanceFormer: An Online Video Instance Segment... | 2022-08-22 | Code |
| 22 | SeqFormer (ResNet-50) | 45.1 | No | SeqFormer: Sequential Transformer for Video Inst... | 2021-12-15 | Code |
| 23 | DeVIS (ResNet-50) | 44.4 | No | DeVIS: Making Deformable Transformers Work for V... | 2022-07-22 | Code |
| 24 | IFC (ResNet-50) | 42.8 | No | Video Instance Segmentation using Inter-Frame Co... | 2021-06-07 | Code |
| 25 | VisTR(ResNet-101) | 40.1 | No | End-to-End Video Instance Segmentation with Tran... | 2020-11-30 | Code |
| 26 | VSTAM | 39 | No | - | - | Code |
| 27 | STMask(R101-DCN-FPN) | 36.8 | No | Spatial Feature Calibration and Temporal Fusion ... | 2021-04-06 | Code |
| 28 | STC (ResNet-50) | 36.7 | No | STC: Spatio-Temporal Contrastive Learning for Vi... | 2022-02-08 | - |
| 29 | CrossVIS (ResNet-101) | 36.6 | No | Crossover Learning for Fast Online Video Instanc... | 2021-04-13 | Code |
| 30 | VisTR(ResNet-50) | 36.2 | No | End-to-End Video Instance Segmentation with Tran... | 2020-11-30 | Code |
| 31 | PCAN(ResNet-50) | 36.1 | No | Prototypical Cross-Attention Networks for Multip... | 2021-06-22 | Code |
| 32 | ObjProp (ResNet-50) | 36 | No | Object Propagation via Inter-Frame Attentions fo... | 2021-11-15 | Code |
| 33 | CompFeat(ResNet-50) | 35.3 | No | CompFeat: Comprehensive Feature Aggregation for ... | 2020-12-07 | Code |
| 34 | CSipMask | 35.1 | No | Occluded Video Instance Segmentation: A Benchmark | 2021-02-02 | Code |
| 35 | STEm-Seg (ResNet-101) | 34.6 | No | STEm-Seg: Spatio-temporal Embeddings for Instanc... | 2020-03-18 | Code |
| 36 | SipMask (ResNet-50, ms-train, single-scale test) | 33.7 | No | SipMask: Spatial Information Preservation for Fa... | 2020-07-29 | Code |
| 37 | TraDeS | 32.6 | No | Track to Detect and Segment: An Online Multi-Obj... | 2021-03-16 | Code |
| 38 | SipMask (ResNet-50, single-scale test) | 32.5 | No | SipMask: Spatial Information Preservation for Fa... | 2020-07-29 | Code |
| 39 | CMaskTrack R-CNN | 32.1 | No | Occluded Video Instance Segmentation: A Benchmark | 2021-02-02 | Code |
| 40 | STEm-Seg (ResNet-50) | 30.6 | No | STEm-Seg: Spatio-temporal Embeddings for Instanc... | 2020-03-18 | Code |
| 41 | MaskTrack R-CNN (ResNet-50, single-scale training and test) | 30.3 | No | Video Instance Segmentation | 2019-05-12 | Code |
| 42 | UniTrack | 30.1 | No | Do Different Tracking Tasks Require Different Ap... | 2021-07-05 | Code |
| 43 | OSMN | 29.1 | No | Efficient Video Object Segmentation via Network ... | 2018-02-04 | Code |
| 44 | DeepSORT | 27.8 | No | Simple Online and Realtime Tracking with a Deep ... | 2017-03-21 | Code |