Papers With Code 2

A community resource for machine learning research: papers, code, benchmarks, and state-of-the-art results.

Data sourced from the PWC Archive (CC-BY-SA 4.0). Built by the community, for the community.

Instance Segmentation on COCO test-dev

Metric: mask AP (higher is better)
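As a rough illustration of what the metric builds on, the sketch below computes the intersection-over-union (IoU) between a predicted and a ground-truth binary mask, then counts how many IoU thresholds the match clears. It assumes the leaderboard follows the standard COCO convention of averaging AP over IoU thresholds from 0.50 to 0.95 in steps of 0.05; the page itself does not state this. The names `mask_iou`, `IOU_THRESHOLDS`, and `thresholds_passed` are hypothetical, and this is not a full AP implementation, which also needs confidence-ranked detections and precision-recall integration.

```python
from typing import List

# A binary mask as nested lists: 1 = object pixel, 0 = background.
Mask = List[List[int]]

# Assumed COCO-style IoU thresholds: 0.50, 0.55, ..., 0.95.
IOU_THRESHOLDS = [t / 100 for t in range(50, 100, 5)]


def mask_iou(pred: Mask, gt: Mask) -> float:
    """Intersection-over-union of two equally sized binary masks."""
    intersection = 0
    union = 0
    for pred_row, gt_row in zip(pred, gt):
        for p, g in zip(pred_row, gt_row):
            if p and g:
                intersection += 1
            if p or g:
                union += 1
    # Two empty masks have no overlap to measure; treat the IoU as 0.
    return intersection / union if union else 0.0


def thresholds_passed(iou: float) -> int:
    """Number of thresholds at which this match counts as a true positive."""
    return sum(1 for t in IOU_THRESHOLDS if iou >= t)


if __name__ == "__main__":
    pred = [[1, 1], [1, 0]]
    gt = [[1, 1], [1, 1]]
    iou = mask_iou(pred, gt)
    print(iou, thresholds_passed(iou))
```

A prediction that overlaps the ground truth loosely only counts as correct at the lower thresholds, which is why averaging over the whole range rewards precise mask boundaries.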

Results

| # | Model | mask AP | Extra Data | Paper | Date | Code |
|---|---|---|---|---|---|---|
| 1 | Co-DETR | 57.1 | Yes | DETRs with Collaborative Hybrid Assignments Trai... | 2022-11-22 | Code |
| 2 | CBNetV2 (EVA02, single-scale) | 56.1 | Yes | CBNet: A Composite Backbone Network Architecture... | 2021-07-01 | Code |
| 3 | EVA | 55.5 | Yes | EVA: Exploring the Limits of Masked Visual Repre... | 2022-11-14 | Code |
| 4 | FD-SwinV2-G | 55.4 | Yes | Contrastive Learning Rivals Masked Image Modelin... | 2022-05-27 | Code |
| 5 | Mask Frozen-DETR | 55.3 | Yes | Mask Frozen-DETR: High Quality Instance Segmenta... | 2023-08-07 | - |
| 6 | BEiT-3 | 54.8 | No | Image as a Foreign Language: BEiT Pretraining fo... | 2022-08-22 | Code |
| 7 | Mask DINO (SwinL, multi-scale) | 54.7 | Yes | Mask DINO: Towards A Unified Transformer-based F... | 2022-06-06 | Code |
| 8 | ViT-Adapter-L (HTC++, BEiTv2, O365, multi-scale) | 54.5 | Yes | Vision Transformer Adapter for Dense Predictions | 2022-05-17 | Code |
| 9 | GLEE-Pro | 54.5 | Yes | General Object Foundation Model for Images and V... | 2023-12-14 | Code |
| 10 | SwinV2-G (HTC++) | 54.4 | Yes | Swin Transformer V2: Scaling Up Capacity and Res... | 2021-11-18 | Code |
| 11 | GLEE-Plus | 53.3 | Yes | General Object Foundation Model for Images and V... | 2023-12-14 | Code |
| 12 | Soft Teacher + Swin-L (HTC++, multi-scale) | 53 | Yes | End-to-End Semi-Supervised Object Detection with... | 2021-06-16 | Code |
| 13 | ViT-Adapter-L (HTC++, BEiTv2 pretrain, multi-scale) | 53 | No | Vision Transformer Adapter for Dense Predictions | 2022-05-17 | Code |
| 14 | Mask DINO (SwinL, single-scale) | 52.8 | No | Mask DINO: Towards A Unified Transformer-based F... | 2022-06-06 | Code |
| 15 | ViT-Adapter-L (HTC++, BEiT pretrain, multi-scale) | 52.5 | No | Vision Transformer Adapter for Dense Predictions | 2022-05-17 | Code |
| 16 | CBNetV2 (Dual-Swin-L HTC, multi-scale) | 52.3 | No | CBNet: A Composite Backbone Network Architecture... | 2021-07-01 | Code |
| 17 | UNINEXT-H | 51.8 | Yes | Universal Instance Perception as Object Discover... | 2023-03-12 | Code |
| 18 | CBNetV2 (Dual-Swin-L HTC, single-scale) | 51.6 | No | CBNet: A Composite Backbone Network Architecture... | 2021-07-01 | Code |
| 19 | Focal-L (HTC++, multi-scale) | 51.3 | No | Focal Self-attention for Local-Global Interactio... | 2021-07-01 | Code |
| 20 | Swin-L (HTC++, multi scale) | 51.1 | No | Swin Transformer: Hierarchical Vision Transforme... | 2021-03-25 | Code |
| 21 | Mask2Former (Swin-L, single scale) | 50.5 | No | Masked-attention Mask Transformer for Universal ... | 2021-12-02 | Code |
| 22 | Swin-L (HTC++, single scale) | 50.2 | No | Swin Transformer: Hierarchical Vision Transforme... | 2021-03-25 | Code |
| 23 | ISTR-SMT (Swin-L, single scale) | 49.7 | No | ISTR: End-to-End Instance Segmentation with Tran... | 2021-05-03 | Code |
| 24 | QueryInst (single scale) | 49.1 | No | Instances as Queries | 2021-05-05 | Code |
| 25 | Cascade Eff-B7 NAS-FPN (1280, self-training Copy Paste, single-scale) | 49.1 | Yes | Simple Copy-Paste is a Strong Data Augmentation ... | 2020-12-13 | Code |
| 26 | dBOT ViT-L (CLIP) | 48.8 | No | Exploring Target Representations for Masked Auto... | 2022-09-08 | Code |
| 27 | MogaNet-XL (Cascade Mask R-CNN) | 48.8 | No | MogaNet: Multi-order Gated Aggregation Network | 2022-11-07 | Code |
| 28 | DetectoRS (ResNeXt-101-64x4d, multi-scale) | 48.5 | No | DetectoRS: Detecting Objects with Recursive Feat... | 2020-06-03 | Code |
| 29 | dBOT ViT-L | 48.3 | No | Exploring Target Representations for Masked Auto... | 2022-09-08 | Code |
| 30 | DiffusionInst-SwinL | 48.3 | No | DiffusionInst: Diffusion Model for Instance Segm... | 2022-12-06 | Code |
| 31 | GLEE-Lite | 48.3 | Yes | General Object Foundation Model for Images and V... | 2023-12-14 | Code |
| 32 | DiffusionInst-SwinB | 47.6 | No | DiffusionInst: Diffusion Model for Instance Segm... | 2022-12-06 | Code |
| 33 | DetectoRS (ResNeXt-101-32x4d, multi-scale) | 47.1 | No | DetectoRS: Detecting Objects with Recursive Feat... | 2020-06-03 | Code |
| 34 | Cascade Eff-B7 NAS-FPN (1280) | 46.9 | No | Simple Copy-Paste is a Strong Data Augmentation ... | 2020-12-13 | Code |
| 35 | SOLQ (Swin-L, single scale) | 46.7 | No | SOLQ: Segmenting Objects by Learning Queries | 2021-06-04 | Code |
| 36 | dBOT ViT-B | 46.3 | No | Exploring Target Representations for Masked Auto... | 2022-09-08 | Code |
| 37 | dBOT ViT-B (CLIP) | 46.2 | No | Exploring Target Representations for Masked Auto... | 2022-09-08 | Code |
| 38 | Mask R-CNN (SpineNet-190, 1536x1536) | 46.1 | No | SpineNet: Learning Scale-Permuted Backbone for R... | 2019-12-10 | Code |
| 39 | MogaNet-L (Cascade Mask R-CNN) | 46.1 | No | MogaNet: Multi-order Gated Aggregation Network | 2022-11-07 | Code |
| 40 | MogaNet-B (Cascade Mask R-CNN) | 46 | No | MogaNet: Multi-order Gated Aggregation Network | 2022-11-07 | Code |
| 41 | Swin-B + Cascade Mask R-CNN (tri-layer modelling) | 45.9 | No | A Tri-Layer Plugin to Improve Occluded Detection | 2022-10-18 | Code |
| 42 | GCNet (ResNeXt-101 + DCN + cascade + GC r4) | 45.4 | No | Global Context Networks | 2020-12-24 | Code |
| 43 | MogaNet-S (Cascade Mask R-CNN) | 45.1 | No | MogaNet: Multi-order Gated Aggregation Network | 2022-11-07 | Code |
| 44 | gSwin-S | 45.03 | No | gSwin: Gated MLP Vision Model with Hierarchical ... | 2022-08-24 | - |
| 45 | iBOT (ViT-B/16) | 44.2 | Yes | iBOT: Image BERT Pre-Training with Online Tokeni... | 2021-11-15 | Code |
| 46 | gSwin-T | 44.16 | No | gSwin: Gated MLP Vision Model with Hierarchical ... | 2022-08-24 | - |
| 47 | MogaNet-L (Mask R-CNN 1x) | 44.1 | No | MogaNet: Multi-order Gated Aggregation Network | 2022-11-07 | Code |
| 48 | A2MIM (ViT-B) | 43.5 | No | Architecture-Agnostic Masked Image Modeling -- F... | 2022-05-27 | Code |
| 49 | Cascade Mask R-CNN (ResNeXt152, CBNet) | 43.3 | No | CBNet: A Novel Composite Backbone Network Archit... | 2019-09-09 | Code |
| 50 | MogaNet-B (Mask R-CNN 1x) | 43.2 | No | MogaNet: Multi-order Gated Aggregation Network | 2022-11-07 | Code |
| 51 | gSwin-VT | 42.87 | No | gSwin: Gated MLP Vision Model with Hierarchical ... | 2022-08-24 | - |
| 52 | iBOT (ViT-S/16) | 42.6 | Yes | iBOT: Image BERT Pre-Training with Online Tokeni... | 2021-11-15 | Code |
| 53 | Box2Mask-T | 42.4 | No | Box2Mask: Box-supervised Instance Segmentation v... | 2022-12-03 | Code |
| 54 | Mask Transfiner (ResNet101-FPN) | 42.2 | No | Mask Transfiner for High-Quality Instance Segmen... | 2021-11-26 | Code |
| 55 | MogaNet-S (Mask R-CNN 1x) | 42.2 | No | MogaNet: Multi-order Gated Aggregation Network | 2022-11-07 | Code |
| 56 | PANet | 42 | No | Path Aggregation Network for Instance Segmentation | 2018-03-05 | Code |
| 57 | CenterMask + VoVNet99 | 41.8 | No | CenterMask : Real-Time Anchor-Free Instance Segm... | 2019-11-15 | Code |
| 58 | SOLOv2 (Res-DCN-101-FPN) | 41.7 | No | SOLOv2: Dynamic and Fast Instance Segmentation | 2020-03-23 | Code |
| 59 | BCNet (ResNeXt-101 + FPN + FCOS) | 41.7 | No | Deep Occlusion-Aware Instance Segmentation with ... | 2021-03-23 | Code |
| 60 | DiffusionInst-ResNet101 | 41.5 | No | DiffusionInst: Diffusion Model for Instance Segm... | 2022-12-06 | Code |
| 61 | BlendMask (ResNet-101 + DCN interval=3) | 41.3 | No | BlendMask: Top-Down Meets Bottom-Up for Instance... | 2020-01-02 | Code |
| 62 | HTC + ResNeXt-101-FPN + DCN | 41.2 | Yes | Hybrid Task Cascade for Instance Segmentation | 2019-01-22 | Code |
| 63 | SOLQ (ResNet101, single scale) | 40.9 | No | SOLQ: Segmenting Objects by Learning Queries | 2021-06-04 | Code |
| 64 | CenterMask + VoVNetV2-99 (single-scale) | 40.6 | No | CenterMask : Real-Time Anchor-Free Instance Segm... | 2019-11-15 | Code |
| 65 | SOLO (Res-DCN-101-FPN) | 40.4 | No | SOLO: Segmenting Objects by Locations | 2019-12-10 | Code |
| 66 | D2Det (ResNet-101, single-scale test) | 40.2 | No | - | - | Code |
| 67 | BoxTeacher | 40 | No | BoxTeacher: Exploring High-Quality Pseudo Labels... | 2022-10-11 | Code |
| 68 | BCNet (ResNet-101-FPN + Faster RCNN) | 39.8 | No | Deep Occlusion-Aware Instance Segmentation with ... | 2021-03-23 | Code |
| 69 | SOLQ (ResNet50, single scale) | 39.7 | No | SOLQ: Segmenting Objects by Learning Queries | 2021-06-04 | Code |
| 70 | CenterMask + X101-32x8d (single-scale) | 39.6 | No | CenterMask : Real-Time Anchor-Free Instance Segm... | 2019-11-15 | Code |
| 71 | BCNet (ResNet-101-FPN + FCOS) | 39.6 | No | Deep Occlusion-Aware Instance Segmentation with ... | 2021-03-23 | Code |
| 72 | CPMask | 39.2 | No | Commonality-Parsing Network across Shape and App... | 2020-07-24 | Code |
| 73 | MogaNet-T (Mask R-CNN 1x) | 39.1 | No | MogaNet: Multi-order Gated Aggregation Network | 2022-11-07 | Code |
| 74 | PolarMask++ (ResNeXt-101-DCN) | 38.7 | Yes | PolarMask++: Enhanced Polar Representation for S... | 2021-05-05 | Code |
| 75 | ISDA (ours) | 38.7 | No | ISDA: Position-Aware Instance Segmentation with ... | 2022-02-23 | Code |
| 76 | CenterMask + ResNet-101-FPN | 38.3 | No | CenterMask : Real-Time Anchor-Free Instance Segm... | 2019-11-15 | Code |
| 77 | SipMask (ResNet-101, single-scale test) | 38.1 | No | SipMask: Spatial Information Preservation for Fa... | 2020-07-29 | Code |
| 78 | DiscoBox | 37.9 | No | DiscoBox: Weakly Supervised Instance Segmentatio... | 2021-05-13 | Code |
| 79 | EmbedMask (R-101-FPN) | 37.7 | No | EmbedMask: Embedding Coupling for One-stage Inst... | 2019-12-04 | Code |
| 80 | MogaNet-XT | 37.6 | No | MogaNet: Multi-order Gated Aggregation Network | 2022-11-07 | Code |
| 81 | Mask R-CNN (ResNeXt-101-FPN) | 37.1 | No | Mask R-CNN | 2017-03-20 | Code |
| 82 | DiffusionInst-ResNet50 | 37.1 | No | DiffusionInst: Diffusion Model for Instance Segm... | 2022-12-06 | Code |
| 83 | VirTex Mask R-CNN (ResNet-50-FPN) | 36.9 | No | VirTex: Learning Visual Representations from Tex... | 2020-06-11 | Code |
| 84 | MogaNet-T | 35.8 | No | MogaNet: Multi-order Gated Aggregation Network | 2022-11-07 | Code |
| 85 | BoxInst | 35 | No | BoxInst: High-Performance Instance Segmentation ... | 2020-12-03 | Code |
| 86 | A2MIM (ResNet-50 2x) | 34.9 | No | Architecture-Agnostic Masked Image Modeling -- F... | 2022-05-27 | Code |
| 87 | E2EC DLA-34 | 33.8 | No | E2EC: An End-to-End Contour-based Method for Hig... | 2022-03-08 | Code |
| 88 | Mask R-CNN (Bottleneck-injected ResNet-50, FPN) | 33.6 | No | torchdistill: A Modular, Configuration-Driven Fr... | 2020-11-25 | Code |
| 89 | BoxCaseg | 30.9 | Yes | Weakly-supervised Instance Segmentation via Clas... | 2021-04-04 | - |
| 90 | BBAM | 25.7 | No | BBAM: Bounding Box Attribution Map for Weakly Su... | 2021-03-16 | Code |
| 91 | BBTP | 21.1 | No | - | - | Code |