
Data sourced from the PWC Archive (CC-BY-SA 4.0). Built by the community, for the community.


Mask R-CNN

Kaiming He, Georgia Gkioxari, Piotr Dollár, Ross Girshick

2017-03-20 | ICCV 2017 10

Tasks: Multi-tissue Nucleus Segmentation, Panoptic Segmentation, 3D Instance Segmentation, Human Part Segmentation, Segmentation, Real-Time Object Detection, Semantic Segmentation, Object Localization, Pose Estimation, Multi-Person Pose Estimation, Keypoint Detection, Instance Segmentation, Multi-Human Parsing, Nuclear Segmentation, Object Detection, Object Segmentation, Keypoint Estimation

Abstract

We present a conceptually simple, flexible, and general framework for object instance segmentation. Our approach efficiently detects objects in an image while simultaneously generating a high-quality segmentation mask for each instance. The method, called Mask R-CNN, extends Faster R-CNN by adding a branch for predicting an object mask in parallel with the existing branch for bounding box recognition. Mask R-CNN is simple to train and adds only a small overhead to Faster R-CNN, running at 5 fps. Moreover, Mask R-CNN is easy to generalize to other tasks, e.g., allowing us to estimate human poses in the same framework. We show top results in all three tracks of the COCO suite of challenges, including instance segmentation, bounding-box object detection, and person keypoint detection. Without bells and whistles, Mask R-CNN outperforms all existing, single-model entries on every task, including the COCO 2016 challenge winners. We hope our simple and effective approach will serve as a solid baseline and help ease future research in instance-level recognition. Code has been made available at: https://github.com/facebookresearch/Detectron
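The abstract describes a mask branch that runs in parallel with the box-recognition branch. One way to picture this is as a per-RoI multi-task loss that sums a classification term, a box-regression term, and a mask term. The sketch below is a toy, pure-Python illustration of that idea and not the Detectron implementation: the function names (`softmax_cross_entropy`, `smooth_l1`, `mask_bce`, `roi_loss`) are hypothetical, the tiny mask resolution is arbitrary, and box regression is treated as class-agnostic for brevity. It assumes the mask term is a per-pixel sigmoid binary cross-entropy evaluated only on the mask predicted for the ground-truth class.

```python
import math


def softmax_cross_entropy(logits, target):
    """Classification term: -log(softmax(logits)[target])."""
    m = max(logits)  # subtract the max for numerical stability
    log_sum = m + math.log(sum(math.exp(z - m) for z in logits))
    return log_sum - logits[target]


def smooth_l1(pred, target):
    """Box term: smooth L1 summed over the box offsets."""
    total = 0.0
    for p, t in zip(pred, target):
        d = abs(p - t)
        total += 0.5 * d * d if d < 1.0 else d - 0.5
    return total


def mask_bce(mask_logits, gt_mask):
    """Mask term: per-pixel sigmoid binary cross-entropy, averaged over pixels."""
    total, n = 0.0, 0
    for row_z, row_y in zip(mask_logits, gt_mask):
        for z, y in zip(row_z, row_y):
            # Stable BCE-with-logits: max(z, 0) - z*y + log(1 + exp(-|z|))
            total += max(z, 0.0) - z * y + math.log1p(math.exp(-abs(z)))
            n += 1
    return total / n


def roi_loss(cls_logits, box_pred, masks_per_class, gt_class, gt_box, gt_mask):
    """Sum of the three per-RoI terms (assumed equal weighting)."""
    l_cls = softmax_cross_entropy(cls_logits, gt_class)
    l_box = smooth_l1(box_pred, gt_box)
    # Only the mask for the ground-truth class contributes, so the mask
    # branch does not have to compete across classes.
    l_mask = mask_bce(masks_per_class[gt_class], gt_mask)
    return l_cls + l_box + l_mask


if __name__ == "__main__":
    loss = roi_loss(
        cls_logits=[0.0, 0.0],
        box_pred=[0.1, 0.2, 0.3, 0.4],
        masks_per_class=[[[0.0, 0.0], [0.0, 0.0]], [[5.0, 5.0], [5.0, 5.0]]],
        gt_class=0,
        gt_box=[0.1, 0.2, 0.3, 0.4],
        gt_mask=[[1, 0], [0, 1]],
    )
    print(loss)
```

Because the mask term only reads `masks_per_class[gt_class]`, the large logits in the second class's mask have no effect on the loss in this example.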

Results

| Task | Dataset | Metric | Value | Model |
|---|---|---|---|---|
| Medical Image Segmentation | Cell17 | Dice | 0.707 | Mask R-CNN |
| Medical Image Segmentation | Cell17 | F1-score | 0.8004 | Mask R-CNN |
| Medical Image Segmentation | Cell17 | Hausdorff | 12.6723 | Mask R-CNN |
| Semantic Segmentation | Cityscapes val | PQth | 54 | Mask R-CNN+COCO |
| Object Localization | GRIT | Localization (ablation) | 44.7 | Mask R-CNN |
| Object Localization | GRIT | Localization (test) | 45.1 | Mask R-CNN |
| Pose Estimation | COCO test-dev | AP | 63.1 | Mask-RCNN |
| Pose Estimation | COCO test-dev | AP50 | 87.3 | Mask-RCNN |
| Pose Estimation | COCO test-dev | AP75 | 68.7 | Mask-RCNN |
| Pose Estimation | COCO test-dev | APL | 71.4 | Mask-RCNN |
| Pose Estimation | COCO | Test AP | 63.1 | Mask R-CNN |
| Pose Estimation | COCO | Validation AP | 69.2 | Mask R-CNN |
| Pose Estimation | COCO test-dev | AP50 | 87.3 | Mask R-CNN |
| Pose Estimation | COCO test-dev | AP75 | 68.7 | Mask R-CNN |
| Pose Estimation | COCO test-dev | APL | 71.4 | Mask R-CNN |
| Pose Estimation | COCO test-dev | APM | 57.8 | Mask R-CNN |
| Pose Estimation | COCO test-challenge | AP | 68.9 | Mask R-CNN* |
| Pose Estimation | COCO test-challenge | AP50 | 89.2 | Mask R-CNN* |
| Pose Estimation | COCO test-challenge | AP75 | 75.2 | Mask R-CNN* |
| Pose Estimation | COCO test-challenge | APL | 82.6 | Mask R-CNN* |
| Pose Estimation | COCO test-challenge | AR | 75.4 | Mask R-CNN* |
| Pose Estimation | COCO test-challenge | AR50 | 93.2 | Mask R-CNN* |
| Pose Estimation | COCO test-challenge | AR75 | 81.2 | Mask R-CNN* |
| Pose Estimation | COCO test-challenge | ARL | 76.8 | Mask R-CNN* |
| Pose Estimation | COCO test-challenge | ARM | 70.2 | Mask R-CNN* |
| Pose Estimation | CrowdPose | AP Easy | 69.4 | Mask R-CNN |
| Pose Estimation | CrowdPose | AP Hard | 45.8 | Mask R-CNN |
| Pose Estimation | CrowdPose | AP Medium | 57.9 | Mask R-CNN |
| Pose Estimation | CrowdPose | mAP @0.5:0.95 | 57.2 | Mask R-CNN |
| Pose Estimation | OCHuman | AP50 | 33.2 | Mask R-CNN |
| Pose Estimation | OCHuman | AP75 | 24.5 | Mask R-CNN |
| Pose Estimation | OCHuman | Validation AP | 20.2 | Mask R-CNN |
| Object Detection | COCO test-dev | AP50 | 62.3 | Mask R-CNN (ResNeXt-101-FPN) |
| Object Detection | COCO test-dev | AP75 | 43.4 | Mask R-CNN (ResNeXt-101-FPN) |
| Object Detection | COCO test-dev | APL | 51.2 | Mask R-CNN (ResNeXt-101-FPN) |
| Object Detection | COCO test-dev | APM | 43.2 | Mask R-CNN (ResNeXt-101-FPN) |
| Object Detection | COCO test-dev | APS | 22.1 | Mask R-CNN (ResNeXt-101-FPN) |
| Object Detection | COCO test-dev | box mAP | 39.8 | Mask R-CNN (ResNeXt-101-FPN) |
| Object Detection | COCO test-dev | AP50 | 60.3 | Mask R-CNN (ResNet-101-FPN) |
| Object Detection | COCO test-dev | AP75 | 41.7 | Mask R-CNN (ResNet-101-FPN) |
| Object Detection | COCO test-dev | APL | 50.2 | Mask R-CNN (ResNet-101-FPN) |
| Object Detection | COCO test-dev | APM | 41.1 | Mask R-CNN (ResNet-101-FPN) |
| Object Detection | COCO test-dev | APS | 20.1 | Mask R-CNN (ResNet-101-FPN) |
| Object Detection | COCO test-dev | box mAP | 38.2 | Mask R-CNN (ResNet-101-FPN) |
| Object Detection | COCO-O | Average mAP | 17.1 | Mask R-CNN (ResNet-50) |
| Object Detection | COCO-O | Effective Robustness | -0.11 | Mask R-CNN (ResNet-50) |
| Object Detection | iSAID | Average Precision | 37.18 | Mask-RCNN+ |
| Object Detection | iSAID | Average Precision | 36.5 | Mask-RCNN |
| Object Detection | COCO minival | box AP | 40 | Mask R-CNN (ResNet-101-FPN) |
| Object Detection | COCO minival | box AP | 37.7 | Mask R-CNN (ResNet-50-FPN) |
| Object Detection | COCO minival | AP50 | 59.5 | Mask R-CNN (ResNeXt-101-FPN) |
| Object Detection | COCO minival | AP75 | 38.9 | Mask R-CNN (ResNeXt-101-FPN) |
| Object Detection | COCO minival | box AP | 36.7 | Mask R-CNN (ResNeXt-101-FPN) |
| Object Detection | COCO | box AP | 45.2 | Mask R-CNN X-152-32x8d |
| Instance Segmentation | iSAID | Average Precision | 37.18 | Mask-RCNN+ |
| Instance Segmentation | iSAID | Average Precision | 36.5 | Mask-RCNN |
| Instance Segmentation | BDD100K val | AP | 20.5 | Mask R-CNN |
| Instance Segmentation | COCO test-dev | AP50 | 60 | Mask R-CNN (ResNeXt-101-FPN) |
| Instance Segmentation | COCO test-dev | AP75 | 39.4 | Mask R-CNN (ResNeXt-101-FPN) |
| Instance Segmentation | COCO test-dev | APL | 53.5 | Mask R-CNN (ResNeXt-101-FPN) |
| Instance Segmentation | COCO test-dev | APM | 39.9 | Mask R-CNN (ResNeXt-101-FPN) |
| Instance Segmentation | COCO test-dev | APS | 16.9 | Mask R-CNN (ResNeXt-101-FPN) |
| Instance Segmentation | COCO test-dev | mask AP | 37.1 | Mask R-CNN (ResNeXt-101-FPN) |
| Human Parsing | MHP v2.0 | AP 0.5 | 14.9 | Mask R-CNN |
| Multi-tissue Nucleus Segmentation | Kumar | Dice | 0.76 | Mask R-CNN (e) |
| Multi-tissue Nucleus Segmentation | Kumar | Hausdorff Distance (mm) | 50.9 | Mask R-CNN (e) |
| Object Segmentation | GRIT | Segmentation (ablation) | 26.2 | Mask R-CNN |
| Object Segmentation | GRIT | Segmentation (test) | 26.2 | Mask R-CNN |
| Multi-Person Pose Estimation | CrowdPose | AP Easy | 69.4 | Mask R-CNN |
| Multi-Person Pose Estimation | CrowdPose | AP Hard | 45.8 | Mask R-CNN |
| Multi-Person Pose Estimation | CrowdPose | AP Medium | 57.9 | Mask R-CNN |
| Multi-Person Pose Estimation | CrowdPose | mAP @0.5:0.95 | 57.2 | Mask R-CNN |
| Multi-Person Pose Estimation | OCHuman | AP50 | 33.2 | Mask R-CNN |
| Multi-Person Pose Estimation | OCHuman | AP75 | 24.5 | Mask R-CNN |
| Multi-Person Pose Estimation | OCHuman | Validation AP | 20.2 | Mask R-CNN |
| Panoptic Segmentation | Cityscapes val | PQth | 54 | Mask R-CNN+COCO |
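Several metrics in the results above are overlap-based. Under the standard COCO convention (assumed here to apply to the box and mask rows), AP50 and AP75 count a prediction as correct when its IoU with the ground truth is at least 0.50 or 0.75, and the headline AP averages over ten IoU thresholds from 0.50 to 0.95. Dice is a related overlap score used in the medical segmentation rows. The following is a minimal sketch of both scores on binary masks; the helper names `mask_iou` and `dice` and the toy masks are illustrative only and are not taken from any benchmark toolkit.

```python
def _counts(a, b):
    """Return (intersection, pixels in a, pixels in b) for two binary masks."""
    inter = size_a = size_b = 0
    for row_a, row_b in zip(a, b):
        for x, y in zip(row_a, row_b):
            size_a += 1 if x else 0
            size_b += 1 if y else 0
            inter += 1 if (x and y) else 0
    return inter, size_a, size_b


def mask_iou(a, b):
    """Intersection over union of two same-shaped binary masks."""
    inter, size_a, size_b = _counts(a, b)
    union = size_a + size_b - inter
    return inter / union if union else 0.0


def dice(a, b):
    """Dice coefficient: 2 * |A and B| / (|A| + |B|)."""
    inter, size_a, size_b = _counts(a, b)
    total = size_a + size_b
    return 2 * inter / total if total else 0.0


# COCO-style IoU thresholds: 0.50, 0.55, ..., 0.95
COCO_THRESHOLDS = [0.50 + 0.05 * i for i in range(10)]

if __name__ == "__main__":
    pred = [[1, 1, 0],
            [1, 1, 0]]
    truth = [[0, 1, 1],
             [0, 1, 1]]
    iou = mask_iou(pred, truth)
    print("IoU:", iou)
    print("Dice:", dice(pred, truth))
    # A prediction is a true positive at threshold t only if IoU >= t.
    print("matches at:", [round(t, 2) for t in COCO_THRESHOLDS if iou >= t])
```

For a single pair of masks the two scores are linked by Dice = 2 * IoU / (1 + IoU), so Dice is never lower than IoU. Note that COCO keypoint (pose) AP uses object keypoint similarity rather than IoU for its thresholds.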

Related Papers

SeC: Advancing Complex Video Object Segmentation via Progressive Concept Construction (2025-07-21)
Deep Learning-Based Fetal Lung Segmentation from Diffusion-weighted MRI Images and Lung Maturity Evaluation for Fetal Growth Restriction (2025-07-17)
DiffOSeg: Omni Medical Image Segmentation via Multi-Expert Collaboration Diffusion Model (2025-07-17)
From Variability To Accuracy: Conditional Bernoulli Diffusion Models with Consensus-Driven Correction for Thin Structure Segmentation (2025-07-17)
Unleashing Vision Foundation Models for Coronary Artery Segmentation: Parallel ViT-CNN Encoding and Variational Fusion (2025-07-17)
SCORE: Scene Context Matters in Open-Vocabulary Remote Sensing Instance Segmentation (2025-07-17)
Unified Medical Image Segmentation with State Space Modeling Snake (2025-07-17)
A Privacy-Preserving Semantic-Segmentation Method Using Domain-Adaptation Technique (2025-07-17)