Papers With Code 2

A community resource for machine learning research: papers, code, benchmarks, and state-of-the-art results.

Data sourced from the PWC Archive (CC-BY-SA 4.0). Built by the community, for the community.

Mask R-CNN

Computer Vision · Introduced 2017 · 420 papers
Source Paper

Description

Mask R-CNN extends Faster R-CNN to solve instance segmentation tasks. It achieves this by adding a branch for predicting an object mask in parallel with the existing branch for bounding box recognition. In principle, Mask R-CNN is an intuitive extension of Faster R-CNN, but constructing the mask branch properly is critical for good results.

Most importantly, Faster R-CNN was not designed for pixel-to-pixel alignment between network inputs and outputs. This is evident in how RoIPool, the de facto core operation for attending to instances, performs coarse spatial quantization for feature extraction. To fix the misalignment, Mask R-CNN utilises a simple, quantization-free layer, called RoIAlign, that faithfully preserves exact spatial locations.
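
The difference between the two operations can be illustrated with a small pure-Python sketch. It is not the reference implementation: the function names, the 2x2 output grid, the four sampling points per bin, and the averaging step are assumptions chosen for illustration, and the RoI is assumed to lie inside the feature map. The RoIPool-style function rounds the RoI to integer coordinates before pooling, while the RoIAlign-style function keeps real-valued coordinates and reads the feature map by bilinear interpolation.

```python
import math

def bilinear_sample(feature, y, x):
    """Interpolate a 2D feature map (list of rows) at a real-valued (y, x)."""
    h, w = len(feature), len(feature[0])
    y = min(max(y, 0.0), h - 1.0)  # clamp so the four neighbours exist
    x = min(max(x, 0.0), w - 1.0)
    y0, x0 = int(math.floor(y)), int(math.floor(x))
    y1, x1 = min(y0 + 1, h - 1), min(x0 + 1, w - 1)
    dy, dx = y - y0, x - x0
    top = feature[y0][x0] * (1 - dx) + feature[y0][x1] * dx
    bottom = feature[y1][x0] * (1 - dx) + feature[y1][x1] * dx
    return top * (1 - dy) + bottom * dy

def roi_align(feature, roi, out_size=2, samples=2):
    """RoIAlign-style pooling: no rounding of the RoI or of the bin edges.

    roi = (y1, x1, y2, x2) in feature-map coordinates. Each output bin
    averages samples x samples bilinearly interpolated points (an assumption).
    """
    y1, x1, y2, x2 = roi
    bin_h = (y2 - y1) / out_size
    bin_w = (x2 - x1) / out_size
    out = []
    for i in range(out_size):
        row = []
        for j in range(out_size):
            total = 0.0
            for sy in range(samples):
                for sx in range(samples):
                    yy = y1 + (i + (sy + 0.5) / samples) * bin_h
                    xx = x1 + (j + (sx + 0.5) / samples) * bin_w
                    total += bilinear_sample(feature, yy, xx)
            row.append(total / (samples * samples))
        out.append(row)
    return out

def roi_pool(feature, roi, out_size=2):
    """RoIPool-style pooling: round the RoI, use integer bins, take the max."""
    h, w = len(feature), len(feature[0])
    y1, x1, y2, x2 = (int(round(v)) for v in roi)  # coarse quantization
    roi_h, roi_w = max(y2 - y1, 1), max(x2 - x1, 1)
    out = []
    for i in range(out_size):
        ys = y1 + (i * roi_h) // out_size
        ye = min(max(y1 + math.ceil((i + 1) * roi_h / out_size), ys + 1), h)
        row = []
        for j in range(out_size):
            xs = x1 + (j * roi_w) // out_size
            xe = min(max(x1 + math.ceil((j + 1) * roi_w / out_size), xs + 1), w)
            row.append(max(feature[y][x]
                           for y in range(ys, ye) for x in range(xs, xe)))
        out.append(row)
    return out

# Toy feature map whose value equals the column index, so any horizontal
# shift of the RoI should show up directly in the pooled values.
feature = [[float(x) for x in range(4)] for _ in range(4)]
roi_a = (0.0, 0.6, 2.0, 2.6)
roi_b = (0.0, 0.9, 2.0, 2.9)  # same box shifted right by 0.3

pooled_a, pooled_b = roi_pool(feature, roi_a), roi_pool(feature, roi_b)
aligned_a, aligned_b = roi_align(feature, roi_a), roi_align(feature, roi_b)
```

On this toy input the two RoIs round to the same integer box, so `roi_pool` returns identical outputs for both, whereas the `roi_align` outputs differ by about 0.3, matching the shift. That sub-pixel sensitivity is what a mask predictor needs in order to line up with the input pixels.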

Secondly, Mask R-CNN decouples mask and class prediction: it predicts a binary mask for each class independently, without competition among classes, and relies on the network's RoI classification branch to predict the category. In contrast, an FCN usually performs per-pixel multi-class categorization, which couples segmentation and classification.
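
A minimal pure-Python sketch of this contrast at inference time follows. The function names, the tiny 1x3 logit maps, and the 0.5 threshold are assumptions made for illustration, not details taken from the description above. The decoupled version applies a per-pixel sigmoid to the single mask selected by the classification branch; the coupled version applies a per-pixel softmax across classes, so each pixel receives exactly one label.

```python
import math

def sigmoid(z):
    return 1.0 / (1.0 + math.exp(-z))

def softmax(zs):
    m = max(zs)  # subtract the max for numerical stability
    exps = [math.exp(z - m) for z in zs]
    total = sum(exps)
    return [e / total for e in exps]

def decoupled_mask(mask_logits, class_id, threshold=0.5):
    """Mask R-CNN-style readout (sketch).

    mask_logits[k][y][x] holds one logit map per class. The class comes
    from the separate RoI classification branch (passed in as class_id);
    only that class's map is read, each pixel through its own sigmoid.
    """
    return [[1 if sigmoid(v) > threshold else 0 for v in row]
            for row in mask_logits[class_id]]

def coupled_labels(logits):
    """FCN-style readout (sketch): softmax over classes at every pixel."""
    num_classes = len(logits)
    height, width = len(logits[0]), len(logits[0][0])
    labels = []
    for y in range(height):
        row = []
        for x in range(width):
            probs = softmax([logits[k][y][x] for k in range(num_classes)])
            row.append(max(range(num_classes), key=lambda k: probs[k]))
        labels.append(row)
    return labels

# Two classes, one row of three pixels (hypothetical values).
logits = [
    [[2.0, -1.0, 0.5]],   # class 0 logit map
    [[3.0, -2.0, -0.5]],  # class 1 logit map
]

mask_if_class0 = decoupled_mask(logits, class_id=0)
mask_if_class1 = decoupled_mask(logits, class_id=1)
per_pixel_label = coupled_labels(logits)
```

In the decoupled readout the first pixel is foreground in both class masks, because the two maps never compete; which mask is reported depends only on the class chosen by the classification branch. In the coupled readout the same pixel must be assigned to a single class.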

Papers Using This Method

A novel visual data-based diagnostic approach for estimation of regime transition in pool boiling (2025-06-12)
Bringing SAM to new heights: Leveraging elevation data for tree crown segmentation from drone imagery (2025-06-05)
Detailed Evaluation of Modern Machine Learning Approaches for Optic Plastics Sorting (2025-05-22)
SurgPose: Generalisable Surgical Instrument Pose Estimation using Zero-Shot Learning and Stereo Vision (2025-05-16)
A Robust Deep Networks based Multi-Object MultiCamera Tracking System for City Scale Traffic (2025-05-01)
Transcending Dimensions using Generative AI: Real-Time 3D Model Generation in Augmented Reality (2025-04-27)
Real-time Seafloor Segmentation and Mapping (2025-04-14)
RipVIS: Rip Currents Video Instance Segmentation Benchmark for Beach Monitoring and Safety (2025-04-01)
AI-Assisted Colonoscopy: Polyp Detection and Segmentation using Foundation Models (2025-03-31)
Assessing SAM for Tree Crown Instance Segmentation from Drone Imagery (2025-03-26)
OverLoCK: An Overview-first-Look-Closely-next ConvNet with Context-Mixing Dynamic Kernels (2025-02-27)
SASVi - Segment Any Surgical Video (2025-02-12)
Adaptive Object Detection for Indoor Navigation Assistance: A Performance Evaluation of Real-Time Algorithms (2025-01-30)
Transfer Learning for Keypoint Detection in Low-Resolution Thermal TUG Test Images (2025-01-30)
Effective Defect Detection Using Instance Segmentation for NDI (2025-01-24)
Data-driven Detection and Evaluation of Damages in Concrete Structures: Using Deep Learning and Computer Vision (2025-01-21)
Rapid Automated Mapping of Clouds on Titan With Instance Segmentation (2025-01-08)
AI-Powered Cow Detection in Complex Farm Environments (2025-01-03)
Leveraging Deep Learning with Multi-Head Attention for Accurate Extraction of Medicine from Handwritten Prescriptions (2024-12-24)
Exploring Machine Learning Engineering for Object Detection and Tracking by Unmanned Aerial Vehicle (UAV) (2024-12-19)