
Data sourced from the PWC Archive (CC-BY-SA 4.0). Built by the community, for the community.


Forest R-CNN: Large-Vocabulary Long-Tailed Object Detection and Instance Segmentation

Jialian Wu, Liangchen Song, Tiancai Wang, Qian Zhang, Junsong Yuan

2020-08-13
Tasks: Few-Shot Object Detection, Object Recognition, Long-tailed Object Detection, Semantic Segmentation, Instance Segmentation, General Classification, Classification, Object Detection

Abstract

Despite the previous success of object analysis, detecting and segmenting a large number of object categories with a long-tailed data distribution remains a challenging and less-investigated problem. For a large-vocabulary classifier, the chance of obtaining noisy logits is much higher, which can easily lead to wrong recognition. In this paper, we exploit prior knowledge of the relations among object categories to cluster fine-grained classes into coarser parent classes, and construct a classification tree that parses an object instance into a fine-grained category via its parent class. In the classification tree, because there are significantly fewer parent class nodes, their logits are less noisy and can be used to suppress the wrong or noisy logits in the fine-grained class nodes. As the way to construct the parent classes is not unique, we further build multiple trees to form a classification forest in which each tree contributes its vote to the fine-grained classification. To alleviate the imbalanced learning caused by the long-tail phenomenon, we propose a simple yet effective resampling method, NMS Resampling, to re-balance the data distribution. Our method, termed Forest R-CNN, can serve as a plug-and-play module applied to most object recognition models for recognizing more than 1000 categories. Extensive experiments are performed on the large-vocabulary dataset LVIS. Compared with the Mask R-CNN baseline, Forest R-CNN significantly boosts performance, with 11.5% and 3.9% AP improvements on the rare categories and overall categories, respectively. Moreover, we achieve state-of-the-art results on the LVIS dataset. Code is available at https://github.com/JialianW/Forest_RCNN.
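
To make the idea of parent logits suppressing noisy fine-grained logits more concrete, here is a minimal sketch in plain Python. It is not the authors' implementation: the combination rule (multiplying each fine-grained probability by the average probability of its parent across trees, then renormalizing) is an assumption chosen for illustration, and the class names, groupings, logits, and the helper names softmax and forest_scores are all hypothetical.

```python
import math


def softmax(logits):
    """Numerically stable softmax over a list of floats."""
    m = max(logits)
    exps = [math.exp(x - m) for x in logits]
    total = sum(exps)
    return [e / total for e in exps]


def forest_scores(fine_logits, trees):
    """Rescale fine-grained probabilities by parent-class votes.

    trees is a list of (parent_of, parent_logits) pairs, where
    parent_of[i] is the index of the parent node of fine class i in
    that tree. The averaging-and-multiplying rule is an assumption
    for illustration, not the rule from the paper.
    """
    fine_probs = softmax(fine_logits)
    n = len(fine_logits)
    votes = [0.0] * n
    for parent_of, parent_logits in trees:
        parent_probs = softmax(parent_logits)
        for i in range(n):
            # Each tree votes with the probability of class i's parent.
            votes[i] += parent_probs[parent_of[i]]
    scores = [fine_probs[i] * votes[i] / len(trees) for i in range(n)]
    total = sum(scores)
    return [s / total for s in scores]


# Hypothetical example: the "car" logit is noisy and slightly too high.
classes = ["cat", "dog", "car", "truck"]
fine_logits = [2.0, 0.5, 2.2, 0.1]

trees = [
    # Tree 1: parents are (animal, vehicle); the animal node is confident.
    ([0, 0, 1, 1], [3.0, 0.0]),
    # Tree 2: a different grouping, (feline, other).
    ([0, 1, 1, 1], [2.0, 0.0]),
]

raw_probs = softmax(fine_logits)
scores = forest_scores(fine_logits, trees)

raw_best = classes[raw_probs.index(max(raw_probs))]
forest_best = classes[scores.index(max(scores))]
print("fine-grained only:", raw_best)
print("with forest votes:", forest_best)
```

In this toy setup the fine-grained classifier alone picks "car", while the less noisy parent nodes of both trees favor the parents of "cat", so the combined score picks "cat". Because the two trees group the classes differently, a class has to be supported by its parent in each tree to keep a high score.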

Results

Task                       Dataset        Metric  Value  Model
Object Detection           LVIS v1.0 val  AP      23.2   Forest R-CNN
Object Detection           LVIS v1.0 val  APc     22.7   Forest R-CNN
Object Detection           LVIS v1.0 val  APf     27.7   Forest R-CNN
Object Detection           LVIS v1.0 val  APr     14.2   Forest R-CNN
3D                         LVIS v1.0 val  AP      23.2   Forest R-CNN
3D                         LVIS v1.0 val  APc     22.7   Forest R-CNN
3D                         LVIS v1.0 val  APf     27.7   Forest R-CNN
3D                         LVIS v1.0 val  APr     14.2   Forest R-CNN
Few-Shot Object Detection  LVIS v1.0 val  AP      23.2   Forest R-CNN
Few-Shot Object Detection  LVIS v1.0 val  APc     22.7   Forest R-CNN
Few-Shot Object Detection  LVIS v1.0 val  APf     27.7   Forest R-CNN
Few-Shot Object Detection  LVIS v1.0 val  APr     14.2   Forest R-CNN
2D Classification          LVIS v1.0 val  AP      23.2   Forest R-CNN
2D Classification          LVIS v1.0 val  APc     22.7   Forest R-CNN
2D Classification          LVIS v1.0 val  APf     27.7   Forest R-CNN
2D Classification          LVIS v1.0 val  APr     14.2   Forest R-CNN
2D Object Detection        LVIS v1.0 val  AP      23.2   Forest R-CNN
2D Object Detection        LVIS v1.0 val  APc     22.7   Forest R-CNN
2D Object Detection        LVIS v1.0 val  APf     27.7   Forest R-CNN
2D Object Detection        LVIS v1.0 val  APr     14.2   Forest R-CNN
16k                        LVIS v1.0 val  AP      23.2   Forest R-CNN
16k                        LVIS v1.0 val  APc     22.7   Forest R-CNN
16k                        LVIS v1.0 val  APf     27.7   Forest R-CNN
16k                        LVIS v1.0 val  APr     14.2   Forest R-CNN
