Data sourced from the PWC Archive (CC-BY-SA 4.0). Built by the community, for the community.

Revisiting the Sibling Head in Object Detector

Guanglu Song, Yu Liu, Xiaogang Wang

Published: 2020-03-17 | Venue: CVPR 2020
Tasks: regression, Disentanglement, General Classification, Object Detection

Abstract

The "shared head for classification and localization" (sibling head), first introduced in Fast R-CNN (Girshick, 2015), has been the dominant design in the object detection community for the past five years. This paper observes that the spatial misalignment between the two objective functions in the sibling head can considerably hurt the training process, but that this misalignment can be resolved by a very simple operator called task-aware spatial disentanglement (TSD). TSD decouples classification and regression in the spatial dimension by generating two disentangled proposals, one for each task, both estimated from the shared proposal. This is inspired by the natural insight that, for one instance, the features in some salient area may carry rich information for classification, while those around the boundary may be better suited to bounding box regression. Surprisingly, this simple design consistently boosts all backbones and models on both MS COCO and Google OpenImage by about 3% mAP. Further, we propose a progressive constraint that enlarges the performance margin between the disentangled and the shared proposals and gains about 1% more mAP. We show that TSD exceeds the upper bound of today's single-model detectors by a large margin (mAP 49.4 with ResNet-101, 51.2 with SENet154), and it is the core model of our 1st place solution in the Google OpenImage Challenge 2019.
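The abstract describes the mechanism only in words, so the following is a minimal Python sketch of the idea rather than the authors' implementation. It assumes that the regression proposal is the shared proposal shifted as a whole, that the classification proposal is a k x k grid of sampling points that each move independently, and that the progressive constraint is a hinge on the score gap. The names `Box`, `regression_proposal`, `classification_points`, and `progressive_margin`, the scale factor `gamma`, and the margin value are all hypothetical placeholders, and the offsets that a network would predict are passed in by hand.

```python
from dataclasses import dataclass
from typing import List, Tuple


@dataclass(frozen=True)
class Box:
    """Axis-aligned proposal in (x1, y1, x2, y2) pixel coordinates."""
    x1: float
    y1: float
    x2: float
    y2: float

    @property
    def width(self) -> float:
        return self.x2 - self.x1

    @property
    def height(self) -> float:
        return self.y2 - self.y1


def regression_proposal(shared: Box, shift: Tuple[float, float],
                        gamma: float = 0.1) -> Box:
    """Translate the whole shared proposal by a size-scaled shift.

    `shift` stands in for a predicted (dx, dy); `gamma` is an assumed
    scale factor, not a value taken from the abstract.
    """
    dx = gamma * shift[0] * shared.width
    dy = gamma * shift[1] * shared.height
    return Box(shared.x1 + dx, shared.y1 + dy, shared.x2 + dx, shared.y2 + dy)


def classification_points(shared: Box, offsets: List[Tuple[float, float]],
                          k: int, gamma: float = 0.1) -> List[Tuple[float, float]]:
    """Return k*k sampling points: bin centres of the shared proposal,
    each moved by its own size-scaled offset (row-major order)."""
    if len(offsets) != k * k:
        raise ValueError("expected one offset per grid cell")
    bin_w = shared.width / k
    bin_h = shared.height / k
    points = []
    for row in range(k):
        for col in range(k):
            ox, oy = offsets[row * k + col]
            cx = shared.x1 + (col + 0.5) * bin_w + gamma * ox * shared.width
            cy = shared.y1 + (row + 0.5) * bin_h + gamma * oy * shared.height
            points.append((cx, cy))
    return points


def progressive_margin(shared_score: float, disentangled_score: float,
                       margin: float = 0.2) -> float:
    """Hinge penalty that is zero once the disentangled branch beats the
    shared branch by at least `margin` (an assumed placeholder value)."""
    return max(0.0, shared_score - disentangled_score + margin)


if __name__ == "__main__":
    shared = Box(0.0, 0.0, 100.0, 50.0)
    print(regression_proposal(shared, (1.0, -1.0)))
    print(classification_points(shared, [(0.0, 0.0)] * 4, k=2))
    print(progressive_margin(0.6, 0.7))
```

In this sketch both derived proposals are computed from the same shared box, which mirrors the abstract's statement that the two disentangled proposals are estimated from the shared proposal. A real detector would predict the offsets from pooled features and pool features again at the new locations; that step is omitted here.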

Results

Object Detection on COCO test-dev:

| Model | box mAP | AP50 | AP75 | APS | APM | APL |
| --- | --- | --- | --- | --- | --- | --- |
| TSD(SENet154-DCN,multi-scale) | 51.2 | 71.9 | 56 | 33.8 | 54.8 | 64.2 |
| TSD(ResNet-101-Deformable, Image Pyramid) | 49.4 | 69.6 | 54.4 | 32.7 | 52.5 | 61 |

The same rows are also listed under the task labels 3D, 2D Classification, 2D Object Detection, and 16k.

Related Papers

Language Integration in Fine-Tuning Multimodal Large Language Models for Image-Based Regression (2025-07-20)
CSD-VAR: Content-Style Decomposition in Visual Autoregressive Models (2025-07-18)
A Real-Time System for Egocentric Hand-Object Interaction Detection in Industrial Domains (2025-07-17)
RS-TinyNet: Stage-wise Feature Fusion Network for Detecting Tiny Objects in Remote Sensing Images (2025-07-17)
Decoupled PROB: Decoupled Query Initialization Tasks and Objectness-Class Learning for Open World Object Detection (2025-07-17)
Dual LiDAR-Based Traffic Movement Count Estimation at a Signalized Intersection: Deployment, Data Collection, and Preliminary Analysis (2025-07-17)
Neural Network-Guided Symbolic Regression for Interpretable Descriptor Discovery in Perovskite Catalysts (2025-07-16)
Imbalanced Regression Pipeline Recommendation (2025-07-16)