Papers With Code 2

A community resource for machine learning research: papers, code, benchmarks, and state-of-the-art results.

Data sourced from the PWC Archive (CC-BY-SA 4.0). Built by the community, for the community.

CBNet: A Composite Backbone Network Architecture for Object Detection

TingTing Liang, Xiaojie Chu, Yudong Liu, Yongtao Wang, Zhi Tang, Wei Chu, Jingdong Chen, Haibin Ling

2021-07-01 | Real-Time Object Detection, Instance Segmentation, Object Detection

Abstract

Modern top-performing object detectors depend heavily on backbone networks, whose advances bring consistent performance gains through exploring more effective network structures. In this paper, we propose a novel and flexible backbone framework, namely CBNetV2, to construct high-performance detectors using existing open-sourced pre-trained backbones under the pre-training fine-tuning paradigm. In particular, CBNetV2 architecture groups multiple identical backbones, which are connected through composite connections. Specifically, it integrates the high- and low-level features of multiple backbone networks and gradually expands the receptive field to more efficiently perform object detection. We also propose a better training strategy with assistant supervision for CBNet-based detectors. Without additional pre-training of the composite backbone, CBNetV2 can be adapted to various backbones (CNN-based vs. Transformer-based) and head designs of most mainstream detectors (one-stage vs. two-stage, anchor-based vs. anchor-free-based). Experiments provide strong evidence that, compared with simply increasing the depth and width of the network, CBNetV2 introduces a more efficient, effective, and resource-friendly way to build high-performance backbone networks. Particularly, our Dual-Swin-L achieves 59.4% box AP and 51.6% mask AP on COCO test-dev under the single-model and single-scale testing protocol, which is significantly better than the state-of-the-art result (57.7% box AP and 50.2% mask AP) achieved by Swin-L, while the training schedule is reduced by 6$\times$. With multi-scale testing, we push the current best single model result to a new record of 60.1% box AP and 52.3% mask AP without using extra training data. Code is available at https://github.com/VDIGPKU/CBNetV2.
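The data flow behind "multiple identical backbones connected through composite connections" can be illustrated with a toy sketch. This is a minimal illustration under stated assumptions, not the authors' implementation: the names `composite_forward` and `composite` are hypothetical, scalars stand in for feature maps, and the wiring shown (each later backbone adds the previous backbone's same-stage output to its own stage input) is one simple form of composite connection that may differ from the exact connection used in CBNetV2.

```python
from typing import Callable, List, Sequence

# A "stage" maps an input feature to an output feature.
# Scalars stand in for feature maps in this toy sketch.
Stage = Callable[[float], float]


def composite_forward(
    backbones: Sequence[Sequence[Stage]],
    x: float,
    composite: Callable[[float], float] = lambda f: f,
) -> List[float]:
    """Return the per-stage outputs of the last (lead) backbone.

    `composite` is a hypothetical stand-in for whatever transform
    aligns the assisting backbone's feature with the lead backbone's input.
    """
    prev_feats: List[float] = []  # stage outputs of the previous backbone
    for stages in backbones:
        feats: List[float] = []
        h = x  # every backbone receives the same input
        for i, stage in enumerate(stages):
            # Assumed composite connection: from the second stage on,
            # add the previous backbone's output at this stage index
            # to the input of the current stage.
            if prev_feats and i > 0:
                h = h + composite(prev_feats[i])
            h = stage(h)
            feats.append(h)
        prev_feats = feats
    return prev_feats


# Two toy stages shared by both backbones ("identical backbones").
stages = [lambda v: v + 1.0, lambda v: v * 2.0]
single = composite_forward([stages], 1.0)
dual = composite_forward([stages, stages], 1.0)
```

With a single backbone the function reduces to an ordinary sequential forward pass; with two, the second backbone's later stages see features from both networks, which is the sense in which the lead backbone's features are enriched by the assisting one.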

Results

| Task | Dataset | Metric | Value | Model |
| --- | --- | --- | --- | --- |
| Object Detection | COCO test-dev | box mAP | 60.1 | CBNetV2 (Dual-Swin-L HTC, multi-scale) |
| Object Detection | COCO test-dev | box mAP | 59.4 | CBNetV2 (Dual-Swin-L HTC, single-scale) |
| Object Detection | COCO-O | Average mAP | 39 | CBNetV2 (Swin-L) |
| Object Detection | COCO-O | Effective Robustness | 12.36 | CBNetV2 (Swin-L) |
| Object Detection | COCO minival | box AP | 59.6 | CBNetV2 (Dual-Swin-L HTC, multi-scale) |
| Object Detection | COCO minival | box AP | 59.1 | CBNetV2 (Dual-Swin-L HTC, multi-scale) |
| 3D | COCO test-dev | box mAP | 60.1 | CBNetV2 (Dual-Swin-L HTC, multi-scale) |
| 3D | COCO test-dev | box mAP | 59.4 | CBNetV2 (Dual-Swin-L HTC, single-scale) |
| 3D | COCO-O | Average mAP | 39 | CBNetV2 (Swin-L) |
| 3D | COCO-O | Effective Robustness | 12.36 | CBNetV2 (Swin-L) |
| 3D | COCO minival | box AP | 59.6 | CBNetV2 (Dual-Swin-L HTC, multi-scale) |
| 3D | COCO minival | box AP | 59.1 | CBNetV2 (Dual-Swin-L HTC, multi-scale) |
| Instance Segmentation | COCO minival | mask AP | 51.8 | CBNetV2 (Dual-Swin-L HTC, multi-scale) |
| Instance Segmentation | COCO minival | mask AP | 51 | CBNetV2 (Dual-Swin-L HTC, multi-scale) |
| Instance Segmentation | COCO test-dev | AP50 | 80.3 | CBNetV2 (EVA02, single-scale) |
| Instance Segmentation | COCO test-dev | AP75 | 62.1 | CBNetV2 (EVA02, single-scale) |
| Instance Segmentation | COCO test-dev | APL | 70.9 | CBNetV2 (EVA02, single-scale) |
| Instance Segmentation | COCO test-dev | APM | 59.3 | CBNetV2 (EVA02, single-scale) |
| Instance Segmentation | COCO test-dev | APS | 39.7 | CBNetV2 (EVA02, single-scale) |
| Instance Segmentation | COCO test-dev | mask AP | 56.1 | CBNetV2 (EVA02, single-scale) |
| Instance Segmentation | COCO test-dev | mask AP | 52.3 | CBNetV2 (Dual-Swin-L HTC, multi-scale) |
| Instance Segmentation | COCO test-dev | mask AP | 51.6 | CBNetV2 (Dual-Swin-L HTC, single-scale) |
| 2D Classification | COCO test-dev | box mAP | 60.1 | CBNetV2 (Dual-Swin-L HTC, multi-scale) |
| 2D Classification | COCO test-dev | box mAP | 59.4 | CBNetV2 (Dual-Swin-L HTC, single-scale) |
| 2D Classification | COCO-O | Average mAP | 39 | CBNetV2 (Swin-L) |
| 2D Classification | COCO-O | Effective Robustness | 12.36 | CBNetV2 (Swin-L) |
| 2D Classification | COCO minival | box AP | 59.6 | CBNetV2 (Dual-Swin-L HTC, multi-scale) |
| 2D Classification | COCO minival | box AP | 59.1 | CBNetV2 (Dual-Swin-L HTC, multi-scale) |
| 2D Object Detection | COCO test-dev | box mAP | 60.1 | CBNetV2 (Dual-Swin-L HTC, multi-scale) |
| 2D Object Detection | COCO test-dev | box mAP | 59.4 | CBNetV2 (Dual-Swin-L HTC, single-scale) |
| 2D Object Detection | COCO-O | Average mAP | 39 | CBNetV2 (Swin-L) |
| 2D Object Detection | COCO-O | Effective Robustness | 12.36 | CBNetV2 (Swin-L) |
| 2D Object Detection | COCO minival | box AP | 59.6 | CBNetV2 (Dual-Swin-L HTC, multi-scale) |
| 2D Object Detection | COCO minival | box AP | 59.1 | CBNetV2 (Dual-Swin-L HTC, multi-scale) |
| 16k | COCO test-dev | box mAP | 60.1 | CBNetV2 (Dual-Swin-L HTC, multi-scale) |
| 16k | COCO test-dev | box mAP | 59.4 | CBNetV2 (Dual-Swin-L HTC, single-scale) |
| 16k | COCO-O | Average mAP | 39 | CBNetV2 (Swin-L) |
| 16k | COCO-O | Effective Robustness | 12.36 | CBNetV2 (Swin-L) |
| 16k | COCO minival | box AP | 59.6 | CBNetV2 (Dual-Swin-L HTC, multi-scale) |
| 16k | COCO minival | box AP | 59.1 | CBNetV2 (Dual-Swin-L HTC, multi-scale) |

Related Papers

SCORE: Scene Context Matters in Open-Vocabulary Remote Sensing Instance Segmentation (2025-07-17)
A Real-Time System for Egocentric Hand-Object Interaction Detection in Industrial Domains (2025-07-17)
RS-TinyNet: Stage-wise Feature Fusion Network for Detecting Tiny Objects in Remote Sensing Images (2025-07-17)
Decoupled PROB: Decoupled Query Initialization Tasks and Objectness-Class Learning for Open World Object Detection (2025-07-17)
Dual LiDAR-Based Traffic Movement Count Estimation at a Signalized Intersection: Deployment, Data Collection, and Preliminary Analysis (2025-07-17)
Vision-based Perception for Autonomous Vehicles in Obstacle Avoidance Scenarios (2025-07-16)
Tomato Multi-Angle Multi-Pose Dataset for Fine-Grained Phenotyping (2025-07-15)
Beyond Appearance: Geometric Cues for Robust Video Instance Segmentation (2025-07-08)