Papers With Code 2

A community resource for machine learning research: papers, code, benchmarks, and state-of-the-art results.


Data sourced from the PWC Archive (CC-BY-SA 4.0). Built by the community, for the community.


Light-Head R-CNN: In Defense of Two-Stage Object Detector

Zeming Li, Chao Peng, Gang Yu, Xiangyu Zhang, Yangdong Deng, Jian Sun

2017-11-20

Abstract

In this paper, we first investigate why typical two-stage methods are not as fast as single-stage, fast detectors like YOLO and SSD. We find that Faster R-CNN and R-FCN perform an intensive computation after or before RoI warping. Faster R-CNN involves two fully connected layers for RoI recognition, while R-FCN produces large score maps. Thus, the speed of these networks is slow due to the heavy-head design in the architecture. Even if we significantly reduce the base model, the computation cost cannot be largely decreased accordingly. We propose a new two-stage detector, Light-Head R-CNN, to address the shortcoming in current two-stage approaches. In our design, we make the head of the network as light as possible, by using a thin feature map and a cheap R-CNN subnet (pooling and a single fully-connected layer). Our ResNet-101 based Light-Head R-CNN outperforms state-of-the-art object detectors on COCO while keeping time efficiency. More importantly, by simply replacing the backbone with a tiny network (e.g., Xception), our Light-Head R-CNN gets 30.7 mmAP at 102 FPS on COCO, significantly outperforming the single-stage, fast detectors like YOLO and SSD on both speed and accuracy. Code will be made publicly available.
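The "light head" the abstract describes replaces the heavy per-RoI computation (two large FC layers, or very wide score maps) with a thin feature map, a pooling step, and one cheap fully-connected layer. The following NumPy sketch illustrates that shape of computation only; all layer sizes, the average-pooling stand-in for the paper's PSRoI pooling, and the weight initialization are illustrative assumptions, not the authors' implementation.

```python
import numpy as np

# Hedged sketch of the light-head idea: a thin feature map, RoI pooling,
# then a SINGLE fully-connected layer before classification.
# Sizes are assumptions for illustration (the paper uses a 490-channel
# thin map with position-sensitive RoI pooling; we use plain average
# pooling here for simplicity).

rng = np.random.default_rng(0)

C_thin, H, W = 490, 38, 50                  # thin map vs. 1024+ channels in heavy heads
feature_map = rng.standard_normal((C_thin, H, W))

def roi_average_pool(fmap, roi, out_size=7):
    """Pool an RoI (x1, y1, x2, y2 in feature-map coords) to out_size x out_size."""
    x1, y1, x2, y2 = roi
    crop = fmap[:, y1:y2, x1:x2]
    c, h, w = crop.shape
    pooled = np.zeros((c, out_size, out_size))
    for i in range(out_size):
        for j in range(out_size):
            ys = slice(i * h // out_size, max((i + 1) * h // out_size, i * h // out_size + 1))
            xs = slice(j * w // out_size, max((j + 1) * w // out_size, j * w // out_size + 1))
            pooled[:, i, j] = crop[:, ys, xs].mean(axis=(1, 2))
    return pooled

# One cheap FC layer instead of two heavy ones (dimensions are assumptions).
fc_dim, num_classes = 2048, 81              # 80 COCO classes + background
W_fc = rng.standard_normal((C_thin * 7 * 7, fc_dim)) * 0.01
W_cls = rng.standard_normal((fc_dim, num_classes)) * 0.01

roi = (10, 5, 30, 25)                       # a hypothetical region proposal
pooled = roi_average_pool(feature_map, roi)           # (490, 7, 7)
hidden = np.maximum(pooled.reshape(-1) @ W_fc, 0.0)   # single FC + ReLU
scores = hidden @ W_cls                               # per-class scores, shape (81,)

print(pooled.shape, hidden.shape, scores.shape)
```

The per-RoI cost here is dominated by one matrix-vector product of size (490·7·7) × 2048, which is far cheaper than running two wide FC layers per proposal, matching the abstract's argument for why the head, not just the backbone, governs two-stage detector speed.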

Related Papers

- Black-box Source-free Domain Adaptation via Two-stage Knowledge Distillation (2023-05-13)
- Quantile-Based Deep Reinforcement Learning using Two-Timescale Policy Gradient Algorithms (2023-05-12)
- Provable Identifiability of Two-Layer ReLU Neural Networks via LASSO Regularization (2023-05-07)
- Two-Dimensional Channel Parameter Estimation for IRS-Assisted Networks (2023-05-07)
- Two to Five Truths in Non-Negative Matrix Factorization (2023-05-06)
- Evaluating Variants of wav2vec 2.0 on Affective Vocal Burst Tasks (2023-05-05)
- VideoOFA: Two-Stage Pre-Training for Video-to-Text Generation (2023-05-04)
- Debiased Inference for Dynamic Nonlinear Panels with Multi-dimensional Heterogeneities (2023-05-04)