Focal Loss for Dense Object Detection

Tsung-Yi Lin, Priya Goyal, Ross Girshick, Kaiming He, Piotr Dollár

Published: 2017-08-07, ICCV 2017
Tasks: Dense Object Detection, Long-tail Learning, Region Proposal, Real-Time Object Detection, Pedestrian Detection, 2D Object Detection, Knowledge Distillation, Object Detection

Abstract

The highest accuracy object detectors to date are based on a two-stage approach popularized by R-CNN, where a classifier is applied to a sparse set of candidate object locations. In contrast, one-stage detectors that are applied over a regular, dense sampling of possible object locations have the potential to be faster and simpler, but have trailed the accuracy of two-stage detectors thus far. In this paper, we investigate why this is the case. We discover that the extreme foreground-background class imbalance encountered during training of dense detectors is the central cause. We propose to address this class imbalance by reshaping the standard cross entropy loss such that it down-weights the loss assigned to well-classified examples. Our novel Focal Loss focuses training on a sparse set of hard examples and prevents the vast number of easy negatives from overwhelming the detector during training. To evaluate the effectiveness of our loss, we design and train a simple dense detector we call RetinaNet. Our results show that when trained with the focal loss, RetinaNet is able to match the speed of previous one-stage detectors while surpassing the accuracy of all existing state-of-the-art two-stage detectors. Code is at: https://github.com/facebookresearch/Detectron.
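
The down-weighting described in the abstract can be illustrated with a minimal sketch. It assumes the commonly cited binary form of the focal loss, FL(p_t) = -alpha_t * (1 - p_t)^gamma * log(p_t), with gamma = 2 and alpha = 0.25 as illustrative defaults; the function names and the scalar (non-batched) interface are hypothetical choices made for this example and are not taken from the Detectron code.

```python
import math


def binary_cross_entropy(p, y):
    """Standard cross entropy for one binary prediction.

    p: predicted probability of the positive class, strictly between 0 and 1
    y: ground-truth label, 1 (foreground) or 0 (background)
    """
    p_t = p if y == 1 else 1.0 - p  # probability assigned to the true class
    return -math.log(p_t)


def binary_focal_loss(p, y, gamma=2.0, alpha=0.25):
    """Focal loss for one binary prediction (illustrative sketch).

    gamma and alpha defaults are assumptions for this example.
    The factor (1 - p_t) ** gamma shrinks the loss of well-classified
    examples (p_t close to 1) while leaving hard examples nearly untouched.
    """
    p_t = p if y == 1 else 1.0 - p
    alpha_t = alpha if y == 1 else 1.0 - alpha  # class-balancing weight
    return -alpha_t * (1.0 - p_t) ** gamma * math.log(p_t)


if __name__ == "__main__":
    # Compare an easy positive (p = 0.9) with a hard positive (p = 0.1).
    for p in (0.9, 0.1):
        ce = binary_cross_entropy(p, 1)
        fl = binary_focal_loss(p, 1)
        print(f"p={p}: CE={ce:.4f}  FL={fl:.6f}  FL/CE={fl / ce:.4f}")
```

With gamma = 0 the modulating factor equals 1, so the sketch reduces to alpha-weighted cross entropy. As gamma grows, easy examples contribute progressively less to the total loss, which is the mechanism the abstract credits for keeping the large number of easy negatives from dominating training.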

Results

| Task | Dataset | Metric | Value | Model |
|---|---|---|---|---|
| Facial Recognition and Modelling | Trillion Pairs Dataset | Accuracy | 37.14 | F-Softmax |
| Facial Recognition and Modelling | Trillion Pairs Dataset | Accuracy | 39.8 | F-Softmax |
| Autonomous Vehicles | TJU-Ped-traffic | ALL (miss rate) | 41.4 | RetinaNet |
| Autonomous Vehicles | TJU-Ped-traffic | HO (miss rate) | 61.6 | RetinaNet |
| Autonomous Vehicles | TJU-Ped-traffic | R (miss rate) | 23.89 | RetinaNet |
| Autonomous Vehicles | TJU-Ped-traffic | R+HO (miss rate) | 28.45 | RetinaNet |
| Autonomous Vehicles | TJU-Ped-traffic | RS (miss rate) | 37.92 | RetinaNet |
| Autonomous Vehicles | TJU-Ped-campus | ALL (miss rate) | 44.34 | RetinaNet |
| Autonomous Vehicles | TJU-Ped-campus | HO (miss rate) | 71.31 | RetinaNet |
| Autonomous Vehicles | TJU-Ped-campus | R (miss rate) | 34.73 | RetinaNet |
| Autonomous Vehicles | TJU-Ped-campus | R+HO (miss rate) | 42.26 | RetinaNet |
| Autonomous Vehicles | TJU-Ped-campus | RS (miss rate) | 82.99 | RetinaNet |
| Object Counting | CARPK | MAE | 24.58 | RetinaNet (2018) |
| Object Detection | COCO test-dev | AP50 | 61.1 | RetinaNet (ResNeXt-101-FPN) |
| Object Detection | COCO test-dev | AP75 | 44.1 | RetinaNet (ResNeXt-101-FPN) |
| Object Detection | COCO test-dev | APL | 51.2 | RetinaNet (ResNeXt-101-FPN) |
| Object Detection | COCO test-dev | APM | 44.2 | RetinaNet (ResNeXt-101-FPN) |
| Object Detection | COCO test-dev | APS | 24.1 | RetinaNet (ResNeXt-101-FPN) |
| Object Detection | COCO test-dev | box mAP | 40.8 | RetinaNet (ResNeXt-101-FPN) |
| Object Detection | COCO test-dev | AP50 | 59.1 | RetinaNet (ResNet-101-FPN) |
| Object Detection | COCO test-dev | AP75 | 42.3 | RetinaNet (ResNet-101-FPN) |
| Object Detection | COCO test-dev | APL | 50.2 | RetinaNet (ResNet-101-FPN) |
| Object Detection | COCO test-dev | APM | 42.7 | RetinaNet (ResNet-101-FPN) |
| Object Detection | COCO test-dev | APS | 21.8 | RetinaNet (ResNet-101-FPN) |
| Object Detection | COCO test-dev | box mAP | 39.1 | RetinaNet (ResNet-101-FPN) |
| Object Detection | COCO-O | Average mAP | 16.6 | RetinaNet (ResNet-50) |
| Object Detection | COCO-O | Effective Robustness | 0.18 | RetinaNet (ResNet-50) |
| Object Detection | SKU-110K | AP | 45.5 | RetinaNet |
| Object Detection | SKU-110K | AP75 | 0.389 | RetinaNet |
| Image Classification | COCO-MLT | Average mAP | 49.46 | Focal Loss (ResNet-50) |
| Image Classification | VOC-MLT | Average mAP | 73.88 | Focal Loss (ResNet-50) |
| Image Classification | EGTEA | Average Precision | 59.09 | Focal loss (3D-ResNeXt101) |
| Image Classification | EGTEA | Average Recall | 59.17 | Focal loss (3D-ResNeXt101) |
| Face Verification | Trillion Pairs Dataset | Accuracy | 37.14 | F-Softmax |
| Face Reconstruction | Trillion Pairs Dataset | Accuracy | 37.14 | F-Softmax |
| Face Reconstruction | Trillion Pairs Dataset | Accuracy | 39.8 | F-Softmax |
| 3D | COCO test-dev | AP50 | 61.1 | RetinaNet (ResNeXt-101-FPN) |
| 3D | COCO test-dev | AP75 | 44.1 | RetinaNet (ResNeXt-101-FPN) |
| 3D | COCO test-dev | APL | 51.2 | RetinaNet (ResNeXt-101-FPN) |
| 3D | COCO test-dev | APM | 44.2 | RetinaNet (ResNeXt-101-FPN) |
| 3D | COCO test-dev | APS | 24.1 | RetinaNet (ResNeXt-101-FPN) |
| 3D | COCO test-dev | box mAP | 40.8 | RetinaNet (ResNeXt-101-FPN) |
| 3D | COCO test-dev | AP50 | 59.1 | RetinaNet (ResNet-101-FPN) |
| 3D | COCO test-dev | AP75 | 42.3 | RetinaNet (ResNet-101-FPN) |
| 3D | COCO test-dev | APL | 50.2 | RetinaNet (ResNet-101-FPN) |
| 3D | COCO test-dev | APM | 42.7 | RetinaNet (ResNet-101-FPN) |
| 3D | COCO test-dev | APS | 21.8 | RetinaNet (ResNet-101-FPN) |
| 3D | COCO test-dev | box mAP | 39.1 | RetinaNet (ResNet-101-FPN) |
| 3D | COCO-O | Average mAP | 16.6 | RetinaNet (ResNet-50) |
| 3D | COCO-O | Effective Robustness | 0.18 | RetinaNet (ResNet-50) |
| 3D | SKU-110K | AP | 45.5 | RetinaNet |
| 3D | SKU-110K | AP75 | 0.389 | RetinaNet |
| 3D | Trillion Pairs Dataset | Accuracy | 37.14 | F-Softmax |
| 3D | Trillion Pairs Dataset | Accuracy | 39.8 | F-Softmax |
| 3D Face Modelling | Trillion Pairs Dataset | Accuracy | 37.14 | F-Softmax |
| 3D Face Modelling | Trillion Pairs Dataset | Accuracy | 39.8 | F-Softmax |
| Few-Shot Image Classification | COCO-MLT | Average mAP | 49.46 | Focal Loss (ResNet-50) |
| Few-Shot Image Classification | VOC-MLT | Average mAP | 73.88 | Focal Loss (ResNet-50) |
| Few-Shot Image Classification | EGTEA | Average Precision | 59.09 | Focal loss (3D-ResNeXt101) |
| Few-Shot Image Classification | EGTEA | Average Recall | 59.17 | Focal loss (3D-ResNeXt101) |
| 3D Face Reconstruction | Trillion Pairs Dataset | Accuracy | 37.14 | F-Softmax |
| 3D Face Reconstruction | Trillion Pairs Dataset | Accuracy | 39.8 | F-Softmax |
| Generalized Few-Shot Classification | COCO-MLT | Average mAP | 49.46 | Focal Loss (ResNet-50) |
| Generalized Few-Shot Classification | VOC-MLT | Average mAP | 73.88 | Focal Loss (ResNet-50) |
| Generalized Few-Shot Classification | EGTEA | Average Precision | 59.09 | Focal loss (3D-ResNeXt101) |
| Generalized Few-Shot Classification | EGTEA | Average Recall | 59.17 | Focal loss (3D-ResNeXt101) |
| Long-tail Learning | COCO-MLT | Average mAP | 49.46 | Focal Loss (ResNet-50) |
| Long-tail Learning | VOC-MLT | Average mAP | 73.88 | Focal Loss (ResNet-50) |
| Long-tail Learning | EGTEA | Average Precision | 59.09 | Focal loss (3D-ResNeXt101) |
| Long-tail Learning | EGTEA | Average Recall | 59.17 | Focal loss (3D-ResNeXt101) |
| Generalized Few-Shot Learning | COCO-MLT | Average mAP | 49.46 | Focal Loss (ResNet-50) |
| Generalized Few-Shot Learning | VOC-MLT | Average mAP | 73.88 | Focal Loss (ResNet-50) |
| Generalized Few-Shot Learning | EGTEA | Average Precision | 59.09 | Focal loss (3D-ResNeXt101) |
| Generalized Few-Shot Learning | EGTEA | Average Recall | 59.17 | Focal loss (3D-ResNeXt101) |
| 2D Classification | COCO test-dev | AP50 | 61.1 | RetinaNet (ResNeXt-101-FPN) |
| 2D Classification | COCO test-dev | AP75 | 44.1 | RetinaNet (ResNeXt-101-FPN) |
| 2D Classification | COCO test-dev | APL | 51.2 | RetinaNet (ResNeXt-101-FPN) |
| 2D Classification | COCO test-dev | APM | 44.2 | RetinaNet (ResNeXt-101-FPN) |
| 2D Classification | COCO test-dev | APS | 24.1 | RetinaNet (ResNeXt-101-FPN) |
| 2D Classification | COCO test-dev | box mAP | 40.8 | RetinaNet (ResNeXt-101-FPN) |
| 2D Classification | COCO test-dev | AP50 | 59.1 | RetinaNet (ResNet-101-FPN) |
| 2D Classification | COCO test-dev | AP75 | 42.3 | RetinaNet (ResNet-101-FPN) |
| 2D Classification | COCO test-dev | APL | 50.2 | RetinaNet (ResNet-101-FPN) |
| 2D Classification | COCO test-dev | APM | 42.7 | RetinaNet (ResNet-101-FPN) |
| 2D Classification | COCO test-dev | APS | 21.8 | RetinaNet (ResNet-101-FPN) |
| 2D Classification | COCO test-dev | box mAP | 39.1 | RetinaNet (ResNet-101-FPN) |
| 2D Classification | COCO-O | Average mAP | 16.6 | RetinaNet (ResNet-50) |
| 2D Classification | COCO-O | Effective Robustness | 0.18 | RetinaNet (ResNet-50) |
| 2D Classification | SKU-110K | AP | 45.5 | RetinaNet |
| 2D Classification | SKU-110K | AP75 | 0.389 | RetinaNet |
| Pedestrian Detection | TJU-Ped-traffic | ALL (miss rate) | 41.4 | RetinaNet |
| Pedestrian Detection | TJU-Ped-traffic | HO (miss rate) | 61.6 | RetinaNet |
| Pedestrian Detection | TJU-Ped-traffic | R (miss rate) | 23.89 | RetinaNet |
| Pedestrian Detection | TJU-Ped-traffic | R+HO (miss rate) | 28.45 | RetinaNet |
| Pedestrian Detection | TJU-Ped-traffic | RS (miss rate) | 37.92 | RetinaNet |
| Pedestrian Detection | TJU-Ped-campus | ALL (miss rate) | 44.34 | RetinaNet |
| Pedestrian Detection | TJU-Ped-campus | HO (miss rate) | 71.31 | RetinaNet |
| Pedestrian Detection | TJU-Ped-campus | R (miss rate) | 34.73 | RetinaNet |
| Pedestrian Detection | TJU-Ped-campus | R+HO (miss rate) | 42.26 | RetinaNet |
| Pedestrian Detection | TJU-Ped-campus | RS (miss rate) | 82.99 | RetinaNet |
| 2D Object Detection | SARDet-100K | box mAP | 47.4 | RetinaNet |
| 2D Object Detection | COCO test-dev | AP50 | 61.1 | RetinaNet (ResNeXt-101-FPN) |
| 2D Object Detection | COCO test-dev | AP75 | 44.1 | RetinaNet (ResNeXt-101-FPN) |
| 2D Object Detection | COCO test-dev | APL | 51.2 | RetinaNet (ResNeXt-101-FPN) |
| 2D Object Detection | COCO test-dev | APM | 44.2 | RetinaNet (ResNeXt-101-FPN) |
| 2D Object Detection | COCO test-dev | APS | 24.1 | RetinaNet (ResNeXt-101-FPN) |
| 2D Object Detection | COCO test-dev | box mAP | 40.8 | RetinaNet (ResNeXt-101-FPN) |
| 2D Object Detection | COCO test-dev | AP50 | 59.1 | RetinaNet (ResNet-101-FPN) |
| 2D Object Detection | COCO test-dev | AP75 | 42.3 | RetinaNet (ResNet-101-FPN) |
| 2D Object Detection | COCO test-dev | APL | 50.2 | RetinaNet (ResNet-101-FPN) |
| 2D Object Detection | COCO test-dev | APM | 42.7 | RetinaNet (ResNet-101-FPN) |
| 2D Object Detection | COCO test-dev | APS | 21.8 | RetinaNet (ResNet-101-FPN) |
| 2D Object Detection | COCO test-dev | box mAP | 39.1 | RetinaNet (ResNet-101-FPN) |
| 2D Object Detection | COCO-O | Average mAP | 16.6 | RetinaNet (ResNet-50) |
| 2D Object Detection | COCO-O | Effective Robustness | 0.18 | RetinaNet (ResNet-50) |
| 2D Object Detection | SKU-110K | AP | 45.5 | RetinaNet |
| 2D Object Detection | SKU-110K | AP75 | 0.389 | RetinaNet |
| 16k | COCO test-dev | AP50 | 61.1 | RetinaNet (ResNeXt-101-FPN) |
| 16k | COCO test-dev | AP75 | 44.1 | RetinaNet (ResNeXt-101-FPN) |
| 16k | COCO test-dev | APL | 51.2 | RetinaNet (ResNeXt-101-FPN) |
| 16k | COCO test-dev | APM | 44.2 | RetinaNet (ResNeXt-101-FPN) |
| 16k | COCO test-dev | APS | 24.1 | RetinaNet (ResNeXt-101-FPN) |
| 16k | COCO test-dev | box mAP | 40.8 | RetinaNet (ResNeXt-101-FPN) |
| 16k | COCO test-dev | AP50 | 59.1 | RetinaNet (ResNet-101-FPN) |
| 16k | COCO test-dev | AP75 | 42.3 | RetinaNet (ResNet-101-FPN) |
| 16k | COCO test-dev | APL | 50.2 | RetinaNet (ResNet-101-FPN) |
| 16k | COCO test-dev | APM | 42.7 | RetinaNet (ResNet-101-FPN) |
| 16k | COCO test-dev | APS | 21.8 | RetinaNet (ResNet-101-FPN) |
| 16k | COCO test-dev | box mAP | 39.1 | RetinaNet (ResNet-101-FPN) |
| 16k | COCO-O | Average mAP | 16.6 | RetinaNet (ResNet-50) |
| 16k | COCO-O | Effective Robustness | 0.18 | RetinaNet (ResNet-50) |
| 16k | SKU-110K | AP | 45.5 | RetinaNet |
| 16k | SKU-110K | AP75 | 0.389 | RetinaNet |

Related Papers

Visual-Language Model Knowledge Distillation Method for Image Quality Assessment (2025-07-21)
Uncertainty-Aware Cross-Modal Knowledge Distillation with Prototype Learning for Multimodal Brain-Computer Interfaces (2025-07-17)
A Real-Time System for Egocentric Hand-Object Interaction Detection in Industrial Domains (2025-07-17)
RS-TinyNet: Stage-wise Feature Fusion Network for Detecting Tiny Objects in Remote Sensing Images (2025-07-17)
Decoupled PROB: Decoupled Query Initialization Tasks and Objectness-Class Learning for Open World Object Detection (2025-07-17)
Dual LiDAR-Based Traffic Movement Count Estimation at a Signalized Intersection: Deployment, Data Collection, and Preliminary Analysis (2025-07-17)
DVFL-Net: A Lightweight Distilled Video Focal Modulation Network for Spatio-Temporal Action Recognition (2025-07-16)
Vision-based Perception for Autonomous Vehicles in Obstacle Avoidance Scenarios (2025-07-16)