Training Region-based Object Detectors with Online Hard Example Mining

Abhinav Shrivastava, Abhinav Gupta, Ross Girshick

2016-04-12 | CVPR 2016 | Object Detection

Abstract

The field of object detection has made significant advances riding on the wave of region-based ConvNets, but their training procedure still includes many heuristics and hyperparameters that are costly to tune. We present a simple yet surprisingly effective online hard example mining (OHEM) algorithm for training region-based ConvNet detectors. Our motivation is the same as it has always been -- detection datasets contain an overwhelming number of easy examples and a small number of hard examples. Automatic selection of these hard examples can make training more effective and efficient. OHEM is a simple and intuitive algorithm that eliminates several heuristics and hyperparameters in common use. But more importantly, it yields consistent and significant boosts in detection performance on benchmarks like PASCAL VOC 2007 and 2012. Its effectiveness increases as datasets become larger and more difficult, as demonstrated by the results on the MS COCO dataset. Moreover, combined with complementary advances in the field, OHEM leads to state-of-the-art results of 78.9% and 76.3% mAP on PASCAL VOC 2007 and 2012 respectively.
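
The core idea in the abstract, automatically selecting the hard examples, can be illustrated with a minimal sketch. This is not the authors' implementation: it assumes a per-example loss is already available for every candidate region, and it simply keeps the highest-loss candidates for the update. The function names `select_hard_examples` and `ohem_loss`, the `batch_size` parameter, and the toy loss values are hypothetical and chosen only for illustration.

```python
import numpy as np


def select_hard_examples(losses, batch_size):
    """Return indices of the `batch_size` highest-loss examples, hardest first."""
    losses = np.asarray(losses, dtype=float)
    k = min(batch_size, losses.size)
    # argsort is ascending, so sort the negated losses to get descending order;
    # a stable sort breaks ties by original index.
    order = np.argsort(-losses, kind="stable")
    return order[:k]


def ohem_loss(losses, batch_size):
    """Average the loss over the selected hard examples only (illustrative)."""
    losses = np.asarray(losses, dtype=float)
    hard = select_hard_examples(losses, batch_size)
    return losses[hard].mean(), hard


# Toy example: six candidate regions, most of them easy (low loss).
roi_losses = [0.02, 1.30, 0.05, 0.90, 0.01, 0.40]
loss, hard_idx = ohem_loss(roi_losses, batch_size=2)
print(hard_idx, loss)  # the two selected regions are indices 1 and 3
```

In a real detector the losses would come from a forward pass over the region proposals of the current images, and only the selected regions would contribute gradients. The sketch leaves those parts out.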

Results

Task | Dataset | Metric | Value | Model
Facial Recognition and Modelling | Trillion Pairs Dataset | Accuracy | 34.46 | HM-Softmax
Facial Recognition and Modelling | Trillion Pairs Dataset | Accuracy | 36.75 | HM-Softmax
Face Verification | Trillion Pairs Dataset | Accuracy | 34.46 | HM-Softmax
Face Reconstruction | Trillion Pairs Dataset | Accuracy | 34.46 | HM-Softmax
Face Reconstruction | Trillion Pairs Dataset | Accuracy | 36.75 | HM-Softmax
3D | Trillion Pairs Dataset | Accuracy | 34.46 | HM-Softmax
3D | Trillion Pairs Dataset | Accuracy | 36.75 | HM-Softmax
3D Face Modelling | Trillion Pairs Dataset | Accuracy | 34.46 | HM-Softmax
3D Face Modelling | Trillion Pairs Dataset | Accuracy | 36.75 | HM-Softmax
3D Face Reconstruction | Trillion Pairs Dataset | Accuracy | 34.46 | HM-Softmax
3D Face Reconstruction | Trillion Pairs Dataset | Accuracy | 36.75 | HM-Softmax

Related Papers

A Real-Time System for Egocentric Hand-Object Interaction Detection in Industrial Domains (2025-07-17)
RS-TinyNet: Stage-wise Feature Fusion Network for Detecting Tiny Objects in Remote Sensing Images (2025-07-17)
Decoupled PROB: Decoupled Query Initialization Tasks and Objectness-Class Learning for Open World Object Detection (2025-07-17)
Dual LiDAR-Based Traffic Movement Count Estimation at a Signalized Intersection: Deployment, Data Collection, and Preliminary Analysis (2025-07-17)
Vision-based Perception for Autonomous Vehicles in Obstacle Avoidance Scenarios (2025-07-16)
Tomato Multi-Angle Multi-Pose Dataset for Fine-Grained Phenotyping (2025-07-15)
ECORE: Energy-Conscious Optimized Routing for Deep Learning Models at the Edge (2025-07-08)
Beyond One Shot, Beyond One Perspective: Cross-View and Long-Horizon Distillation for Better LiDAR Representations (2025-07-07)