Data sourced from the PWC Archive (CC-BY-SA 4.0). Built by the community, for the community.

SNIPER: Efficient Multi-Scale Training

Bharat Singh, Mahyar Najibi, Larry S. Davis

2018-05-23 | NeurIPS 2018 12 | Region Proposal | object-detection | Object Detection

Abstract

We present SNIPER, an algorithm for performing efficient multi-scale training in instance level visual recognition tasks. Instead of processing every pixel in an image pyramid, SNIPER processes context regions around ground-truth instances (referred to as chips) at the appropriate scale. For background sampling, these context-regions are generated using proposals extracted from a region proposal network trained with a short learning schedule. Hence, the number of chips generated per image during training adaptively changes based on the scene complexity. SNIPER only processes 30% more pixels compared to the commonly used single scale training at 800x1333 pixels on the COCO dataset. But, it also observes samples from extreme resolutions of the image pyramid, like 1400x2000 pixels. As SNIPER operates on resampled low resolution chips (512x512 pixels), it can have a batch size as large as 20 on a single GPU even with a ResNet-101 backbone. Therefore it can benefit from batch-normalization during training without the need for synchronizing batch-normalization statistics across GPUs. SNIPER brings training of instance level recognition tasks like object detection closer to the protocol for image classification and suggests that the commonly accepted guideline that it is important to train on high resolution images for instance level visual recognition tasks might not be correct. Our implementation based on Faster-RCNN with a ResNet-101 backbone obtains an mAP of 47.6% on the COCO dataset for bounding box detection and can process 5 images per second during inference with a single GPU. Code is available at https://github.com/MahyarNajibi/SNIPER/.
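
The pixel figures quoted in the abstract can be related with a little arithmetic. The sketch below is only illustrative: it assumes the "30% more pixels" figure is a per-image average and that every chip is a full 512x512 crop, neither of which the abstract states explicitly. The variable names are made up for this example.

```python
# Pixel-budget arithmetic implied by the abstract (illustrative only).
SINGLE_SCALE = (800, 1333)   # baseline single-scale training resolution
CHIP = (512, 512)            # chip resolution SNIPER operates on
EXTRA = 0.30                 # "30% more pixels" than the baseline
HIGH_RES = (1400, 2000)      # an extreme pyramid resolution named in the abstract


def pixels(shape):
    """Number of pixels in a (height, width) shape."""
    h, w = shape
    return h * w


baseline_px = pixels(SINGLE_SCALE)               # 1,066,400 pixels
sniper_px = baseline_px * (1 + EXTRA)            # about 1.39M pixels
# ASSUMPTION: the whole budget is spent on full-size chips.
chips_per_image = sniper_px / pixels(CHIP)       # roughly 5.3 chips
# Cost of processing the full high-resolution image instead.
high_res_ratio = pixels(HIGH_RES) / baseline_px  # roughly 2.6x the baseline

print(f"baseline pixels:      {baseline_px:,}")
print(f"SNIPER pixel budget:  {sniper_px:,.0f}")
print(f"approx. chips/image:  {chips_per_image:.2f}")
print(f"1400x2000 vs. 800x1333: {high_res_ratio:.2f}x")
```

Under these assumptions, the stated budget corresponds to roughly five 512x512 chips per image, whereas processing a full 1400x2000 image alone would cost about 2.6 times the single-scale baseline.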

Results

Task              Dataset        Metric   Value  Model
Object Detection  COCO test-dev  AP50     67     SNIPER (ResNet-101)
Object Detection  COCO test-dev  AP75     51.6   SNIPER (ResNet-101)
Object Detection  COCO test-dev  APL      58.1   SNIPER (ResNet-101)
Object Detection  COCO test-dev  APM      48.9   SNIPER (ResNet-101)
Object Detection  COCO test-dev  APS      29.6   SNIPER (ResNet-101)
Object Detection  COCO test-dev  box mAP  46.1   SNIPER (ResNet-101)
Object Detection  COCO test-dev  AP50     65     SNIPER (ResNet-50)
Object Detection  COCO test-dev  AP75     48.6   SNIPER (ResNet-50)
Object Detection  COCO test-dev  APL      56     SNIPER (ResNet-50)
Object Detection  COCO test-dev  APM      46.3   SNIPER (ResNet-50)
Object Detection  COCO test-dev  APS      26.1   SNIPER (ResNet-50)
Object Detection  COCO test-dev  box mAP  43.5   SNIPER (ResNet-50)

Related Papers

A Real-Time System for Egocentric Hand-Object Interaction Detection in Industrial Domains (2025-07-17)
RS-TinyNet: Stage-wise Feature Fusion Network for Detecting Tiny Objects in Remote Sensing Images (2025-07-17)
Decoupled PROB: Decoupled Query Initialization Tasks and Objectness-Class Learning for Open World Object Detection (2025-07-17)
Dual LiDAR-Based Traffic Movement Count Estimation at a Signalized Intersection: Deployment, Data Collection, and Preliminary Analysis (2025-07-17)
Vision-based Perception for Autonomous Vehicles in Obstacle Avoidance Scenarios (2025-07-16)
Tomato Multi-Angle Multi-Pose Dataset for Fine-Grained Phenotyping (2025-07-15)
ECORE: Energy-Conscious Optimized Routing for Deep Learning Models at the Edge (2025-07-08)
Beyond One Shot, Beyond One Perspective: Cross-View and Long-Horizon Distillation for Better LiDAR Representations (2025-07-07)