SSD: Single Shot MultiBox Detector

Wei Liu, Dragomir Anguelov, Dumitru Erhan, Christian Szegedy, Scott Reed, Cheng-Yang Fu, Alexander C. Berg

2015-12-08

Visual Object Tracking, Surgical tool detection, Node Property Prediction, Object Detection, LIDAR Semantic Segmentation, Low-Light Image Enhancement

Abstract

We present a method for detecting objects in images using a single deep neural network. Our approach, named SSD, discretizes the output space of bounding boxes into a set of default boxes over different aspect ratios and scales per feature map location. At prediction time, the network generates scores for the presence of each object category in each default box and produces adjustments to the box to better match the object shape. Additionally, the network combines predictions from multiple feature maps with different resolutions to naturally handle objects of various sizes. Our SSD model is simple relative to methods that require object proposals because it completely eliminates proposal generation and the subsequent pixel or feature resampling stages, and encapsulates all computation in a single network. This makes SSD easy to train and straightforward to integrate into systems that require a detection component. Experimental results on the PASCAL VOC, MS COCO, and ILSVRC datasets confirm that SSD has comparable accuracy to methods that utilize an additional object proposal step and is much faster, while providing a unified framework for both training and inference. Compared to other single-stage methods, SSD has much better accuracy, even with a smaller input image size. For $300\times 300$ input, SSD achieves 72.1% mAP on VOC2007 test at 58 FPS on an Nvidia Titan X, and for $500\times 500$ input, SSD achieves 75.1% mAP, outperforming a comparable state-of-the-art Faster R-CNN model. Code is available at https://github.com/weiliu89/caffe/tree/ssd.
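The default-box idea described in the abstract can be illustrated with a minimal sketch. This is not the authors' implementation: the grid sizes, scales, and aspect ratios below are illustrative values, the function names `default_boxes` and `decode` are hypothetical, and the offset parameterization (center shifts scaled by box size, log-space width and height) is a common convention assumed here rather than something the abstract specifies.

```python
import math
from itertools import product


def default_boxes(fmap_size, scale, aspect_ratios):
    """Return (cx, cy, w, h) default boxes, normalized to [0, 1],
    for one square feature map of fmap_size x fmap_size cells."""
    boxes = []
    for row, col in product(range(fmap_size), repeat=2):
        # Center each box on its feature map cell.
        cx = (col + 0.5) / fmap_size
        cy = (row + 0.5) / fmap_size
        for ar in aspect_ratios:
            # Same area for every ratio; ar = width / height.
            w = scale * math.sqrt(ar)
            h = scale / math.sqrt(ar)
            boxes.append((cx, cy, w, h))
    return boxes


def decode(default_box, offsets):
    """Apply predicted adjustments (dx, dy, dw, dh) to one default box.
    Assumed convention: center shifts are relative to box size,
    width and height are adjusted in log space."""
    cx, cy, w, h = default_box
    dx, dy, dw, dh = offsets
    return (cx + dx * w, cy + dy * h, w * math.exp(dw), h * math.exp(dh))


# Hypothetical multi-scale setup: coarser feature maps get larger boxes.
feature_maps = [(8, 0.2), (4, 0.5), (2, 0.8)]  # (grid size, box scale)
ratios = [1.0, 2.0, 0.5]
all_boxes = [b for size, s in feature_maps
             for b in default_boxes(size, s, ratios)]
```

With these illustrative settings the three feature maps contribute 192, 48, and 12 boxes, 252 in total. In a detector of this kind, the network would output one set of class scores and one offset tuple per default box, and `decode` would turn each offset tuple into a final box.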

Results

Task	Dataset	Metric	Value	Model
Object Detection	COCO-O	Average mAP	13.6	SSD (VGG-16)
Object Detection	COCO-O	Effective Robustness	0.36	SSD (VGG-16)
Object Detection	PKU-DDD17-Car	mAP50	73.1	SSD
Object Detection	PASCAL VOC 2012	mAP	80	SSD512 (07+12+COCO)
