Joseph Redmon, Ali Farhadi
We present some updates to YOLO! We made a bunch of little design changes to make it better. We also trained this new network that's pretty swell. It's a little bigger than last time but more accurate. It's still fast though, don't worry. At 320x320 YOLOv3 runs in 22 ms at 28.2 mAP, as accurate as SSD but three times faster. When we look at the old .5 IOU mAP detection metric YOLOv3 is quite good. It achieves 57.9 mAP@50 in 51 ms on a Titan X, compared to 57.5 mAP@50 in 198 ms by RetinaNet, similar performance but 3.8x faster. As always, all the code is online at https://pjreddie.com/yolo/
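
The ".5 IOU" metric and the speed comparison above can be made concrete with a small sketch. This is an illustration, not the code from the YOLO repository: the function names `iou` and `is_match` and the `(x1, y1, x2, y2)` corner box format are assumptions made for this example. Under mAP@50, a predicted box counts as a correct detection when its intersection over union with a ground-truth box is at least 0.5.

```python
def iou(box_a, box_b):
    """Intersection over union of two axis-aligned boxes.

    Assumed (hypothetical) box format: (x1, y1, x2, y2) with x1 < x2 and y1 < y2.
    """
    # Corners of the overlapping rectangle, if there is one.
    ix1 = max(box_a[0], box_b[0])
    iy1 = max(box_a[1], box_b[1])
    ix2 = min(box_a[2], box_b[2])
    iy2 = min(box_a[3], box_b[3])

    # Clamp at zero so disjoint boxes give an empty intersection.
    inter = max(0.0, ix2 - ix1) * max(0.0, iy2 - iy1)

    area_a = (box_a[2] - box_a[0]) * (box_a[3] - box_a[1])
    area_b = (box_b[2] - box_b[0]) * (box_b[3] - box_b[1])
    union = area_a + area_b - inter

    return inter / union if union > 0 else 0.0


def is_match(pred_box, gt_box, threshold=0.5):
    """True if the prediction overlaps the ground truth enough to count at mAP@50."""
    return iou(pred_box, gt_box) >= threshold


# Speed ratio from the timings quoted in the abstract (198 ms vs 51 ms).
speedup = 198 / 51

if __name__ == "__main__":
    print(is_match((0, 0, 10, 10), (2, 0, 12, 10)))
    print(round(speedup, 2))
```

The ratio 198 / 51 comes to roughly 3.88, which the abstract reports as 3.8x. A stricter threshold than 0.5 demands tighter box alignment, so the same detector scores lower on it.
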
| Task | Dataset | Metric | Value | Model |
|---|---|---|---|---|
| Autonomous Vehicles | DVTOD | mAP | 82.7 | YOLOv3 (Thermal) |
| Autonomous Vehicles | DVTOD | mAP | 34.5 | YOLOv3 (Visible) |
| Object Detection | COCO-O | Average mAP | 14.8 | YOLOv3 (DarkNet-53) |
| Object Detection | COCO-O | Effective Robustness | -0.37 | YOLOv3 (DarkNet-53) |
| Object Detection | COCO (Common Objects in Context) | box AP | 33 | YOLOv3-L |
| Object Detection | Cityscapes | mPC [AP] | 16.9 | Photometric distortion |
| Pedestrian Detection | DVTOD | mAP | 82.7 | YOLOv3 (Thermal) |
| Pedestrian Detection | DVTOD | mAP | 34.5 | YOLOv3 (Visible) |