Claudio Michaelis, Benjamin Mitzkus, Robert Geirhos, Evgenia Rusak, Oliver Bringmann, Alexander S. Ecker, Matthias Bethge, Wieland Brendel
The ability to detect objects regardless of image distortions or weather conditions is crucial for real-world applications of deep learning such as autonomous driving. Here we provide an easy-to-use benchmark to assess how object detection models perform when image quality degrades. The three resulting benchmark datasets, termed Pascal-C, Coco-C and Cityscapes-C, contain a large variety of image corruptions. We show that a range of standard object detection models suffer a severe performance loss on corrupted images (down to 30-60% of the original performance). However, a simple data augmentation trick, stylizing the training images, leads to a substantial increase in robustness across corruption types, severities and datasets. We envision our comprehensive benchmark being used to track future progress towards building robust object detection models. Benchmark, code and data are publicly available.
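The results table reports two summary metrics: mean performance under corruption (mPC) and relative performance under corruption (rPC). As a minimal sketch, assuming mPC is the score averaged first over severity levels and then over corruption types, and rPC is mPC expressed as a percentage of the clean-data score, the computation could look like the following. The function names, the corruption names, and all numbers in the example are hypothetical and are not taken from the benchmark code or results.

```python
from statistics import mean


def mean_performance_under_corruption(scores):
    """Average a detection metric (e.g. AP) over corruptions and severities.

    `scores` maps a corruption name to a list of per-severity scores.
    Each corruption is averaged over its severities first, so every
    corruption type contributes equally to the final number.
    """
    return mean(mean(per_severity) for per_severity in scores.values())


def relative_performance_under_corruption(mpc, clean_score):
    """Express mPC as a percentage of the score on clean images."""
    return 100.0 * mpc / clean_score


# Hypothetical per-severity AP values for two corruption types.
example_scores = {
    "gaussian_noise": [40.0, 30.0, 20.0, 10.0, 5.0],
    "fog": [50.0, 45.0, 40.0, 35.0, 30.0],
}
example_clean_ap = 61.0  # hypothetical clean-data AP

mpc = mean_performance_under_corruption(example_scores)
rpc = relative_performance_under_corruption(mpc, example_clean_ap)
print(f"mPC = {mpc:.1f}, rPC = {rpc:.1f}%")  # mPC = 30.5, rPC = 50.0%
```

Reporting both numbers is useful because mPC rewards models that are simply stronger overall, while rPC isolates how much of the clean performance survives corruption.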
| Task | Dataset | Metric | Value | Model |
|---|---|---|---|---|
| Object Detection | PASCAL VOC 2007 | mPC [AP50] | 56.2 | Faster R-CNN with Stylized Training Data |
| Object Detection | PASCAL VOC 2007 | rPC [%] | 69.9 | Faster R-CNN with Stylized Training Data |
| Object Detection | PASCAL VOC 2007 | mPC [AP50] | 48.6 | Faster R-CNN |
| Object Detection | PASCAL VOC 2007 | rPC [%] | 60.4 | Faster R-CNN |
| Object Detection | Cityscapes | mPC [AP] | 17.2 | Stylized Training Data |
| Object Detection | Cityscapes test | mPC [AP] | 17.2 | Faster R-CNN with Stylized Training Data |
| Object Detection | Cityscapes test | rPC [%] | 47.4 | Faster R-CNN with Stylized Training Data |
| Object Detection | Cityscapes test | mPC [AP] | 12.2 | Faster R-CNN |
| Object Detection | Cityscapes test | rPC [%] | 33.4 | Faster R-CNN |
| Object Detection | COCO (Common Objects in Context) | mPC [AP] | 20.4 | Faster R-CNN with Stylized Training Data |
| Object Detection | COCO (Common Objects in Context) | rPC [%] | 58.9 | Faster R-CNN with Stylized Training Data |
| Object Detection | COCO (Common Objects in Context) | mPC [AP] | 18.2 | Faster R-CNN |
| Object Detection | COCO (Common Objects in Context) | rPC [%] | 50.2 | Faster R-CNN |