Karen Simonyan, Andrew Zisserman
In this work we investigate the effect of the convolutional network depth on its accuracy in the large-scale image recognition setting. Our main contribution is a thorough evaluation of networks of increasing depth using an architecture with very small (3×3) convolution filters, which shows that a significant improvement on the prior-art configurations can be achieved by pushing the depth to 16-19 weight layers. These findings were the basis of our ImageNet Challenge 2014 submission, where our team secured the first and the second places in the localisation and classification tracks respectively. We also show that our representations generalise well to other datasets, where they achieve state-of-the-art results. We have made our two best-performing ConvNet models publicly available to facilitate further research on the use of deep visual representations in computer vision.
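To make the appeal of very small filters concrete, the following is a minimal sketch of the arithmetic behind stacking 3×3 convolutions. It assumes stride-1 convolutions with the same number of input and output channels and ignores biases; the helper names and the channel width of 256 are illustrative choices, not taken from the text above.

```python
def stacked_receptive_field(num_layers: int, kernel: int = 3) -> int:
    """Receptive field of `num_layers` stacked stride-1 convolutions."""
    # Each extra stride-1 layer widens the field by (kernel - 1) pixels.
    return 1 + num_layers * (kernel - 1)


def conv_weights(kernel: int, channels: int) -> int:
    """Weights (biases ignored) in one kernel x kernel convolution
    with `channels` input and `channels` output channels."""
    return kernel * kernel * channels * channels


channels = 256  # illustrative width, an assumption for this sketch

# Three stacked 3x3 layers see the same 7x7 window as one 7x7 layer.
field = stacked_receptive_field(3)  # 7

stack_weights = 3 * conv_weights(3, channels)   # 27 * channels^2
single_weights = conv_weights(7, channels)      # 49 * channels^2

print(f"receptive field of three 3x3 layers: {field}x{field}")
print(f"weights, three 3x3 layers: {stack_weights}")
print(f"weights, one 7x7 layer:    {single_weights}")
```

Under these assumptions the stack covers the same window with 27/49 of the weights and two additional non-linearities between layers, which is one way to read the case for trading filter size against depth.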
| Task | Dataset | Metric | Value | Model |
|---|---|---|---|---|
| Depth Estimation | SiW-Enroll5 | AUC | 97.8 | VGG16 |
| Depth Estimation | CelebA-Spoof-Enroll5 | AUC | 98 | VGG16 |
| Facial Recognition and Modelling | SiW-Enroll5 | AUC | 97.8 | VGG16 |
| Facial Recognition and Modelling | CelebA-Spoof-Enroll5 | AUC | 98 | VGG16 |
| Image-to-Image Translation | GTAV-to-Cityscapes Labels | mIoU | 41.3 | VGG16 60.3 |
| Domain Adaptation | VizWiz-Classification | Accuracy - All Images | 36.7 | VGG-16 BN |
| Domain Adaptation | VizWiz-Classification | Accuracy - Clean Images | 41.1 | VGG-16 BN |
| Domain Adaptation | VizWiz-Classification | Accuracy - Corrupted Images | 31.1 | VGG-16 BN |
| Domain Adaptation | VizWiz-Classification | Accuracy - All Images | 36.2 | VGG-19 BN |
| Domain Adaptation | VizWiz-Classification | Accuracy - Clean Images | 40.8 | VGG-19 BN |
| Domain Adaptation | VizWiz-Classification | Accuracy - Corrupted Images | 29.4 | VGG-19 BN |
| Domain Adaptation | VizWiz-Classification | Accuracy - All Images | 34.7 | VGG-19 |
| Domain Adaptation | VizWiz-Classification | Accuracy - Clean Images | 39.3 | VGG-19 |
| Domain Adaptation | VizWiz-Classification | Accuracy - Corrupted Images | 29 | VGG-19 |
| Domain Adaptation | VizWiz-Classification | Accuracy - All Images | 34.7 | VGG-16 |
| Domain Adaptation | VizWiz-Classification | Accuracy - Clean Images | 39.5 | VGG-16 |
| Domain Adaptation | VizWiz-Classification | Accuracy - Corrupted Images | 28.5 | VGG-16 |
| Domain Adaptation | VizWiz-Classification | Accuracy - All Images | 33.7 | VGG-13 BN |
| Domain Adaptation | VizWiz-Classification | Accuracy - Clean Images | 38.4 | VGG-13 BN |
| Domain Adaptation | VizWiz-Classification | Accuracy - Corrupted Images | 28.3 | VGG-13 BN |
| Domain Adaptation | VizWiz-Classification | Accuracy - All Images | 32.9 | VGG-11 BN |
| Domain Adaptation | VizWiz-Classification | Accuracy - Clean Images | 37.1 | VGG-11 BN |
| Domain Adaptation | VizWiz-Classification | Accuracy - Corrupted Images | 25.8 | VGG-11 BN |
| Domain Adaptation | VizWiz-Classification | Accuracy - All Images | 32.4 | VGG-13 |
| Domain Adaptation | VizWiz-Classification | Accuracy - Clean Images | 36.5 | VGG-13 |
| Domain Adaptation | VizWiz-Classification | Accuracy - Corrupted Images | 26.4 | VGG-13 |
| Domain Adaptation | VizWiz-Classification | Accuracy - All Images | 31.5 | VGG-11 |
| Domain Adaptation | VizWiz-Classification | Accuracy - Clean Images | 36.1 | VGG-11 |
| Domain Adaptation | VizWiz-Classification | Accuracy - Corrupted Images | 25.2 | VGG-11 |
| Image Generation | GTAV-to-Cityscapes Labels | mIoU | 41.3 | VGG16 60.3 |
| Visual Odometry | SiW-Enroll5 | AUC | 97.8 | VGG16 |
| Visual Odometry | CelebA-Spoof-Enroll5 | AUC | 98 | VGG16 |
| Face Reconstruction | SiW-Enroll5 | AUC | 97.8 | VGG16 |
| Face Reconstruction | CelebA-Spoof-Enroll5 | AUC | 98 | VGG16 |
| 3D | SiW-Enroll5 | AUC | 97.8 | VGG16 |
| 3D | CelebA-Spoof-Enroll5 | AUC | 98 | VGG16 |
| 3D Face Modelling | SiW-Enroll5 | AUC | 97.8 | VGG16 |
| 3D Face Modelling | CelebA-Spoof-Enroll5 | AUC | 98 | VGG16 |
| 3D Face Reconstruction | SiW-Enroll5 | AUC | 97.8 | VGG16 |
| 3D Face Reconstruction | CelebA-Spoof-Enroll5 | AUC | 98 | VGG16 |
| Depth And Camera Motion | SiW-Enroll5 | AUC | 97.8 | VGG16 |
| Depth And Camera Motion | CelebA-Spoof-Enroll5 | AUC | 98 | VGG16 |
| Classification | XImageNet-12 | Robustness Score | 0.8845 | VGG-16 |
| Domain Generalization | VizWiz-Classification | Accuracy - All Images | 36.7 | VGG-16 BN |
| Domain Generalization | VizWiz-Classification | Accuracy - Clean Images | 41.1 | VGG-16 BN |
| Domain Generalization | VizWiz-Classification | Accuracy - Corrupted Images | 31.1 | VGG-16 BN |
| Domain Generalization | VizWiz-Classification | Accuracy - All Images | 36.2 | VGG-19 BN |
| Domain Generalization | VizWiz-Classification | Accuracy - Clean Images | 40.8 | VGG-19 BN |
| Domain Generalization | VizWiz-Classification | Accuracy - Corrupted Images | 29.4 | VGG-19 BN |
| Domain Generalization | VizWiz-Classification | Accuracy - All Images | 34.7 | VGG-19 |
| Domain Generalization | VizWiz-Classification | Accuracy - Clean Images | 39.3 | VGG-19 |
| Domain Generalization | VizWiz-Classification | Accuracy - Corrupted Images | 29 | VGG-19 |
| Domain Generalization | VizWiz-Classification | Accuracy - All Images | 34.7 | VGG-16 |
| Domain Generalization | VizWiz-Classification | Accuracy - Clean Images | 39.5 | VGG-16 |
| Domain Generalization | VizWiz-Classification | Accuracy - Corrupted Images | 28.5 | VGG-16 |
| Domain Generalization | VizWiz-Classification | Accuracy - All Images | 33.7 | VGG-13 BN |
| Domain Generalization | VizWiz-Classification | Accuracy - Clean Images | 38.4 | VGG-13 BN |
| Domain Generalization | VizWiz-Classification | Accuracy - Corrupted Images | 28.3 | VGG-13 BN |
| Domain Generalization | VizWiz-Classification | Accuracy - All Images | 32.9 | VGG-11 BN |
| Domain Generalization | VizWiz-Classification | Accuracy - Clean Images | 37.1 | VGG-11 BN |
| Domain Generalization | VizWiz-Classification | Accuracy - Corrupted Images | 25.8 | VGG-11 BN |
| Domain Generalization | VizWiz-Classification | Accuracy - All Images | 32.4 | VGG-13 |
| Domain Generalization | VizWiz-Classification | Accuracy - Clean Images | 36.5 | VGG-13 |
| Domain Generalization | VizWiz-Classification | Accuracy - Corrupted Images | 26.4 | VGG-13 |
| Domain Generalization | VizWiz-Classification | Accuracy - All Images | 31.5 | VGG-11 |
| Domain Generalization | VizWiz-Classification | Accuracy - Clean Images | 36.1 | VGG-11 |
| Domain Generalization | VizWiz-Classification | Accuracy - Corrupted Images | 25.2 | VGG-11 |
| 1 Image, 2*2 Stitching | GTAV-to-Cityscapes Labels | mIoU | 41.3 | VGG16 60.3 |