The Many Faces of Robustness: A Critical Analysis of Out-of-Distribution Generalization

Dan Hendrycks, Steven Basart, Norman Mu, Saurav Kadavath, Frank Wang, Evan Dorundo, Rahul Desai, Tyler Zhu, Samyak Parajuli, Mike Guo, Dawn Song, Jacob Steinhardt, Justin Gilmer

2020-06-29 | ICCV 2021 10 | Data Augmentation, Domain Generalization, Out-of-Distribution Generalization

Abstract

We introduce four new real-world distribution shift datasets consisting of changes in image style, image blurriness, geographic location, camera operation, and more. With our new datasets, we take stock of previously proposed methods for improving out-of-distribution robustness and put them to the test. We find that using larger models and artificial data augmentations can improve robustness on real-world distribution shifts, contrary to claims in prior work. We find improvements in artificial robustness benchmarks can transfer to real-world distribution shifts, contrary to claims in prior work. Motivated by our observation that data augmentations can help with real-world distribution shifts, we also introduce a new data augmentation method which advances the state-of-the-art and outperforms models pretrained with 1000 times more labeled data. Overall we find that some methods consistently help with distribution shifts in texture and local image statistics, but these methods do not help with some other distribution shifts like geographic changes. Our results show that future research must study multiple distribution shifts simultaneously, as we demonstrate that no evaluated method consistently improves robustness.

Results

Task	Dataset	Metric	Value	Model
Domain Adaptation	ImageNet-R	Top-1 Error Rate	53.2	DeepAugment+AugMix (ResNet-50)
Domain Adaptation	ImageNet-R	Top-1 Error Rate	57.8	DeepAugment (ResNet-50)
Domain Adaptation	ImageNet-C	mean Corruption Error (mCE)	60.4	DeepAugment (ResNet-50)
Domain Adaptation	VizWiz-Classification	Accuracy - All Images	41.3	ResNet-50 (deepaugment)
Domain Adaptation	VizWiz-Classification	Accuracy - Clean Images	46	ResNet-50 (deepaugment)
Domain Adaptation	VizWiz-Classification	Accuracy - Corrupted Images	34.9	ResNet-50 (deepaugment)
Domain Adaptation	VizWiz-Classification	Accuracy - All Images	40.3	ResNet-50 (deepaugment+augmix)
Domain Adaptation	VizWiz-Classification	Accuracy - Clean Images	44.5	ResNet-50 (deepaugment+augmix)
Domain Adaptation	VizWiz-Classification	Accuracy - Corrupted Images	34.1	ResNet-50 (deepaugment+augmix)
Domain Generalization	ImageNet-R	Top-1 Error Rate	53.2	DeepAugment+AugMix (ResNet-50)
Domain Generalization	ImageNet-R	Top-1 Error Rate	57.8	DeepAugment (ResNet-50)
Domain Generalization	ImageNet-C	mean Corruption Error (mCE)	60.4	DeepAugment (ResNet-50)
Domain Generalization	VizWiz-Classification	Accuracy - All Images	41.3	ResNet-50 (deepaugment)
Domain Generalization	VizWiz-Classification	Accuracy - Clean Images	46	ResNet-50 (deepaugment)
Domain Generalization	VizWiz-Classification	Accuracy - Corrupted Images	34.9	ResNet-50 (deepaugment)
Domain Generalization	VizWiz-Classification	Accuracy - All Images	40.3	ResNet-50 (deepaugment+augmix)
Domain Generalization	VizWiz-Classification	Accuracy - Clean Images	44.5	ResNet-50 (deepaugment+augmix)
Domain Generalization	VizWiz-Classification	Accuracy - Corrupted Images	34.1	ResNet-50 (deepaugment+augmix)
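
The mCE rows above use the ImageNet-C metric. As a rough sketch of how that number is usually obtained (following the ImageNet-C benchmark definition, not code from this paper): for each corruption type, the model's top-1 errors are summed over severity levels and divided by the same sum for a baseline model (AlexNet in the benchmark), and the resulting ratios are averaged over corruption types. The function names and the toy numbers below are hypothetical and only illustrate the arithmetic.

```python
from statistics import mean


def corruption_error(model_errors, baseline_errors):
    """Corruption Error (CE) for one corruption type.

    Both arguments are lists of top-1 error rates, one entry per
    severity level, for the evaluated model and the baseline model.
    """
    if len(model_errors) != len(baseline_errors):
        raise ValueError("model and baseline need the same severity levels")
    return sum(model_errors) / sum(baseline_errors)


def mean_corruption_error(model, baseline):
    """mCE in percent: the average CE over all corruption types.

    `model` and `baseline` map a corruption name to its per-severity
    top-1 error rates (hypothetical input layout for this sketch).
    """
    return 100.0 * mean(
        corruption_error(model[name], baseline[name]) for name in model
    )


# Toy numbers for illustration only, not results from the paper.
toy_model = {"noise": [0.2, 0.4], "blur": [0.3, 0.3]}
toy_baseline = {"noise": [0.4, 0.8], "blur": [0.6, 0.6]}
toy_mce = mean_corruption_error(toy_model, toy_baseline)
```

Under this definition, an mCE below 100 means the model makes fewer errors than the baseline on average across corruption types, so lower is better.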
