Domain Generalization on ImageNet-R

Metric: Top-1 Error Rate (lower is better)
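
As a rough illustration of the metric, the sketch below computes a top-1 error rate in percent from per-class scores and ground-truth labels. It is a minimal example and not the evaluation code behind any entry on this leaderboard; the function name `top1_error_rate` and the toy scores and labels are made up for illustration.

```python
from typing import Sequence


def top1_error_rate(scores: Sequence[Sequence[float]], labels: Sequence[int]) -> float:
    """Return the top-1 error rate in percent (hypothetical helper).

    scores[i][c] is the model's score for class c on example i;
    labels[i] is the index of the correct class for example i.
    """
    if len(scores) != len(labels):
        raise ValueError("scores and labels must have the same length")
    if not labels:
        raise ValueError("need at least one example")
    wrong = 0
    for row, label in zip(scores, labels):
        # The top-1 prediction is the class with the highest score.
        predicted = max(range(len(row)), key=row.__getitem__)
        if predicted != label:
            wrong += 1
    return 100.0 * wrong / len(labels)


# Toy data (made up): 4 examples, 3 classes.
toy_scores = [
    [0.1, 0.7, 0.2],
    [0.8, 0.1, 0.1],
    [0.3, 0.3, 0.4],
    [0.2, 0.5, 0.3],
]
toy_labels = [1, 0, 1, 1]
print(f"{top1_error_rate(toy_scores, toy_labels):.1f}")
```

One of the four toy predictions is wrong, so the sketch reports an error rate of 25.0. Equivalently, the error rate is 100 minus the top-1 accuracy in percent.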

Results

#	Model	Top-1 Error Rate (%)	Extra Data	SOTA	Paper	Date	Code
1	Model soups (BASIC-L)	3.9	Yes	SOTA	Model soups: averaging weights of multiple fine-tuned models improves accuracy without increasing inference time	2022-03-10	Code
2	Model soups (ViT-G/14)	4.54	Yes	SOTA	Model soups: averaging weights of multiple fine-tuned models improves accuracy without increasing inference time	2022-03-10	Code
3	CAR-FT (CLIP, ViT-L/14@336px)	10.3	Yes	-	Context-Aware Robust Fine-Tuning	2022-11-29	-
4	FAN-Hybrid-L (IN-21K, 384)	28.9	Yes	-	Understanding The Robustness in Vision Transformers	2022-04-26	Code
5	CAFormer-B36 (IN21K, 384)	29.6	Yes	-	MetaFormer Baselines for Vision	2022-10-24	Code
6	LLE (ViT-B/16, SWAG, Edge Aug)	31.3	Yes	-	A Whac-A-Mole Dilemma: Shortcuts Come in Multiples Where Mitigating One Amplifies Others	2022-12-09	Code
7	CAFormer-B36 (IN21K)	31.7	Yes	-	MetaFormer Baselines for Vision	2022-10-24	Code
8	ConvNeXt-XL (Im21k, 384)	31.8	Yes	SOTA	A ConvNet for the 2020s	2022-01-10	Code
9	LLE (ViT-H/14, MAE, Edge Aug)	33.1	No	-	A Whac-A-Mole Dilemma: Shortcuts Come in Multiples Where Mitigating One Amplifies Others	2022-12-09	Code
10	MAE (ViT-H, 448)	33.5	No	SOTA	Masked Autoencoders Are Scalable Vision Learners	2021-11-11	Code
11	ConvFormer-B36 (IN21K, 384)	33.5	Yes	-	MetaFormer Baselines for Vision	2022-10-24	Code
12	MAE+DAT (ViT-H)	34.39	No	-	Enhance the Visual Representation via Discrete Adversarial Training	2022-09-16	Code
13	ConvFormer-B36 (IN21K)	34.7	Yes	-	MetaFormer Baselines for Vision	2022-10-24	Code
14	Discrete Adversarial Distillation (ViT-B,224)	34.9	No	-	Distilling Out-of-Distribution Robustness from Vision-Language Foundation Models	2023-11-02	Code
15	GPaCo (ViT-L)	39.7	No	-	Generalized Parametric Contrastive Learning	2022-09-26	Code
16	VOLO-D5+HAT	40.3	No	-	Improving Vision Transformers by Revisiting High-frequency Components	2022-04-03	Code
17	Pyramid Adversarial Training Improves ViT (Im21k)	42.16	Yes	-	Pyramid Adversarial Training Improves ViT Performance	2021-11-30	Code
18	FAN-L-Hybrid+STL	43.4	No	-	Fully Attentional Networks with Self-emerging Token Labeling	2024-01-08	Code
19	SEER (RegNet10B)	43.9	Yes	-	Vision Models Are More Robust And Fair When Pretrained On Uncurated Images Without Supervision	2022-02-16	Code
20	DiscreteViT	44.74	No	-	Discrete Representations Strengthen Vision Transformer Robustness	2021-11-20	Code
21	CAFormer-B36 (384)	45	No	-	MetaFormer Baselines for Vision	2022-10-24	Code
22	Pyramid Adversarial Training Improves ViT	46.08	No	-	Pyramid Adversarial Training Improves ViT Performance	2021-11-30	Code
23	CAFormer-B36	46.1	No	-	MetaFormer Baselines for Vision	2022-10-24	Code
24	ConvFormer-B36 (384)	47.8	No	-	MetaFormer Baselines for Vision	2022-10-24	Code
25	ConvFormer-B36	48.9	No	-	MetaFormer Baselines for Vision	2022-10-24	Code
26	RVT-B*	51.3	No	SOTA	Towards Robust Vision Transformer	2021-05-17	Code
27	Sequencer2D-L	51.9	No	-	Sequencer: Deep LSTM for Image Classification	2022-05-04	Code
28	RVT-S*	52.3	No	SOTA	Towards Robust Vision Transformer	2021-05-17	Code
29	DeepAugment+AugMix (ResNet-50)	53.2	No	SOTA	The Many Faces of Robustness: A Critical Analysis of Out-of-Distribution Generalization	2020-06-29	Code
30	PRIME with JSD (ResNet-50)	53.7	No	-	PRIME: A few primitives can boost robustness to common corruptions	2021-12-27	Code
31	RVT-Ti*	56.1	No	-	Towards Robust Vision Transformer	2021-05-17	Code
32	PRIME (ResNet-50)	57.1	No	-	PRIME: A few primitives can boost robustness to common corruptions	2021-12-27	Code
33	DeepAugment (ResNet-50)	57.8	No	SOTA	The Many Faces of Robustness: A Critical Analysis of Out-of-Distribution Generalization	2020-06-29	Code
34	Stylized ImageNet (ResNet-50)	58.5	Yes	SOTA	ImageNet-trained CNNs are biased towards texture; increasing shape bias improves accuracy and robustness	2018-11-29	Code
35	AugMix (ResNet-50)	58.9	No	-	AugMix: A Simple Data Processing Method to Improve Robustness and Uncertainty	2019-12-05	Code
36	ResNet-50	63.9	No	SOTA	Deep Residual Learning for Image Recognition	2015-12-10	Code
37	ResNet-152x2-SAM	71.9	No	-	When Vision Transformers Outperform ResNets without Pre-training or Strong Data Augmentations	2021-06-03	Code
38	ViT-B/16-SAM	73.6	No	-	When Vision Transformers Outperform ResNets without Pre-training or Strong Data Augmentations	2021-06-03	Code
39	Mixer-B/8-SAM	76.5	No	-	When Vision Transformers Outperform ResNets without Pre-training or Strong Data Augmentations	2021-06-03	Code