| 1 | Model soups (BASIC-L) | 94.17 | Yes | Model soups: averaging weights of multiple fine-... | 2022-03-10 | Code |
| 2 | Model soups (ViT-G/14) | 92.67 | Yes | Model soups: averaging weights of multiple fine-... | 2022-03-10 | Code |
| 3 | µ2Net+ (ViT-L/16) | 84.53 | Yes | A Continual Development Methodology for Large-sc... | 2022-09-15 | Code |
| 4 | CAR-FT (CLIP, ViT-L/14@336px) | 81.5 | Yes | Context-Aware Robust Fine-Tuning | 2022-11-29 | - |
| 5 | CAFormer-B36 (IN-21K, 384) | 79.5 | Yes | MetaFormer Baselines for Vision | 2022-10-24 | Code |
| 6 | MAE (ViT-H, 448) | 76.7 | No | Masked Autoencoders Are Scalable Vision Learners | 2021-11-11 | Code |
| 7 | FAN-Hybrid-L (IN-21K, 384) | 74.5 | Yes | Understanding The Robustness in Vision Transform... | 2022-04-26 | Code |
| 8 | ConvFormer-B36 (IN-21K, 384) | 73.5 | Yes | MetaFormer Baselines for Vision | 2022-10-24 | Code |
| 9 | CAFormer-B36 (IN-21K) | 69.4 | Yes | MetaFormer Baselines for Vision | 2022-10-24 | Code |
| 10 | ConvNeXt-XL (Im21k, 384) | 69.3 | Yes | A ConvNet for the 2020s | 2022-01-10 | Code |
| 11 | MAE+DAT (ViT-H) | 68.92 | No | Enhance the Visual Representation via Discrete A... | 2022-09-16 | Code |
| 12 | ConvFormer-B36 (IN-21K) | 63.3 | Yes | MetaFormer Baselines for Vision | 2022-10-24 | Code |
| 13 | Pyramid Adversarial Training Improves ViT (Im21k) | 62.44 | Yes | Pyramid Adversarial Training Improves ViT Perfor... | 2021-11-30 | Code |
| 14 | CAFormer-B36 (384) | 61.9 | No | MetaFormer Baselines for Vision | 2022-10-24 | Code |
| 15 | TransNeXt-Base (IN-1K supervised, 384) | 61.6 | No | TransNeXt: Robust Foveal Visual Perception for V... | 2023-11-28 | Code |
| 16 | TransNeXt-Small (IN-1K supervised, 384) | 58.3 | No | TransNeXt: Robust Foveal Visual Perception for V... | 2023-11-28 | Code |
| 17 | ConvFormer-B36 (384) | 55.3 | No | MetaFormer Baselines for Vision | 2022-10-24 | Code |
| 18 | SEER (RegNet10B) | 52.7 | Yes | Vision Models Are More Robust And Fair When Pret... | 2022-02-16 | Code |
| 19 | TransNeXt-Base (IN-1K supervised, 224) | 50.6 | No | TransNeXt: Robust Foveal Visual Perception for V... | 2023-11-28 | Code |
| 20 | CAFormer-B36 | 48.5 | No | MetaFormer Baselines for Vision | 2022-10-24 | Code |
| 21 | TransNeXt-Small (IN-1K supervised, 224) | 47.1 | No | TransNeXt: Robust Foveal Visual Perception for V... | 2023-11-28 | Code |
| 22 | FAN-L-Hybrid+STL | 46.1 | No | Fully Attentional Networks with Self-emerging To... | 2024-01-08 | Code |
| 23 | ConvFormer-B36 | 40.1 | No | MetaFormer Baselines for Vision | 2022-10-24 | Code |
| 24 | Pyramid Adversarial Training Improves ViT (384x384) | 36.41 | No | Pyramid Adversarial Training Improves ViT Perfor... | 2021-11-30 | Code |
| 25 | Sequencer2D-L | 35.5 | No | Sequencer: Deep LSTM for Image Classification | 2022-05-04 | Code |
| 26 | Discrete Adversarial Distillation (ViT-B/224) | 31.8 | No | Distilling Out-of-Distribution Robustness from V... | 2023-11-02 | Code |
| 27 | Diffusion Classifier | 30.2 | No | Your Diffusion Model is Secretly a Zero-Shot Cla... | 2023-03-28 | Code |
| 28 | RVT-B* | 28.5 | No | Towards Robust Vision Transformer | 2021-05-17 | Code |
| 29 | RVT-S* | 25.7 | No | Towards Robust Vision Transformer | 2021-05-17 | Code |
| 30 | RVT-Ti* | 14.4 | No | Towards Robust Vision Transformer | 2021-05-17 | Code |
| 31 | GFNet-S | 14.3 | No | Global Filter Networks for Image Classification | 2021-07-01 | Code |
| 32 | CutMix+MoEx (ResNet-50) | 8.4 | No | On Feature Normalization and Data Augmentation | 2020-02-25 | Code |
| 33 | Discrete Adversarial Distillation (ResNet-50) | 7.7 | No | Distilling Out-of-Distribution Robustness from V... | 2023-11-02 | Code |
| 34 | CutMix (ResNet-50) | 7.3 | No | CutMix: Regularization Strategy to Train Strong ... | 2019-05-13 | Code |
| 35 | Mixup (ResNet-50) | 6.6 | No | mixup: Beyond Empirical Risk Minimization | 2017-10-25 | Code |
| 36 | Cutout (ResNet-50) | 4.4 | No | Improved Regularization of Convolutional Neural ... | 2017-08-15 | Code |
| 37 | ResNet-50 (300 Epochs) | 4.2 | No | Deep Residual Learning for Image Recognition | 2015-12-10 | Code |
| 38 | Stylized ImageNet (ResNet-50) | 2.3 | Yes | ImageNet-trained CNNs are biased towards texture... | 2018-11-29 | Code |
| 39 | ResNet-50 | 0 | No | - | - | Code |