Metric: Top-5 Accuracy (higher is better)
| # | Model | Top-5 Accuracy (%) | Extra Data | Paper | Date | Code |
|---|---|---|---|---|---|---|
| 1 | ViT-H/14 | 82.1 | Yes | An Image is Worth 16x16 Words: Transformers for ... | 2020-10-22 | Code |
| 2 | BiT-L (ResNet-152x4) | 80 | Yes | Big Transfer (BiT): General Visual Representatio... | 2019-12-24 | Code |
| 3 | AR-L (Opt Relevance) | 73.5 | Yes | Optimizing Relevance Maps of Vision Transformers... | 2022-06-02 | Code |
| 4 | AR-B (Opt Relevance) | 70 | Yes | Optimizing Relevance Maps of Vision Transformers... | 2022-06-02 | Code |
| 5 | BiT-M (ResNet-152x4) | 69 | Yes | Big Transfer (BiT): General Visual Representatio... | 2019-12-24 | Code |
| 6 | AR-L | 68.3 | Yes | Optimizing Relevance Maps of Vision Transformers... | 2022-06-02 | Code |
| 7 | ViT-L (Opt Relevance) | 65.8 | Yes | Optimizing Relevance Maps of Vision Transformers... | 2022-06-02 | Code |
| 8 | ViT-B (Opt Relevance) | 65.1 | Yes | Optimizing Relevance Maps of Vision Transformers... | 2022-06-02 | Code |
| 9 | AR-B | 63.7 | Yes | Optimizing Relevance Maps of Vision Transformers... | 2022-06-02 | Code |
| 10 | AR-S (Opt Relevance) | 61.7 | Yes | Optimizing Relevance Maps of Vision Transformers... | 2022-06-02 | Code |
| 11 | ResNet-152 + GenInt with Transfer | 61.43 | Yes | Generative Interventions for Causal Learning | 2020-12-22 | Code |
| 12 | ViT-L | 59.5 | Yes | Optimizing Relevance Maps of Vision Transformers... | 2022-06-02 | Code |
| 13 | BiT-S (ResNet-152x4) | 57 | Yes | Big Transfer (BiT): General Visual Representatio... | 2019-12-24 | Code |
| 14 | DeiT-L (Opt Relevance) | 56.6 | Yes | Optimizing Relevance Maps of Vision Transformers... | 2022-06-02 | Code |
| 15 | ViT-B | 56.4 | Yes | Optimizing Relevance Maps of Vision Transformers... | 2022-06-02 | Code |
| 16 | NASNet-A | 56.05 | Yes | - | - | - |
| 17 | AR-S | 55.8 | Yes | Optimizing Relevance Maps of Vision Transformers... | 2022-06-02 | Code |
| 18 | PNASNet-5L | 54.95 | Yes | - | - | - |
| 19 | DeiT-S (Opt Relevance) | 53 | Yes | Optimizing Relevance Maps of Vision Transformers... | 2022-06-02 | Code |
| 20 | Inception-v4 | 51.98 | Yes | - | - | - |
| 21 | ResNet-50 + GroupNorm | 50.2 | Yes | Improving robustness against common corruptions ... | 2020-06-30 | Code |
| 22 | ResNet-50 + CGC | 50.16 | Yes | Context-Gated Convolution | 2019-10-12 | Code |
| 23 | ResNet-152 | 49.4 | Yes | - | - | - |
| 24 | SeLa(v2) (reverse linear probing) | 48.83 | Yes | - | - | - |
| 25 | ResNet-50 + FixUp | 48.6 | Yes | Improving robustness against common corruptions ... | 2020-06-30 | Code |
| 26 | DeiT-L | 48.5 | Yes | Optimizing Relevance Maps of Vision Transformers... | 2022-06-02 | Code |
| 27 | ResNet-18 + GenInt with Transfer | 48.02 | Yes | Generative Interventions for Causal Learning | 2020-12-22 | Code |
| 28 | DeiT-S | 47.3 | Yes | Optimizing Relevance Maps of Vision Transformers... | 2022-06-02 | Code |
| 29 | DeepCluster(v2) (reverse linear probing) | 46.81 | Yes | - | - | - |
| 30 | SwAV (reverse linear probing) | 43.64 | Yes | - | - | - |
| 31 | VGG-14 | 37.15 | Yes | - | - | - |
| 32 | OBoW (reverse linear probing) | 31.72 | Yes | - | - | - |
| 33 | MoCHi (reverse linear probing) | 31.71 | Yes | - | - | - |
| 34 | MoCo(v2) (reverse linear probing) | 31.45 | Yes | - | - | - |
| 35 | ResNet-152 (FRCNN-ag-ad, VOC) | 29.7 | Yes | Class-agnostic Object Detection | 2020-11-28 | - |
| 36 | AlexNet | 17.6 | Yes | - | - | - |
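
For readers unfamiliar with the metric, the following is a minimal sketch of how top-5 accuracy is commonly computed: a prediction counts as correct when the true label is among the five highest-scoring classes. The function name `top_k_accuracy` and its inputs are illustrative assumptions, not taken from any of the listed papers or their code, and the exact evaluation protocol behind each entry above may differ.

```python
from typing import Sequence


def top_k_accuracy(
    scores: Sequence[Sequence[float]],
    labels: Sequence[int],
    k: int = 5,
) -> float:
    """Return the percentage of samples whose true label is among the
    k highest-scoring classes. (Hypothetical helper for illustration.)"""
    if len(scores) != len(labels):
        raise ValueError("scores and labels must have the same length")
    if len(labels) == 0:
        raise ValueError("at least one sample is required")

    hits = 0
    for row, label in zip(scores, labels):
        # Class indices sorted by descending score, truncated to the top k.
        top_k = sorted(range(len(row)), key=lambda c: row[c], reverse=True)[:k]
        if label in top_k:
            hits += 1

    # Report as a percentage, matching the scale used in the table.
    return 100.0 * hits / len(labels)


if __name__ == "__main__":
    # Two toy samples over six classes (made-up scores, not real model output).
    toy_scores = [
        [0.1, 0.2, 0.3, 0.4, 0.5, 0.6],  # class 0 ranks last, outside the top 5
        [0.6, 0.5, 0.4, 0.3, 0.2, 0.1],  # class 0 ranks first
    ]
    toy_labels = [0, 0]
    print(top_k_accuracy(toy_scores, toy_labels, k=5))
```

Sorting every row is simple but costs O(C log C) per sample for C classes; a partial selection would be cheaper for large label spaces, at the price of slightly more code.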