Metric: Top 1 Accuracy (higher is better)
| # | Model | Top 1 Accuracy | Extra Data | Paper | Date | Code |
|---|---|---|---|---|---|---|
| 1 | ViT-MoE-15B (Every-2) | 68.66 | No | Scaling Vision with Sparse Mixture of Experts | 2021-06-10 | Code |
| 2 | MAWS (ViT-6.5B) | 63.6 | Yes | The effectiveness of MAE pre-pretraining for bil... | 2023-03-23 | Code |
| 3 | V-MoE-H/14 (Every-2) | 63.38 | No | Scaling Vision with Sparse Mixture of Experts | 2021-06-10 | Code |
| 4 | V-MoE-H/14 (Last-5) | 62.95 | No | Scaling Vision with Sparse Mixture of Experts | 2021-06-10 | Code |
| 5 | V-MoE-L/16 (Every-2) | 62.41 | No | Scaling Vision with Sparse Mixture of Experts | 2021-06-10 | Code |
| 6 | ViT-H/14 | 62.34 | No | Scaling Vision with Sparse Mixture of Experts | 2021-06-10 | Code |
| 7 | MAWS (ViT-2B) | 62.1 | Yes | The effectiveness of MAE pre-pretraining for bil... | 2023-03-23 | Code |
| 8 | MAWS (ViT-H) | 57.1 | Yes | The effectiveness of MAE pre-pretraining for bil... | 2023-03-23 | Code |