| Rank | Model | Top-1 Accuracy (%) | Extra Training Data | Paper | Date | Code |
| --- | --- | --- | --- | --- | --- | --- |
| 1 | Unicom (ViT-L/14@336px) (Finetuned) | 88.3 | No | Unicom: Universal and Compact Representation Lea... | 2023-04-12 | Code |
| 2 | Bamboo (Bamboo-H) | 87.1 | No | A Study on Transformer Configuration and Trainin... | 2022-05-21 | - |
| 3 | DINOv2+reg (ViT-g/14) | 87.1 | Yes | Vision Transformers Need Registers | 2023-09-28 | Code |
| 4 | Bamboo (Bamboo-L) | 86.3 | No | A Study on Transformer Configuration and Trainin... | 2022-05-21 | - |
| 5 | TinySaver (ConvNeXtV2_h, 0.01 Acc drop) | 86.24 | No | Tiny Models are the Computational Saver for Larg... | 2024-03-26 | Code |
| 6 | Refiner-ViT-L | 86.03 | No | Refiner: Refining Self-attention for Vision Tran... | 2021-06-07 | Code |
| 7 | TinySaver (ConvNeXtV2_h, 0.5 Acc drop) | 85.75 | No | Tiny Models are the Computational Saver for Larg... | 2024-03-26 | Code |
| 8 | TinySaver (Swin_large, 0.5 Acc drop) | 85.74 | No | Tiny Models are the Computational Saver for Larg... | 2024-03-26 | Code |
| 9 | TinySaver (Swin_large, 1.0 Acc drop) | 85.24 | No | Tiny Models are the Computational Saver for Larg... | 2024-03-26 | Code |
| 10 | Bamboo (Bamboo-B) | 84.2 | No | A Study on Transformer Configuration and Trainin... | 2022-05-21 | - |
| 11 | AIM-7B | 84 | No | Scalable Pre-training of Large Autoregressive Im... | 2024-01-16 | Code |
| 12 | DynamicViT-LV-M/0.8 | 83.9 | No | DynamicViT: Efficient Vision Transformers with D... | 2021-06-03 | Code |
| 13 | TinySaver (EfficientFormerV2_l, 0.01 Acc drop) | 83.52 | No | Tiny Models are the Computational Saver for Larg... | 2024-03-26 | Code |
| 14 | KAT-B* | 82.8 | No | Kolmogorov-Arnold Transformer | 2024-09-16 | Code |
| 15 | ReViT-B | 82.4 | No | ReViT: Enhancing Vision Transformers Feature Div... | 2024-02-17 | Code |
| 16 | ConvNeXt-T-Hermite | 82.34 | No | Polynomial, trigonometric, and tropical activati... | 2025-02-03 | Code |
| 17 | ConvMixer-1536/20 | 82.2 | No | Patches Are All You Need? | 2022-01-24 | Code |
| 18 | DIFFQ (λ=1e−2) | 82 | No | Differentiable Model Compression via Pseudo Quan... | 2021-04-20 | Code |
| 19 | DeiT-B | 81.8 | No | Kolmogorov-Arnold Transformer | 2024-09-16 | Code |
| 20 | EsViT (Swin-B) | 81.3 | No | Efficient Self-supervised Vision Transformers fo... | 2021-06-17 | Code |
| 21 | SimpleNetV1-9m-correct-labels | 81.24 | No | Lets keep it simple, Using simple architectures ... | 2016-08-22 | Code |
| 22 | ResNeXt-101 (Debiased+CutMix) | 81.2 | No | Shape-Texture Debiased Neural Network Training | 2020-10-12 | Code |
| 23 | EsViT (Swin-S) | 80.8 | No | Efficient Self-supervised Vision Transformers fo... | 2021-06-17 | Code |
| 24 | SimpleNetV1-5m-correct-labels | 79.12 | No | Lets keep it simple, Using simple architectures ... | 2016-08-22 | Code |
| 25 | ViT-B/16 | 79.1 | No | Kolmogorov-Arnold Transformer | 2024-09-16 | Code |
| 26 | Inception V3 | 78.8 | No | - | - | - |
| 27 | CSAT | 78.6 | No | - | - | - |
| 28 | ConvMLP-S | 76.8 | No | ConvMLP: Hierarchical Convolutional MLPs for Vis... | 2021-09-09 | Code |
| 29 | VGG | 76.3 | Yes | - | - | - |
| 30 | ELP (naive ResNet50) | 76.13 | No | - | - | Code |
| 31 | SimpleNetV1-small-075-correct-labels | 75.66 | No | Lets keep it simple, Using simple architectures ... | 2016-08-22 | Code |
| 32 | FF | 74.9 | No | Do You Even Need Attention? A Stack of Feed-Forw... | 2021-05-06 | Code |
| 33 | SimpleNetV1-9m | 74.17 | No | Lets keep it simple, Using simple architectures ... | 2016-08-22 | Code |
| 34 | VICReg (ResNet50) | 73.2 | No | VICReg: Variance-Invariance-Covariance Regulariz... | 2021-05-11 | Code |
| 35 | I-VNE+ (ResNet-50) | 72.1 | No | VNE: An Effective Method for Improving Deep Repr... | 2023-04-04 | Code |
| 36 | SimpleNetV1-5m | 71.94 | No | Lets keep it simple, Using simple architectures ... | 2016-08-22 | Code |
| 37 | Dspike (VGG-16) | 71.24 | Yes | - | - | - |
| 38 | PSN (SEW ResNet-34) | 70.54 | No | - | - | - |
| 39 | GAC-SNN (MS-ResNet-34) | 70.42 | No | Gated Attention Coding for Training High-perform... | 2023-08-12 | Code |
| 40 | SimpleNetV1-small-05-correct-labels | 69.11 | No | Lets keep it simple, Using simple architectures ... | 2016-08-22 | Code |
| 41 | SimpleNetV1-small-075 | 68.15 | No | Lets keep it simple, Using simple architectures ... | 2016-08-22 | Code |
| 42 | PSN (SEW ResNet-18) | 67.63 | No | - | - | - |
| 43 | OverFeat | 66.04 | Yes | - | - | - |
| 44 | DGPPF-ResNet18 | 65.22 | No | - | - | Code |
| 45 | AlexNet | 63.3 | Yes | - | - | - |
| 46 | SimpleNetV1-small-05 | 61.52 | No | Lets keep it simple, Using simple architectures ... | 2016-08-22 | Code |
| 47 | DeepCluster (AlexNet) | 41 | No | Deep Clustering for Unsupervised Learning of Vis... | 2018-07-15 | Code |
| 48 | Colorisation (improved) (ResNet-101) | 39.6 | No | Multi-task Self-Supervised Visual Learning | 2017-08-25 | - |
| 49 | NF-ResNet-50 | 39.2 | No | TAN Without a Burn: Scaling Laws of DP-SGD | 2022-10-07 | Code |
| 50 | Rotation (AlexNet) | 38.7 | No | Unsupervised Representation Learning by Predicti... | 2018-03-21 | Code |
| 51 | Counting (AlexNet) | 34.3 | No | Representation Learning by Learning to Count | 2017-08-22 | Code |
| 52 | NF-ResNet-50 | 32.4 | No | Unlocking High-Accuracy Differentially Private I... | 2022-04-28 | Code |
| 53 | ResNet-18 | 6.9 | No | Toward Training at ImageNet Scale with Different... | 2022-01-28 | Code |
| 54 | ResNet-50 | 5 | No | Toward Training at ImageNet Scale with Different... | 2022-01-28 | Code |
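The accuracy column above is standard ImageNet-1k top-1 accuracy on the validation split: the fraction of images whose highest-scoring class matches the ground-truth label. As a minimal sketch of how such a number is computed, the snippet below evaluates a torchvision classifier under the usual resize-256/center-crop-224 protocol; the ResNet-50 checkpoint and dataset path are illustrative stand-ins, not the exact evaluation pipelines used by the entries in this table.

```python
import torch
from torchvision import datasets, models, transforms

# Standard ImageNet eval preprocessing: resize shorter side to 256, center-crop 224.
preprocess = transforms.Compose([
    transforms.Resize(256),
    transforms.CenterCrop(224),
    transforms.ToTensor(),
    transforms.Normalize(mean=[0.485, 0.456, 0.406],
                         std=[0.229, 0.224, 0.225]),
])

# Illustrative model choice; any pretrained ImageNet classifier works here.
model = models.resnet50(weights=models.ResNet50_Weights.IMAGENET1K_V1).eval()

# Placeholder path: assumes a local copy of the ImageNet-1k validation split.
val_set = datasets.ImageNet(root="/path/to/imagenet", split="val",
                            transform=preprocess)
loader = torch.utils.data.DataLoader(val_set, batch_size=64, num_workers=4)

correct = total = 0
with torch.no_grad():
    for images, labels in loader:
        preds = model(images).argmax(dim=1)   # top-1 prediction per image
        correct += (preds == labels).sum().item()
        total += labels.numel()

print(f"Top-1 accuracy: {100.0 * correct / total:.2f}%")
```

Note that the published figures also depend on each method's own input resolution and preprocessing (e.g., ViT-L/14@336px is evaluated at 336x336), so this fixed 224-pixel protocol will not reproduce every row exactly.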