| 1 | ViT-H/14 | 99.5 | Yes | An Image is Worth 16x16 Words: Transformers for ... | 2020-10-22 | Code |
| 2 | DINOv2 (ViT-g/14, frozen model, linear eval) | 99.5 | Yes | DINOv2: Learning Robust Visual Features without ... | 2023-04-14 | Code |
| 3 | µ2Net (ViT-L/16) | 99.49 | Yes | An Evolutionary Approach to Dynamic Introduction... | 2022-05-25 | Code |
| 4 | ViT-L/16 | 99.42 | Yes | An Image is Worth 16x16 Words: Transformers for ... | 2020-10-22 | Code |
| 5 | CaiT-M-36 U 224 | 99.4 | Yes | Going deeper with Image Transformers | 2021-03-31 | Code |
| 6 | CvT-W24 | 99.39 | Yes | CvT: Introducing Convolutions to Vision Transfor... | 2021-03-29 | Code |
| 7 | BiT-L (ResNet) | 99.37 | Yes | Big Transfer (BiT): General Visual Representatio... | 2019-12-24 | Code |
| 8 | RDNet-L (224 res, IN-1K pretrained) | 99.31 | Yes | DenseNets Reloaded: Paradigm Shift Beyond ResNet... | 2024-03-28 | Code |
| 9 | RDNet-B (224 res, IN-1K pretrained) | 99.31 | Yes | DenseNets Reloaded: Paradigm Shift Beyond ResNet... | 2024-03-28 | Code |
| 10 | ViT-B (attn fine-tune) | 99.3 | Yes | Three things everyone should know about Vision T... | 2022-03-18 | Code |
| 11 | Heinsen Routing + BEiT-large 16 224 | 99.2 | Yes | An Algorithm for Routing Vectors in Sequences | 2022-11-20 | Code |
| 12 | ViT-B/16 (PUGD) | 99.13 | Yes | Perturbated Gradients Updating within Unit Space... | 2021-10-01 | Code |
| 13 | Astroformer | 99.12 | Yes | Astroformer: More Data Might not be all you need... | 2023-04-03 | Code |
| 14 | DeiT-B | 99.1 | Yes | Training data-efficient image transformers & dis... | 2020-12-23 | Code |
| 15 | TNT-B | 99.1 | Yes | Transformer in Transformer | 2021-02-27 | Code |
| 16 | CeiT-S (384 finetune resolution) | 99.1 | Yes | Incorporating Convolution Designs into Visual Tr... | 2021-03-22 | Code |
| 17 | EfficientNetV2-L | 99.1 | Yes | EfficientNetV2: Smaller Models and Faster Training | 2021-04-01 | Code |
| 18 | AutoFormer-S \| 384 | 99.1 | Yes | AutoFormer: Searching Transformers for Visual Re... | 2021-07-01 | Code |
| 19 | VIT-L/16 (Spinal FC, Background) | 99.05 | No | Reduction of Class Activation Uncertainty with B... | 2023-05-05 | Code |
| 20 | LaNet | 99.03 | No | - | - | Code |
| 21 | GPIPE + transfer learning | 99 | Yes | GPipe: Efficient Training of Giant Neural Networ... | 2018-11-16 | Code |
| 22 | TResNet-XL | 99 | No | TResNet: High Performance GPU-Dedicated Architec... | 2020-03-30 | Code |
| 23 | CeiT-S | 99 | No | Incorporating Convolution Designs into Visual Tr... | 2021-03-22 | Code |
| 24 | EfficientNetV2-M | 99 | No | EfficientNetV2: Smaller Models and Faster Training | 2021-04-01 | Code |
| 25 | GFNet-H-B | 99 | Yes | Global Filter Networks for Image Classification | 2021-07-01 | Code |
| 26 | BiT-M (ResNet) | 98.91 | No | Big Transfer (BiT): General Visual Representatio... | 2019-12-24 | Code |
| 27 | EfficientNet-B7 | 98.9 | No | EfficientNet: Rethinking Model Scaling for Convo... | 2019-05-28 | Code |
| 28 | RDNet-T (224 res, IN-1K pretrained) | 98.88 | No | DenseNets Reloaded: Paradigm Shift Beyond ResNet... | 2024-03-28 | Code |
| 29 | PyramidNet-272, S=4 | 98.71 | No | Towards Better Accuracy-efficiency Trade-offs: D... | 2020-11-30 | Code |
| 30 | EfficientNetV2-S | 98.7 | No | EfficientNetV2: Smaller Models and Faster Training | 2021-04-01 | Code |
| 31 | ASF-former-S | 98.7 | No | Adaptive Split-Fusion Transformer | 2022-04-26 | Code |
| 32 | PyramidNet-272 (ASAM) | 98.68 | No | ASAM: Adaptive Sharpness-Aware Minimization for ... | 2021-02-23 | Code |
| 33 | PyramidNet + ShakeDrop + Fast AA + FMix | 98.64 | No | FMix: Enhancing Mixed Sample Data Augmentation | 2020-02-27 | Code |
| 34 | ViT-B/16-SAM | 98.6 | No | When Vision Transformers Outperform ResNets with... | 2021-06-03 | Code |
| 35 | ConvMLP-M | 98.6 | Yes | ConvMLP: Hierarchical Convolutional MLPs for Vis... | 2021-09-09 | Code |
| 36 | ConvMLP-L | 98.6 | Yes | ConvMLP: Hierarchical Convolutional MLPs for Vis... | 2021-09-09 | Code |
| 37 | DVT (T2T-ViT-24) | 98.53 | No | Not All Images are Worth 16x16 Words: Dynamic Tr... | 2021-05-31 | Code |
| 38 | E2E-3M | 98.52 | No | Rethinking Recurrent Neural Networks and Other I... | 2020-07-30 | Code |
| 39 | CeiT-T | 98.5 | No | Incorporating Convolution Designs into Visual Tr... | 2021-03-22 | Code |
| 40 | NAT-M4 | 98.4 | Yes | Neural Architecture Transfer | 2020-05-12 | Code |
| 41 | WRN-40-10, S=4 | 98.38 | No | Towards Better Accuracy-efficiency Trade-offs: D... | 2020-11-30 | Code |
| 42 | WRN-28-10, S=4 | 98.32 | No | Towards Better Accuracy-efficiency Trade-offs: D... | 2020-11-30 | Code |
| 43 | Shake-Shake 26 2x96d, S=4 | 98.31 | No | Towards Better Accuracy-efficiency Trade-offs: D... | 2020-11-30 | Code |
| 44 | Dynamics 2 | 98.31 | No | PSO-Convolutional Neural Networks with Heterogen... | 2022-05-20 | Code |
| 45 | PyramidNet+ShakeDrop (Fast AA) | 98.3 | No | Fast AutoAugment | 2019-05-01 | Code |
| 46 | ResNet50 (A1) | 98.3 | No | ResNet strikes back: An improved training proced... | 2021-10-01 | Code |
| 47 | NoisyDARTS-A-t | 98.28 | No | Noisy Differentiable Architecture Search | 2020-05-07 | Code |
| 48 | NAT-M3 | 98.2 | Yes | Neural Architecture Transfer | 2020-05-12 | Code |
| 49 | LeViT-192 | 98.2 | No | LeViT: a Vision Transformer in ConvNet's Clothin... | 2021-04-02 | Code |
| 50 | ResNet-152-SAM | 98.2 | No | When Vision Transformers Outperform ResNets with... | 2021-06-03 | Code |
| 51 | ViT-S/16-SAM | 98.2 | No | When Vision Transformers Outperform ResNets with... | 2021-06-03 | Code |
| 52 | Bamboo (ViT-B/16) | 98.2 | No | Bamboo: Building Mega-Scale Vision Dataset Conti... | 2022-03-15 | Code |
| 53 | DE ELBo (ViT-B/16) | 98.2 | No | Learning Hyperparameters via a Data-Emphasized V... | 2025-02-03 | Code |
| 54 | LeViT-256 | 98.1 | Yes | LeViT: a Vision Transformer in ConvNet's Clothin... | 2021-04-02 | Code |
| 55 | PyramidNet + AA (AMP) | 98.02 | No | Regularizing Neural Networks via Adversarial Mod... | 2020-10-10 | Code |
| 56 | EnAET | 98.01 | No | EnAET: A Self-Trained framework for Semi-Supervi... | 2019-11-21 | Code |
| 57 | MUXNet-m | 98 | No | MUXConv: Information Multiplexing in Convolution... | 2020-03-31 | Code |
| 58 | LeViT-384 | 98 | No | LeViT: a Vision Transformer in ConvNet's Clothin... | 2021-04-02 | Code |
| 59 | CCT-7/3x1* | 98 | No | Escaping the Big Data Paradigm with Compact Tran... | 2021-04-12 | Code |
| 60 | ConvMLP-S | 98 | Yes | ConvMLP: Hierarchical Convolutional MLPs for Vis... | 2021-09-09 | Code |
| 61 | Proxyless-G + c/o | 97.92 | No | ProxylessNAS: Direct Neural Architecture Search ... | 2018-12-02 | Code |
| 62 | NAT-M2 | 97.9 | Yes | Neural Architecture Transfer | 2020-05-12 | Code |
| 63 | WRN-28-10+AutoDropout+RandAugment | 97.9 | No | AutoDropout: Learning Dropout Patterns to Regula... | 2021-01-05 | Code |
| 64 | SENet + ShakeShake + Cutout | 97.88 | No | Squeeze-and-Excitation Networks | 2017-09-05 | Code |
| 65 | HCGNet-A3 | 97.86 | No | Gated Convolutional Networks with Hybrid Connect... | 2019-08-26 | Code |
| 66 | Wide-ResNet-28-10 | 97.85 | No | Automatic Data Augmentation via Invariance-Const... | 2022-09-29 | Code |
| 67 | ResNeXt-50 (AutoMix) | 97.84 | No | AutoMix: Unveiling the Power of Mixup for Strong... | 2021-03-24 | Code |
| 68 | ResNet-152x4-AGC (ImageNet-21K) | 97.82 | No | Effect of Pre-Training Scale on Intra- and Inter... | 2021-05-31 | Code |
| 69 | Mixer-B/16-SAM | 97.8 | No | When Vision Transformers Outperform ResNets with... | 2021-06-03 | Code |
| 70 | CCT-7/3x1+VTM | 97.78 | No | TokenMixup: Efficient Attention-guided Token-lev... | 2022-10-14 | Code |
| 71 | WRN-28-10 | 97.73 | No | MixMo: Mixing Multiple Inputs for Multiple Outpu... | 2021-03-10 | Code |
| 72 | HCGNet-A2 | 97.71 | No | Gated Convolutional Networks with Hybrid Connect... | 2019-08-26 | Code |
| 73 | WRN + fixup init + mixup + cutout | 97.7 | No | Fixup Initialization: Residual Learning Without ... | 2019-01-27 | Code |
| 74 | NoisyDARTS-a | 97.61 | No | Noisy Differentiable Architecture Search | 2020-05-07 | Code |
| 75 | TransBoost-ResNet50 | 97.61 | No | TransBoost: Improving the Best ImageNet Performa... | 2022-05-26 | Code |
| 76 | LeViT-128 | 97.6 | No | LeViT: a Vision Transformer in ConvNet's Clothin... | 2021-04-02 | Code |
| 77 | DenseNet-BC-190 + batchboost | 97.54 | No | batchboost: regularization for stabilizing train... | 2020-01-21 | Code |
| 78 | LeViT-128S | 97.5 | Yes | LeViT: a Vision Transformer in ConvNet's Clothin... | 2021-04-02 | Code |
| 79 | Shared WRN | 97.47 | No | Learning Implicitly Recurrent CNNs Through Param... | 2019-02-26 | Code |
| 80 | Manifold Mixup WRN 28-10 | 97.45 | No | Manifold Mixup: Better Representations by Interp... | 2018-06-13 | Code |
| 81 | WRN 28-14 | 97.45 | No | Neural networks with late-phase weights | 2020-07-25 | Code |
| 82 | SparseSwin | 97.43 | No | SparseSwin: Swin Transformer with Sparse Transfo... | 2023-09-11 | Code |
| 83 | WRN-28-10 with reSGHMC | 97.42 | No | Non-convex Learning via Replica Exchange Stochas... | 2020-08-12 | Code |
| 84 | NAT-M1 | 97.4 | Yes | Neural Architecture Transfer | 2020-05-12 | Code |
| 85 | ResNet-50-SAM | 97.4 | Yes | When Vision Transformers Outperform ResNets with... | 2021-06-03 | Code |
| 86 | DenseNet-BC-190 + Mixup | 97.3 | Yes | mixup: Beyond Empirical Risk Minimization | 2017-10-25 | Code |
| 87 | kNN-CLIP | 97.3 | Yes | Revisiting a kNN-based Image Classification Syst... | 2022-04-03 | - |
| 88 | WaveMixLite-144/7 | 97.29 | No | WaveMix: A Resource-efficient Neural Network for... | 2022-05-28 | Code |
| 89 | Transformer local-attention (NesT-B) | 97.2 | No | Nested Hierarchical Transformer: Towards Accurat... | 2021-05-26 | Code |
| 90 | ShakeShake-2x64d + SWA | 97.12 | No | Averaging Weights Leads to Wider Optima and Bett... | 2018-03-14 | Code |
| 91 | PyramidNet-200 + CutMix | 97.12 | No | CutMix: Regularization Strategy to Train Strong ... | 2019-05-13 | Code |
| 92 | Wide-ResNet-40-2 | 97.05 | No | Automatic Data Augmentation via Invariance-Const... | 2022-09-29 | Code |
| 93 | ORN | 97.02 | No | Oriented Response Networks | 2017-01-07 | Code |
| 94 | WRN-16-8 with reSGHMC | 96.87 | No | Non-convex Learning via Replica Exchange Stochas... | 2020-08-12 | Code |
| 95 | ResNet_XnIDR | 96.87 | No | XnODR and XnIDR: Two Accurate and Fast Fully Con... | 2021-11-21 | Code |
| 96 | HCGNet-A1 | 96.85 | No | Gated Convolutional Networks with Hybrid Connect... | 2019-08-26 | Code |
| 97 | WRN 28-10 | 96.81 | No | Neural networks with late-phase weights | 2020-07-25 | Code |
| 98 | AutoDropout | 96.8 | No | AutoDropout: Learning Dropout Patterns to Regula... | 2021-01-05 | Code |
| 99 | WRN-28-10 + SWA | 96.79 | No | Averaging Weights Leads to Wider Optima and Bett... | 2018-03-14 | Code |
| 100 | ConvMixer-256/16 | 96.74 | No | Patches Are All You Need? | 2022-01-24 | Code |
| 101 | EXACT (WRN-28-10) | 96.73 | No | EXACT: How to Train Your Accuracy | 2022-05-19 | Code |
| 102 | Wide ResNet+cutout | 96.71 | No | Single-bit-per-weight deep convolutional neural ... | 2019-07-16 | Code |
| 103 | Deep pyramidal residual network | 96.69 | No | Deep Pyramidal Residual Networks | 2016-10-10 | Code |
| 104 | CoPaNet-R-164 | 96.62 | No | Deep Competitive Pathway Networks | 2017-09-29 | Code |
| 105 | DenseNet (DenseNet-BC-190) | 96.54 | No | Densely Connected Convolutional Networks | 2016-08-25 | Code |
| 106 | SKNet-29 (ResNeXt-29, 16×32d) | 96.53 | No | Selective Kernel Networks | 2019-03-15 | Code |
| 107 | Fractional MP | 96.5 | No | Fractional Max-Pooling | 2014-12-18 | Code |
| 108 | PDO-eConv (p8, 4.6M) | 96.5 | No | PDO-eConvs: Partial Differential Operator Based ... | 2020-07-20 | Code |
| 109 | UPANets | 96.47 | No | UPANets: Learning from the Universal Pixel Atten... | 2021-03-15 | Code |
| 110 | GAC-SNN | 96.46 | No | Gated Attention Coding for Training High-perform... | 2023-08-12 | Code |
| 111 | ViT (lightweight, MAE pretrained) | 96.41 | No | Pre-training of Lightweight Vision Transformers ... | 2024-02-06 | - |
| 112 | NAS-RL | 96.4 | No | Neural Architecture Search with Reinforcement Le... | 2016-11-05 | Code |
| 113 | VGG11B(2x) + LocalLearning + CO | 96.4 | No | Training Neural Networks with Local Error Signals | 2019-01-20 | Code |
| 114 | ABNet-2G-R3-Combined | 96.378 | No | ANDHRA Bandersnatch: Training Neural Networks to... | 2024-11-28 | Code |
| 115 | Residual Gates + WRN | 96.35 | No | Learning Identity Mappings with Residual Gates | 2016-11-04 | - |
| 116 | PDO-eConv (p8, 2.62M) | 96.32 | No | PDO-eConvs: Partial Differential Operator Based ... | 2020-07-20 | Code |
| 117 | SimpleNetv2 | 96.29 | No | Towards Principled Design of Deep Convolutional ... | 2018-02-17 | Code |
| 118 | ResNet56 with reSGHMC | 96.12 | No | Non-convex Learning via Replica Exchange Stochas... | 2020-08-12 | Code |
| 119 | Mixer-S/16-SAM | 96.1 | No | When Vision Transformers Outperform ResNets with... | 2021-06-03 | Code |
| 120 | ABNet-2G-R3 | 96.088 | No | ANDHRA Bandersnatch: Training Neural Networks to... | 2024-11-28 | Code |
| 121 | PreActResNet18 (AMP) | 96.03 | No | Regularizing Neural Networks via Adversarial Mod... | 2020-10-10 | Code |
| 122 | ConvMixer-256/8 | 96.03 | No | Patches Are All You Need? | 2022-01-24 | Code |
| 123 | Local Mixup Resnet18 | 95.97 | No | Preventing Manifold Intrusion with Locality: Loc... | 2022-01-12 | Code |
| 124 | ABNet-2G-R2 | 95.9 | No | ANDHRA Bandersnatch: Training Neural Networks to... | 2024-11-28 | Code |
| 125 | ResNet-50x1-AGC (ImageNet-21K) | 95.78 | No | Effect of Pre-Training Scale on Intra- and Inter... | 2021-05-31 | Code |
| 126 | ResNet18 (FSGDM) | 95.66 | No | On the Performance Analysis of Momentum Method: ... | 2024-11-29 | Code |
| 127 | ACN | 95.6 | No | Striving for Simplicity: The All Convolutional Net | 2014-12-21 | Code |
| 128 | Evolution ensemble | 95.6 | No | Large-Scale Evolution of Image Classifiers | 2017-03-03 | Code |
| 129 | ResNet-18 | 95.55 | No | Benchopt: Reproducible, efficient and collaborat... | 2022-06-27 | Code |
| 130 | ABNet-2G-R1 | 95.536 | No | ANDHRA Bandersnatch: Training Neural Networks to... | 2024-11-28 | Code |
| 131 | SimpleNetv1 | 95.51 | No | Lets keep it simple, Using simple architectures ... | 2016-08-22 | Code |
| 132 | Mobile Net_Sam | 95.5 | No | MobileNetV2: Inverted Residuals and Linear Bottl... | 2018-01-13 | Code |
| 133 | IM-Loss (ResNet-19) | 95.49 | No | - | - | - |
| 134 | ResNet-1001 | 95.4 | No | Identity Mappings in Deep Residual Networks | 2016-03-16 | Code |
| 135 | ResNet32 with reSGHMC | 95.35 | No | Non-convex Learning via Replica Exchange Stochas... | 2020-08-12 | Code |
| 136 | ResNet-18+MM+FRL | 95.33 | No | Learning Class Unique Features in Fine-Grained V... | 2020-11-22 | - |
| 137 | PSN (Modified PLIF Net) | 95.32 | No | - | - | - |
| 138 | CCT-6/3x1 | 95.29 | No | Escaping the Big Data Paradigm with Compact Tran... | 2021-04-12 | Code |
| 139 | MomentumNet | 95.18 | No | Momentum Residual Neural Networks | 2021-02-15 | Code |
| 140 | Context-Aware Pipeline | 95.16 | No | - | - | Code |
| 141 | SRM-ResNet-56 | 95.05 | No | SRM : A Style-based Recalibration Module for Con... | 2019-03-26 | Code |
| 142 | MixMatch | 95.05 | No | MixMatch: A Holistic Approach to Semi-Supervised... | 2019-05-06 | Code |
| 143 | WRN-22-8 (Sparse Momentum) | 95.04 | No | Sparse Networks from Scratch: Faster Training wi... | 2019-07-10 | Code |
| 144 | LP-BNN (ours) + cutout | 95.02 | No | Encoding the latent posterior of Bayesian Neural... | 2020-12-04 | Code |
| 145 | kEffNet-B0 V2 32ch + H Flip | 94.95 | No | - | - | Code |
| 146 | Prodpoly | 94.9 | No | Deep Polynomial Neural Networks | 2020-06-20 | Code |
| 147 | ResNet-9 | 94.79 | Yes | CNN Filter DB: An Empirical Investigation of Tra... | 2022-03-29 | Code |
| 148 | Stochastic Depth | 94.77 | Yes | Deep Networks with Stochastic Depth | 2016-03-30 | Code |
| 149 | VGG-19 with GradInit | 94.71 | No | GradInit: Learning to Initialize Neural Networks... | 2021-02-16 | Code |
| 150 | ResNet20 with reSGHMC | 94.62 | No | Non-convex Learning via Replica Exchange Stochas... | 2020-08-12 | Code |
| 151 | PDO-eConv (p6m,0.37M) | 94.62 | No | PDO-eConvs: Partial Differential Operator Based ... | 2020-07-20 | Code |
| 152 | Evolution | 94.6 | No | Large-Scale Evolution of Image Classifiers | 2017-03-03 | Code |
| 153 | RL+NT | 94.6 | No | Efficient Architecture Search by Network Transfo... | 2017-07-16 | Code |
| 154 | Convolutional Performer for Vision (CPV) | 94.46 | No | Convolutional Xformers for Vision | 2022-01-25 | Code |
| 155 | PreResNet-110 | 94.4367 | No | How to Use Dropout Correctly on Residual Network... | 2023-02-13 | Code |
| 156 | ResNet+ELU | 94.4 | No | Deep Residual Networks with Exponential Linear U... | 2016-04-14 | Code |
| 157 | Deep Complex | 94.4 | No | Deep Complex Networks | 2017-05-27 | Code |
| 158 | PDO-eConv (p6,0.36M) | 94.35 | No | PDO-eConvs: Partial Differential Operator Based ... | 2020-07-20 | Code |
| 159 | Stochastic Optimization of Plain Convolutional Neural Networks with Simple methods | 94.29 | No | Stochastic Optimization of Plain Convolutional N... | 2020-01-24 | Code |
| 160 | Fitnet4-LSUV | 94.2 | No | All you need is a good init | 2015-11-19 | Code |
| 161 | R-ExplaiNet-26 | 94.15 | No | Learning local discrete features in explainable-... | 2024-10-31 | Code |
| 162 | ABNet-2G-R0 | 94.118 | No | ANDHRA Bandersnatch: Training Neural Networks to... | 2024-11-28 | Code |
| 163 | ResNet 9 + Mish | 94.05 | No | Mish: A Self Regularized Non-Monotonic Activatio... | 2019-08-23 | Code |
| 164 | Tree+Max-Avg pooling | 94 | No | Generalizing Pooling Functions in Convolutional ... | 2015-09-30 | Code |
| 165 | Beta-Rank | 93.97 | No | Beta-Rank: A Robust Convolutional Filter Pruning... | 2023-04-15 | Code |
| 166 | ResNet-110 (SAP) | 93.861 | No | Stochastic Subsampling With Average Pooling | 2024-09-25 | - |
| 167 | SA quadratic embedding | 93.8 | No | On the Relationship between Self-Attention and C... | 2019-11-08 | Code |
| 168 | kEffNet-B0 32ch | 93.75 | No | - | - | Code |
| 169 | OTTT | 93.73 | No | Online Training Through Time for Spiking Neural ... | 2022-10-09 | Code |
| 170 | SSCNN | 93.7 | No | Spatially-sparse convolutional neural networks | 2014-09-22 | Code |
| 171 | NNCLR | 93.7 | No | With a Little Help from My Friends: Nearest-Neig... | 2021-04-29 | Code |
| 172 | Tuned CNN | 93.6 | No | Scalable Bayesian Optimization Using Deep Neural... | 2015-02-19 | Code |
| 173 | Exponential Linear Units | 93.5 | No | Fast and Accurate Deep Network Learning by Expon... | 2015-11-23 | Code |
| 174 | BNM NiN | 93.3 | No | Batch-normalized Maxout Network in Network | 2015-11-09 | Code |
| 175 | Universum Prescription | 93.3 | No | Universum Prescription: Regularization using Unl... | 2015-11-11 | - |
| 176 | CMsC | 93.1 | No | Competitive Multi-scale Convolution | 2015-11-18 | - |
| 177 | DGPPF-ResNet18 | 92.9 | No | - | - | Code |
| 178 | kMobileNet V3 Large 16ch | 92.74 | No | - | - | Code |
| 179 | NiN+APL | 92.5 | No | Learning Activation Functions to Improve Deep Ne... | 2014-12-21 | Code |
| 180 | VDN | 92.4 | No | Training Very Deep Networks | 2015-07-22 | Code |
| 181 | ResNet | 92.3 | No | A Bregman Learning Framework for Sparse Neural N... | 2021-05-10 | Code |
| 182 | SWWAE | 92.2 | No | Stacked What-Where Auto-encoders | 2015-06-08 | Code |
| 183 | FlexTCN-7 | 92.2 | No | FlexConv: Continuous Kernel Convolutions with Di... | 2021-10-15 | Code |
| 184 | ReActNet-18 | 92.08 | No | "BNN - BN = ?": Training Binary Neural Networks ... | 2021-04-16 | Code |
| 185 | ResNet v2-20 (Mish activation) | 92.02 | No | Mish: A Self Regularized Non-Monotonic Activatio... | 2019-08-23 | Code |
| 186 | Context-Aware DNN tree | 92.01 | No | - | - | - |
| 187 | DSN | 91.8 | No | Deeply-Supervised Nets | 2014-09-18 | Code |
| 188 | BinaryConnect | 91.7 | No | BinaryConnect: Training Deep Neural Networks wit... | 2015-11-02 | Code |
| 189 | CLS-GAN | 91.7 | No | Loss-Sensitive Generative Adversarial Networks o... | 2017-01-23 | Code |
| 190 | MIM | 91.5 | No | On the Importance of Normalisation Layers in Dee... | 2015-08-03 | - |
| 191 | Spectral Representations for Convolutional Neural Networks | 91.4 | No | Spectral Representations for Convolutional Neura... | 2015-06-11 | - |
| 192 | DLME (ResNet-18, linear) | 91.3 | No | DLME: Deep Local-flatness Manifold Embedding | 2022-07-07 | Code |
| 193 | RMDL (30 RDLs) | 91.21 | No | RMDL: Random Multimodel Deep Learning for Classi... | 2018-05-03 | Code |
| 194 | Network in Network | 91.2 | No | Network In Network | 2013-12-16 | Code |
| 195 | ResNet-26 (Trainable Activations) | 91.1 | No | - | - | Code |
| 196 | ResNet-32 (Trainable Activations) | 90.9 | No | - | - | Code |
| 197 | kDenseNet-BC L100 12ch | 90.83 | No | - | - | Code |
| 198 | Deep Networks with Internal Selective Attention through Feedback Connections | 90.8 | No | Deep Networks with Internal Selective Attention ... | 2014-07-11 | - |
| 199 | Maxout Network (k=2) | 90.65 | No | Maxout Networks | 2013-02-18 | Code |
| 200 | ResNet-18 | 90.65 | Yes | Knowledge Representing: Efficient, Sparse Repres... | 2019-11-13 | - |
| 201 | DNN+Probabilistic Maxout | 90.6 | Yes | Improving Deep Neural Networks with Probabilisti... | 2013-12-20 | - |
| 202 | GP EI | 90.5 | No | Practical Bayesian Optimization of Machine Learn... | 2012-06-13 | Code |
| 203 | ResNet-44 (Trainable Activations) | 90.5 | No | - | - | Code |
| 204 | ResNet-20 (Trainable Activations) | 90.4 | No | - | - | Code |
| 205 | SEER (RegNet10B) | 90 | No | Vision Models Are More Robust And Fair When Pret... | 2022-02-16 | Code |
| 206 | kMobileNet 16ch | 89.81 | No | - | - | Code |
| 207 | APAC | 89.7 | No | APAC: Augmented PAttern Classification with Neur... | 2015-05-13 | - |
| 208 | ensemble of 7 models | 89.4 | No | Dynamic Routing Between Capsules | 2017-10-26 | Code |
| 209 | DCNN+GFE | 89.1 | No | Deep Convolutional Neural Networks as Generic Fe... | 2017-10-06 | - |
| 210 | DCNN | 89 | No | - | - | Code |
| 211 | ResNet-14 (Trainable Activations) | 89 | No | - | - | Code |
| 212 | MCDNN | 88.8 | No | Multi-column Deep Neural Networks for Image Clas... | 2012-02-13 | Code |
| 213 | RReLU | 88.8 | No | Empirical Evaluation of Rectified Activations in... | 2015-05-05 | Code |
| 214 | ResNet-56 (Trainable Activations) | 88.8 | No | - | - | Code |
| 215 | F-DENSER++ | 88.73 | No | Fast-DENSER++: Evolving Fully-Trained Deep Artif... | 2019-05-08 | - |
| 216 | Diffusion Classifier (zero-shot) | 88.5 | No | Your Diffusion Model is Secretly a Zero-Shot Cla... | 2023-03-28 | Code |
| 217 | ReNet | 87.7 | Yes | ReNet: A Recurrent Neural Network Based Alternat... | 2015-05-03 | Code |
| 218 | OnDev-LCT-8/3 | 87.65 | No | OnDev-LCT: On-Device Lightweight Convolutional T... | 2024-01-22 | - |
| 219 | OnDev-LCT-4/3 | 87.03 | No | OnDev-LCT: On-Device Lightweight Convolutional T... | 2024-01-22 | - |
| 220 | TripleNet-B | 87.03 | No | Efficient Convolutional Neural Networks on Raspb... | 2022-04-02 | Code |
| 221 | An Analysis of Unsupervised Pre-training in Light of Recent Advances | 86.7 | No | An Analysis of Unsupervised Pre-training in Ligh... | 2014-12-20 | Code |
| 222 | ThreshNet95 | 86.69 | No | ThreshNet: An Efficient DenseNet Using Threshold... | 2022-01-09 | Code |
| 223 | OnDev-LCT-8/1 | 86.64 | No | OnDev-LCT: On-Device Lightweight Convolutional T... | 2024-01-22 | - |
| 224 | ShortNet1-53 | 86.64 | No | Connection Reduction of DenseNet for Image Recog... | 2022-08-02 | Code |
| 225 | OnDev-LCT-4/1 | 86.61 | No | OnDev-LCT: On-Device Lightweight Convolutional T... | 2024-01-22 | - |
| 226 | CNN+ Wilson-Cowan model RNN | 86.59 | No | Learning in Wilson-Cowan model for metapopulation | 2024-06-24 | Code |
| 227 | ResNet-8 (Trainable Activations) | 86.5 | No | - | - | Code |
| 228 | ThresholdNet | 86.34 | No | New Pruning Method Based on DenseNet Network for... | 2021-08-28 | - |
| 229 | OnDev-LCT-2/1 | 86.27 | No | OnDev-LCT: On-Device Lightweight Convolutional T... | 2024-01-22 | - |
| 230 | OnDev-LCT-2/3 | 86.04 | No | OnDev-LCT: On-Device Lightweight Convolutional T... | 2024-01-22 | - |
| 231 | OnDev-LCT-1/3 | 85.73 | No | OnDev-LCT: On-Device Lightweight Convolutional T... | 2024-01-22 | - |
| 232 | cvpr_class | 85.28 | No | ResNet strikes back: An improved training proced... | 2021-10-01 | Code |
| 233 | WaveMix | 85.21 | No | - | - | Code |
| 234 | Stochastic Pooling | 84.9 | No | Stochastic Pooling for Regularization of Deep Co... | 2013-01-16 | Code |
| 235 | OnDev-LCT-1/1 | 84.55 | No | OnDev-LCT: On-Device Lightweight Convolutional T... | 2024-01-22 | - |
| 236 | Improving neural networks by preventing co-adaptation of feature detectors | 84.4 | No | Improving neural networks by preventing co-adapt... | 2012-07-03 | Code |
| 237 | CCN | 83.36 | Yes | Vision Xformers: Efficient Attention for Image C... | 2021-07-05 | Code |
| 238 | CvN | 83.26 | Yes | Vision Xformers: Efficient Attention for Image C... | 2021-07-05 | Code |
| 239 | UL-Hopfield (ULH) | 83.1 | No | Unsupervised Learning using Pretrained CNN and A... | 2018-05-02 | - |
| 240 | DCGAN | 82.8 | Yes | Unsupervised Representation Learning with Deep C... | 2015-11-19 | Code |
| 241 | TM Composites Toolbox | 82.8 | Yes | An Optimized Toolbox for Advanced Image Processi... | 2024-06-02 | Code |
| 242 | CKN | 82.2 | No | Convolutional Kernel Networks | 2014-06-12 | - |
| 243 | The Analog Activation Function | 82.06 | No | - | - | Code |
| 244 | Discriminative Unsupervised Feature Learning with Convolutional Neural Networks | 82 | No | - | - | Code |
| 245 | Sign-symmetry | 80.98 | Yes | How Important is Weight Symmetry in Backpropagat... | 2015-10-17 | Code |
| 246 | pFedBreD_ns_mg | 80.63 | No | Personalized Federated Learning with Hidden Info... | 2022-11-19 | - |
| 247 | 1 Layer K-means | 80.6 | No | Unsupervised Representation Learning with Deep C... | 2015-11-19 | Code |
| 248 | APVT | 80.45 | No | Aggregated Pyramid Vision Transformer: Split-tra... | 2022-03-02 | - |
| 249 | Learning with Recursive Perceptual Representations | 79.7 | No | - | - | - |
| 250 | LeViP | 79.5 | No | Vision Xformers: Efficient Attention for Image C... | 2021-07-05 | Code |
| 251 | Convolutional Deep Belief Network | 78.9 | No | - | - | - |
| 252 | PCANet | 78.7 | No | PCANet: A Simple Deep Learning Baseline for Imag... | 2014-04-14 | Code |
| 253 | Hybrid ViT+RoPE | 76.9 | No | Vision Xformers: Efficient Attention for Image C... | 2021-07-05 | Code |
| 254 | FLSCNN | 75.9 | No | Enhanced Image Classification With a Fast-Learni... | 2015-03-16 | - |
| 255 | Hybrid Vision Nystromformer (ViN) | 75.26 | No | Vision Xformers: Efficient Attention for Image C... | 2021-07-05 | Code |
| 256 | CTM Drop Clause | 75.1 | No | Drop Clause: Enhancing Performance, Interpretabi... | 2021-05-30 | Code |
| 257 | Hybrid PiN | 74 | No | Vision Xformers: Efficient Attention for Image C... | 2021-07-05 | Code |
| 258 | SmoothNetV1 | 73.5 | No | SmoothNets: Optimizing CNN architecture design f... | 2022-05-09 | Code |
| 259 | SNN | 68.3 | No | Sneaky Spikes: Uncovering Stealthy Backdoor Atta... | 2023-02-13 | Code |
| 260 | Vision Nystromformer (ViN) | 65.06 | No | Vision Xformers: Efficient Attention for Image C... | 2021-07-05 | Code |
| 261 | ANODE | 60.6 | No | Augmented Neural ODEs | 2019-04-02 | Code |