| 1 | EffNet-L2 (SAM) | 96.08 | Yes | Sharpness-Aware Minimization for Efficiently Imp... | 2020-10-03 | Code |
| 2 | Swin-L + ML-Decoder | 95.1 | Yes | ML-Decoder: Scalable and Versatile Classificatio... | 2021-11-25 | Code |
| 3 | µ2Net (ViT-L/16) | 94.95 | Yes | An Evolutionary Approach to Dynamic Introduction... | 2022-05-25 | Code |
| 4 | ViT-B-16 (ImageNet-21K-P pretrain) | 94.2 | Yes | ImageNet-21K Pretraining for the Masses | 2021-04-22 | Code |
| 5 | CvT-W24 | 94.09 | Yes | CvT: Introducing Convolutions to Vision Transfor... | 2021-03-29 | Code |
| 6 | ViT-B/16 (PUGD) | 93.95 | Yes | Perturbated Gradients Updating within Unit Space... | 2021-10-01 | Code |
| 7 | Heinsen Routing + BEiT-large 16 224 | 93.8 | Yes | An Algorithm for Routing Vectors in Sequences | 2022-11-20 | Code |
| 8 | BiT-L (ResNet) | 93.51 | Yes | Big Transfer (BiT): General Visual Representatio... | 2019-12-24 | Code |
| 9 | ViT-L/16 (Spinal FC, Background) | 93.31 | No | Reduction of Class Activation Uncertainty with B... | 2023-05-05 | Code |
| 10 | CaiT-M-36 U 224 | 93.1 | Yes | Going deeper with Image Transformers | 2021-03-31 | Code |
| 11 | ViT-L (attn fine-tune) | 93 | Yes | Three things everyone should know about Vision T... | 2022-03-18 | Code |
| 12 | TResNet-L-V2 | 92.6 | Yes | TResNet: High Performance GPU-Dedicated Architec... | 2020-03-30 | Code |
| 13 | EfficientNetV2-L | 92.3 | Yes | EfficientNetV2: Smaller Models and Faster Training | 2021-04-01 | Code |
| 14 | EfficientNetV2-M | 92.2 | Yes | EfficientNetV2: Smaller Models and Faster Training | 2021-04-01 | Code |
| 15 | BiT-M (ResNet) | 92.17 | Yes | Big Transfer (BiT): General Visual Representatio... | 2019-12-24 | Code |
| 16 | CeiT-S | 91.8 | Yes | Incorporating Convolution Designs into Visual Tr... | 2021-03-22 | Code |
| 17 | CeiT-S (384 finetune resolution) | 91.8 | Yes | Incorporating Convolution Designs into Visual Tr... | 2021-03-22 | Code |
| 18 | EfficientNet-B7 | 91.7 | Yes | EfficientNet: Rethinking Model Scaling for Convo... | 2019-05-28 | Code |
| 19 | EfficientNetV2-S | 91.5 | Yes | EfficientNetV2: Smaller Models and Faster Training | 2021-04-01 | Code |
| 20 | GPipe | 91.3 | Yes | GPipe: Efficient Training of Giant Neural Networ... | 2018-11-16 | Code |
| 21 | TNT-B | 91.1 | Yes | Transformer in Transformer | 2021-02-27 | Code |
| 22 | DeiT-B | 90.8 | Yes | Training data-efficient image transformers & dis... | 2020-12-23 | Code |
| 23 | GFNet-H-B | 90.3 | Yes | Global Filter Networks for Image Classification | 2021-07-01 | Code |
| 24 | E2E-3M | 90.27 | Yes | Rethinking Recurrent Neural Networks and Other I... | 2020-07-30 | Code |
| 25 | Bamboo (ViT-B/16) | 90.2 | Yes | Bamboo: Building Mega-Scale Vision Dataset Conti... | 2022-03-15 | Code |
| 26 | PyramidNet-272 (ASAM) | 89.9 | No | ASAM: Adaptive Sharpness-Aware Minimization for ... | 2021-02-23 | Code |
| 27 | PyramidNet (SAM) | 89.7 | No | Sharpness-Aware Minimization for Efficiently Imp... | 2020-10-03 | Code |
| 28 | DVT (T2T-ViT-24) | 89.63 | Yes | Not All Images are Worth 16x16 Words: Dynamic Tr... | 2021-05-31 | Code |
| 29 | ResMLP-24 | 89.5 | Yes | ResMLP: Feedforward networks for image classific... | 2021-05-07 | Code |
| 30 | PyramidNet-272, S=4 | 89.46 | Yes | Towards Better Accuracy-efficiency Trade-offs: D... | 2020-11-30 | Code |
| 31 | CeiT-T | 89.4 | Yes | Incorporating Convolution Designs into Visual Tr... | 2021-03-22 | Code |
| 32 | PyramidNet+ShakeDrop | 89.3 | Yes | AutoAugment: Learning Augmentation Policies from... | 2018-05-24 | Code |
| 33 | ViT-B/16-SAM | 89.1 | Yes | When Vision Transformers Outperform ResNets with... | 2021-06-03 | Code |
| 34 | ConvMLP-M | 89.1 | Yes | ConvMLP: Hierarchical Convolutional MLPs for Vis... | 2021-09-09 | Code |
| 35 | ConvMLP-L | 88.6 | Yes | ConvMLP: Hierarchical Convolutional MLPs for Vis... | 2021-09-09 | Code |
| 36 | ResNet-152x4-AGC (ImageNet-21K) | 88.54 | Yes | Effect of Pre-Training Scale on Intra- and Inter... | 2021-05-31 | Code |
| 37 | ColorNet | 88.4 | Yes | ColorNet: Investigating the importance of color ... | 2019-02-01 | Code |
| 38 | PyramidNet+ShakeDrop (Fast AA) | 88.3 | Yes | Fast AutoAugment | 2019-05-01 | Code |
| 39 | NAT-M4 | 88.3 | Yes | Neural Architecture Transfer | 2020-05-12 | Code |
| 40 | CeiT-T (384 finetune resolution) | 88 | Yes | Incorporating Convolution Designs into Visual Tr... | 2021-03-22 | Code |
| 41 | NAT-M3 | 87.7 | Yes | Neural Architecture Transfer | 2020-05-12 | Code |
| 42 | ViT-S/16-SAM | 87.6 | Yes | When Vision Transformers Outperform ResNets with... | 2021-06-03 | Code |
| 43 | NAT-M2 | 87.5 | Yes | Neural Architecture Transfer | 2020-05-12 | Code |
| 44 | Dynamics 1 | 87.48 | Yes | PSO-Convolutional Neural Networks with Heterogen... | 2022-05-20 | Code |
| 45 | DenseNet-BC-190, S=4 | 87.44 | Yes | Towards Better Accuracy-efficiency Trade-offs: D... | 2020-11-30 | Code |
| 46 | ConvMLP-S | 87.4 | Yes | ConvMLP: Hierarchical Convolutional MLPs for Vis... | 2021-09-09 | Code |
| 47 | ResMLP-12 | 87 | Yes | ResMLP: Feedforward networks for image classific... | 2021-05-07 | Code |
| 48 | WRN-40-10, S=4 | 86.9 | Yes | Towards Better Accuracy-efficiency Trade-offs: D... | 2020-11-30 | Code |
| 49 | ResNet50 (A1) | 86.9 | Yes | ResNet strikes back: An improved training proced... | 2021-10-01 | Code |
| 50 | WRN-28-10 * 3 | 86.81 | Yes | MixMo: Mixing Multiple Inputs for Multiple Outpu... | 2021-03-10 | Code |
| 51 | PyramidNet + AA (AMP) | 86.64 | Yes | Regularizing Neural Networks via Adversarial Mod... | 2020-10-10 | Code |
| 52 | PyramidNet-200 + Shakedrop + Cutmix + PS-KD | 86.41 | Yes | Self-Knowledge Distillation with Progressive Ref... | 2020-06-22 | Code |
| 53 | Mixer-B/16-SAM | 86.4 | Yes | When Vision Transformers Outperform ResNets with... | 2021-06-03 | Code |
| 54 | ResCNet-50 | 86.31 | No | Deep Feature Response Discriminative Calibration | 2024-11-16 | Code |
| 55 | PyramidNet-200 + Shakedrop + Cutmix | 86.19 | Yes | CutMix: Regularization Strategy to Train Strong ... | 2019-05-13 | Code |
| 56 | MUXNet-m | 86.1 | Yes | MUXConv: Information Multiplexing in Convolution... | 2020-03-31 | Code |
| 57 | NAT-M1 | 86 | Yes | Neural Architecture Transfer | 2020-05-12 | Code |
| 58 | WRN-28-10 | 85.77 | Yes | MixMo: Mixing Multiple Inputs for Multiple Outpu... | 2021-03-10 | Code |
| 59 | WRN-28-10, S=4 | 85.74 | Yes | Towards Better Accuracy-efficiency Trade-offs: D... | 2020-11-30 | Code |
| 60 | WRN-28-8 (SAMix+DM) | 85.59 | No | - | - | - |
| 61 | WRN-28-8 +SAMix | 85.5 | Yes | Boosting Discriminative Visual Representation Le... | 2021-11-30 | Code |
| 62 | ASANas | 85.42 | Yes | Improving Neural Architecture Search Image Class... | 2019-03-14 | Code |
| 63 | WRN-28-8 (AutoMix+DM) | 85.38 | No | - | - | - |
| 64 | SparseSwin | 85.35 | Yes | SparseSwin: Swin Transformer with Sparse Transfo... | 2023-09-11 | Code |
| 65 | WRN-28-8 (PuzzleMix+DM) | 85.25 | No | - | - | - |
| 66 | ResNet-50-SAM | 85.2 | Yes | When Vision Transformers Outperform ResNets with... | 2021-06-03 | Code |
| 67 | WRN-28-8 +AutoMix | 85.16 | Yes | AutoMix: Unveiling the Power of Mixup for Strong... | 2021-03-24 | Code |
| 68 | WaveMixLite-256/7 | 85.09 | Yes | WaveMix: A Resource-efficient Neural Network for... | 2022-05-28 | Code |
| 69 | MANO-tiny | 85.08 | Yes | Linear Attention with Global Context: A Multipol... | 2025-07-03 | Code |
| 70 | WRN 28-14 | 85 | Yes | Neural networks with late-phase weights | 2020-07-25 | Code |
| 71 | R-Mix (WideResNet 28-10) | 85 | Yes | Expeditious Saliency-guided Mix-up through Rando... | 2022-12-09 | Code |
| 72 | EEEA-Net-C (b=5) + CO | 84.98 | Yes | EEEA-Net: An Early Exit Evolutionary Neural Arch... | 2021-08-13 | Code |
| 73 | RL-Mix (WideResNet 28-10) | 84.9 | Yes | Expeditious Saliency-guided Mix-up through Rando... | 2022-12-09 | Code |
| 74 | Wide-ResNet-28-10 | 84.89 | Yes | Automatic Data Augmentation via Invariance-Const... | 2022-09-29 | Code |
| 75 | SENet + ShakeEven + Cutout | 84.59 | Yes | Squeeze-and-Excitation Networks | 2017-09-05 | Code |
| 76 | ResNeXt-50(32x4d) + SAMix | 84.42 | Yes | Boosting Discriminative Visual Representation Le... | 2021-11-30 | Code |
| 77 | WRN-28-10 with reSGHMC | 84.38 | Yes | Non-convex Learning via Replica Exchange Stochas... | 2020-08-12 | Code |
| 78 | PyramidNet-272 + SWA | 84.16 | Yes | Averaging Weights Leads to Wider Optima and Bett... | 2018-03-14 | Code |
| 79 | WRN28-10 | 84.05 | Yes | Puzzle Mix: Exploiting Saliency and Local Statis... | 2020-09-15 | Code |
| 80 | HCGNet-A3 | 84.04 | Yes | Gated Convolutional Networks with Hybrid Connect... | 2019-08-26 | Code |
| 81 | WideResNet 28-10 + CutMix (OneCycleLR scheduler) | 83.97 | Yes | Expeditious Saliency-guided Mix-up through Rando... | 2022-12-09 | Code |
| 82 | DenseNet-BC-190 + FMix | 83.95 | Yes | FMix: Enhancing Mixed Sample Data Augmentation | 2020-02-27 | Code |
| 83 | ORN | 83.85 | Yes | Oriented Response Networks | 2017-01-07 | Code |
| 84 | Grafit (ResNet-50) | 83.7 | Yes | Grafit: Learning fine-grained image representati... | 2020-11-25 | - |
| 85 | ResNeXt-50(32x4d) + AutoMix | 83.64 | Yes | AutoMix: Unveiling the Power of Mixup for Strong... | 2021-03-24 | Code |
| 86 | CCT-7/3x1+HTM+VTM | 83.57 | Yes | TokenMixup: Efficient Attention-guided Token-lev... | 2022-10-14 | Code |
| 87 | HCGNet-A2 | 83.46 | Yes | Gated Convolutional Networks with Hybrid Connect... | 2019-08-26 | Code |
| 88 | Res2NeXt-29 | 83.44 | Yes | Res2Net: A New Multi-scale Backbone Architecture | 2019-04-02 | Code |
| 89 | DenseNet-BC-190 + Mixup | 83.2 | Yes | mixup: Beyond Empirical Risk Minimization | 2017-10-25 | Code |
| 90 | SSAL-DenseNet 190-40 | 83.2 | Yes | Contextual Classification Using Self-Supervised ... | 2021-01-07 | Code |
| 91 | EnAET | 83.13 | Yes | EnAET: A Self-Trained framework for Semi-Supervi... | 2019-11-21 | Code |
| 92 | WRN 28-10 | 83.06 | Yes | Neural networks with late-phase weights | 2020-07-25 | Code |
| 93 | R-Mix (ResNeXt 29-4-24) | 83.02 | Yes | Expeditious Saliency-guided Mix-up through Rando... | 2022-12-09 | Code |
| 94 | Wide ResNet+Cutout+no BN scale/offset learning | 82.95 | Yes | Single-bit-per-weight deep convolutional neural ... | 2019-07-16 | Code |
| 95 | WRN-16-8 with reSGHMC | 82.95 | Yes | Non-convex Learning via Replica Exchange Stochas... | 2020-08-12 | Code |
| 96 | DenseNet-BC | 82.82 | Yes | Densely Connected Convolutional Networks | 2016-08-25 | Code |
| 97 | ABNet-2G-R3-Combined | 82.784 | No | ANDHRA Bandersnatch: Training Neural Networks to... | 2024-11-28 | Code |
| 98 | CCT-7/3x1* | 82.72 | Yes | Escaping the Big Data Paradigm with Compact Tran... | 2021-04-12 | Code |
| 99 | EXACT (WRN-28-10) | 82.68 | No | EXACT: How to Train Your Accuracy | 2022-05-19 | Code |
| 100 | SKNet-29 (ResNeXt-29, 16×32d) | 82.67 | Yes | Selective Kernel Networks | 2019-03-15 | Code |
| 101 | DenseNet | 82.62 | Yes | Densely Connected Convolutional Networks | 2016-08-25 | Code |
| 102 | Shared WRN | 82.57 | Yes | Learning Implicitly Recurrent CNNs Through Param... | 2019-02-26 | Code |
| 103 | Transformer local-attention (NesT-B) | 82.56 | Yes | Nested Hierarchical Transformer: Towards Accurat... | 2021-05-26 | Code |
| 104 | RL-Mix (ResNeXt 29-4-24) | 82.43 | Yes | Expeditious Saliency-guided Mix-up through Rando... | 2022-12-09 | Code |
| 105 | Mixer-S/16-SAM | 82.4 | Yes | When Vision Transformers Outperform ResNets with... | 2021-06-03 | Code |
| 106 | R-Mix (WideResNet 16-8) | 82.32 | Yes | Expeditious Saliency-guided Mix-up through Rando... | 2022-12-09 | Code |
| 107 | ResNeXt 29-4-24 + CutMix (OneCycleLR scheduler) | 82.3 | Yes | Expeditious Saliency-guided Mix-up through Rando... | 2022-12-09 | Code |
| 108 | WARN | 82.18 | Yes | Attend and Rectify: a Gated Attention Mechanism ... | 2018-07-19 | Code |
| 109 | RL-Mix (WideResNet 16-8) | 82.16 | Yes | Expeditious Saliency-guided Mix-up through Rando... | 2022-12-09 | Code |
| 110 | WRN+SWA | 82.15 | Yes | Averaging Weights Leads to Wider Optima and Bett... | 2018-03-14 | Code |
| 111 | Manifold Mixup | 81.96 | Yes | Manifold Mixup: Better Representations by Interp... | 2018-06-13 | Code |
| 112 | HCGNet-A1 | 81.87 | Yes | Gated Convolutional Networks with Hybrid Connect... | 2019-08-26 | Code |
| 113 | WideResNet 16-8 + CutMix (OneCycleLR scheduler) | 81.79 | Yes | Expeditious Saliency-guided Mix-up through Rando... | 2022-12-09 | Code |
| 114 | Residual Gates + WRN | 81.73 | Yes | Learning Identity Mappings with Residual Gates | 2016-11-04 | - |
| 115 | kNN-CLIP | 81.7 | Yes | Revisiting a kNN-based Image Classification Syst... | 2022-04-03 | - |
| 116 | AA-Wide-ResNet | 81.6 | Yes | Attention Augmented Convolutional Networks | 2019-04-22 | Code |
| 117 | PDO-eConv (p8, 4.6M) | 81.6 | Yes | PDO-eConvs: Partial Differential Operator Based ... | 2020-07-20 | Code |
| 118 | SEER (RegNet10B) | 81.53 | Yes | Vision Models Are More Robust And Fair When Pret... | 2022-02-16 | Code |
| 119 | R-Mix (PreActResNet-18) | 81.49 | Yes | Expeditious Saliency-guided Mix-up through Rando... | 2022-12-09 | Code |
| 120 | ResNet50 (FSGDM) | 81.44 | No | On the Performance Analysis of Momentum Method: ... | 2024-11-29 | Code |
| 121 | Wide-ResNet-40-2 | 81.19 | Yes | Automatic Data Augmentation via Invariance-Const... | 2022-09-29 | Code |
| 122 | Wide ResNet | 81.15 | Yes | Wide Residual Networks | 2016-05-23 | Code |
| 123 | CoPaNet-R-164 | 81.1 | Yes | Deep Competitive Pathway Networks | 2017-09-29 | Code |
| 124 | ABNet-2G-R3 | 80.83 | No | ANDHRA Bandersnatch: Training Neural Networks to... | 2024-11-28 | Code |
| 125 | RL-Mix (PreActResNet-18) | 80.75 | Yes | Expeditious Saliency-guided Mix-up through Rando... | 2022-12-09 | Code |
| 126 | PreActResNet-18 + CutMix (OneCycleLR scheduler) | 80.6 | Yes | Expeditious Saliency-guided Mix-up through Rando... | 2022-12-09 | Code |
| 127 | GAC-SNN | 80.45 | No | Gated Attention Coding for Training High-perform... | 2023-08-12 | Code |
| 128 | ABNet-2G-R2 | 80.354 | No | ANDHRA Bandersnatch: Training Neural Networks to... | 2024-11-28 | Code |
| 129 | SimpleNetv2 | 80.29 | Yes | Towards Principled Design of Deep Convolutional ... | 2018-02-17 | Code |
| 130 | UPANets | 80.29 | Yes | UPANets: Learning from the Universal Pixel Atten... | 2021-03-15 | Code |
| 131 | PreActResNet-18 + SageMix | 80.16 | Yes | SageMix: Saliency-Guided Mixup for Point Clouds | 2022-10-13 | Code |
| 132 | ResNet56 with reSGHMC | 80.14 | Yes | Non-convex Learning via Replica Exchange Stochas... | 2020-08-12 | Code |
| 133 | PDO-eConv (p8, 2.62M) | 79.99 | Yes | PDO-eConvs: Partial Differential Operator Based ... | 2020-07-20 | Code |
| 134 | VGG11B(3x) + LocalLearning | 79.9 | Yes | Training Neural Networks with Local Error Signals | 2019-01-20 | Code |
| 135 | NNCLR | 79 | Yes | With a Little Help from My Friends: Nearest-Neig... | 2021-04-29 | Code |
| 136 | ABNet-2G-R1 | 78.792 | No | ANDHRA Bandersnatch: Training Neural Networks to... | 2024-11-28 | Code |
| 137 | PreActResNet18 (AMP) | 78.49 | Yes | Regularizing Neural Networks via Adversarial Mod... | 2020-10-10 | Code |
| 138 | SimpleNetv1 | 78.37 | Yes | Lets keep it simple, Using simple architectures ... | 2016-08-22 | Code |
| 139 | ViT (lightweight, MAE pre-trained) | 78.27 | No | Pre-training of Lightweight Vision Transformers ... | 2024-02-06 | - |
| 140 | PDC | 77.9 | Yes | Augmenting Deep Classifiers with Polynomial Neur... | 2021-04-16 | Code |
| 141 | MobileNetV3-large x1.0 (BSConv-U) | 77.7 | Yes | Rethinking Depthwise Separable Convolutions: How... | 2020-03-30 | Code |
| 142 | CCT-6/3x1 | 77.31 | Yes | Escaping the Big Data Paradigm with Compact Tran... | 2021-04-12 | Code |
| 143 | ResNet-1001 | 77.3 | Yes | Identity Mappings in Deep Residual Networks | 2016-03-16 | Code |
| 144 | Evolution | 77 | Yes | Large-Scale Evolution of Image Classifiers | 2017-03-03 | Code |
| 145 | DIANet | 76.98 | Yes | DIANet: Dense-and-Implicit Attention Network | 2019-05-25 | Code |
| 146 | LP-BNN + cutout | 76.85 | Yes | Encoding the latent posterior of Bayesian Neural... | 2020-12-04 | Code |
| 147 | ResNet-18+MM+FRL | 76.64 | Yes | Learning Class Unique Features in Fine-Grained V... | 2020-11-22 | - |
| 148 | ResNet32 with reSGHMC | 76.55 | Yes | Non-convex Learning via Replica Exchange Stochas... | 2020-08-12 | Code |
| 149 | MomentumNet | 76.38 | Yes | Momentum Residual Neural Networks | 2021-02-15 | Code |
| 150 | SSCNN | 75.7 | Yes | Spatially-sparse convolutional neural networks | 2014-09-22 | Code |
| 151 | Exponential Linear Units | 75.7 | Yes | Fast and Accurate Deep Network Learning by Expon... | 2015-11-23 | Code |
| 152 | ResNet-9 | 75.59 | Yes | CNN Filter DB: An Empirical Investigation of Tra... | 2022-03-29 | Code |
| 153 | Stochastic Depth | 75.42 | Yes | Deep Networks with Stochastic Depth | 2016-03-30 | Code |
| 154 | ResNet v2-110 (Mish activation) | 74.41 | Yes | Mish: A Self Regularized Non-Monotonic Activatio... | 2019-08-23 | Code |
| 155 | Dspike (ResNet-18) | 74.24 | No | - | - | - |
| 156 | ResNet20 with reSGHMC | 74.14 | Yes | Non-convex Learning via Replica Exchange Stochas... | 2020-08-12 | Code |
| 157 | MixMatch | 74.1 | Yes | MixMatch: A Holistic Approach to Semi-Supervised... | 2019-05-06 | Code |
| 158 | Beta-Rank | 74.01 | No | Beta-Rank: A Robust Convolutional Filter Pruning... | 2023-04-15 | Code |
| 159 | PreResNet-110 | 73.98 | No | How to Use Dropout Correctly on Residual Network... | 2023-02-13 | Code |
| 160 | ABNet-2G-R0 | 73.93 | No | ANDHRA Bandersnatch: Training Neural Networks to... | 2024-11-28 | Code |
| 161 | Fractional MP | 73.6 | Yes | Fractional Max-Pooling | 2014-12-18 | Code |
| 162 | ResNet+ELU | 73.5 | No | Deep Residual Networks with Exponential Linear U... | 2016-04-14 | Code |
| 163 | PDO-eConv (p6m,0.37M) | 73 | Yes | PDO-eConvs: Partial Differential Operator Based ... | 2020-07-20 | Code |
| 164 | SOPCNN | 72.96 | Yes | Stochastic Optimization of Plain Convolutional N... | 2020-01-24 | Code |
| 165 | PDO-eConv (p6,0.36M) | 72.87 | Yes | PDO-eConvs: Partial Differential Operator Based ... | 2020-07-20 | Code |
| 166 | Tuned CNN | 72.6 | Yes | Scalable Bayesian Optimization Using Deep Neural... | 2015-02-19 | Code |
| 167 | ResNet-110 (SAP) | 72.537 | No | Stochastic Subsampling With Average Pooling | 2024-09-25 | - |
| 168 | CMsC | 72.4 | Yes | Competitive Multi-scale Convolution | 2015-11-18 | - |
| 169 | Fitnet4-LSUV | 72.3 | Yes | All you need is a good init | 2015-11-19 | Code |
| 170 | GAN+ResNet | 71.52 | No | - | - | Code |
| 171 | kMobileNet V3 Large 16ch | 71.36 | Yes | - | - | Code |
| 172 | BNM NiN | 71.1 | Yes | Batch-normalized Maxout Network in Network | 2015-11-09 | Code |
| 173 | OTTT | 71.05 | No | Online Training Through Time for Spiking Neural ... | 2022-10-09 | Code |
| 174 | MIM | 70.8 | Yes | On the Importance of Normalisation Layers in Dee... | 2015-08-03 | - |
| 175 | WaveMix-Lite-256/7 | 70.2 | No | WaveMix: A Resource-efficient Neural Network for... | 2022-05-28 | Code |
| 176 | IM-Loss (VGG-16) | 70.18 | No | - | - | - |
| 177 | NiN+APL | 69.2 | Yes | Learning Activation Functions to Improve Deep Ne... | 2014-12-21 | Code |
| 178 | SWWAE | 69.1 | Yes | Stacked What-Where Auto-encoders | 2015-06-08 | Code |
| 179 | NiN+Superclass+CDJ | 69 | Yes | Deep Convolutional Decision Jungle for Image Cla... | 2017-06-06 | - |
| 180 | Spectral Representations for Convolutional Neural Networks | 68.4 | No | Spectral Representations for Convolutional Neura... | 2015-06-11 | - |
| 181 | ReActNet-18 | 68.34 | No | "BNN - BN = ?": Training Binary Neural Networks ... | 2021-04-16 | Code |
| 182 | VDN | 67.8 | No | Training Very Deep Networks | 2015-07-22 | Code |
| 183 | DCNN+GFE | 67.7 | No | Deep Convolutional Neural Networks as Generic Fe... | 2017-10-06 | - |
| 184 | Tree+Max-Avg pooling | 67.6 | No | Generalizing Pooling Functions in Convolutional ... | 2015-09-30 | Code |
| 185 | HD-CNN | 67.4 | No | HD-CNN: Hierarchical Deep Convolutional Neural N... | 2014-10-03 | Code |
| 186 | Universum Prescription | 67.2 | No | Universum Prescription: Regularization using Unl... | 2015-11-11 | - |
| 187 | ResNet50 Without Transfer Learning | 67.06 | No | - | - | Code |
| 188 | AlexNet (KP) | 66.78 | No | - | - | - |
| 189 | ACN | 66.3 | No | Striving for Simplicity: The All Convolutional Net | 2014-12-21 | Code |
| 190 | DLME (ResNet-18, linear) | 66.1 | No | DLME: Deep Local-flatness Manifold Embedding | 2022-07-07 | Code |
| 191 | ResNet-18 (modified) | 66 | No | FatNet: High Resolution Kernels for Classificati... | 2022-10-30 | Code |
| 192 | DSN | 65.4 | No | Deeply-Supervised Nets | 2014-09-18 | Code |
| 193 | NiN | 64.3 | No | Network In Network | 2013-12-16 | Code |
| 194 | Tree Priors | 63.2 | No | - | - | - |
| 195 | DNN+Probabilistic Maxout | 61.9 | No | Improving Deep Neural Networks with Probabilisti... | 2013-12-20 | - |
| 196 | Maxout Network (k=2) | 61.43 | No | Maxout Networks | 2013-02-18 | Code |
| 197 | ResNet20+UnsharpMaskLayer | 60.36 | No | - | - | Code |
| 198 | Convolutional Linear Transformer for Vision (CLTV) | 60.11 | No | Convolutional Xformers for Vision | 2022-01-25 | Code |
| 199 | FatNet of ResNet-18 | 60 | No | FatNet: High Resolution Kernels for Classificati... | 2022-10-30 | Code |
| 200 | Optical Simulation of FatNet | 60 | No | FatNet: High Resolution Kernels for Classificati... | 2022-10-30 | Code |
| 201 | RReLU | 59.8 | No | Empirical Evaluation of Rectified Activations in... | 2015-05-05 | Code |
| 202 | Stochastic Pooling | 57.5 | No | Stochastic Pooling for Regularization of Deep Co... | 2013-01-16 | Code |
| 203 | Sign-symmetry | 48.75 | No | How Important is Weight Symmetry in Backpropagat... | 2015-10-17 | Code |
| 204 | AlexNet (DFA) | 48.03 | No | - | - | - |
| 205 | CNN39 | 42.64 | No | Sharpness-Aware Minimization for Efficiently Imp... | 2020-10-03 | Code |
| 206 | CNN36 | 36.07 | No | Sharpness-Aware Minimization for Efficiently Imp... | 2020-10-03 | Code |
| 207 | CNN37 | 35.05 | No | Sharpness-aware Quantization for Deep Neural Net... | 2021-11-24 | Code |
| 208 | AlexNet (FA) | 19.49 | No | - | - | - |