Papers With Code 2

A community resource for machine learning research: papers, code, benchmarks, and state-of-the-art results.

Data sourced from the PWC Archive (CC-BY-SA 4.0). Built by the community, for the community.

Image Classification on ImageNet

Metric: GFLOPs (listed in descending order; GFLOPs measures compute cost, not accuracy)

Results

# | Model | GFLOPs | Extra Data | Paper | Date | Code
1 | InternImage-H | 1478 | No | InternImage: Exploring Large-Scale Vision Founda... | 2022-11-10 | Code
2 | DaViT-G | 1038 | No | DaViT: Dual Attention Vision Transformers | 2022-04-07 | Code
3 | SWAG (ViT H/14) | 1018.8 | No | Revisiting Weakly Supervised Pre-Training of Vis... | 2022-01-20 | Code
4 | MViTv2-H (512 res, ImageNet-21k pretrain) | 763.5 | No | MViTv2: Improved Multiscale Vision Transformers ... | 2021-12-02 | Code
5 | Perceiver (FF) | 707.2 | No | Perceiver: General Perception with Iterative Att... | 2021-03-04 | Code
6 | MOAT-4 22K+1K | 648.5 | No | MOAT: Alternating Mobile Convolution and Attenti... | 2022-10-04 | Code
7 | DY-MobileNetV2 ×1.0 | 626 | No | Dynamic Convolution: Attention over Convolution ... | 2019-12-07 | Code
8 | FixEfficientNet-L2 | 585 | No | Fixing the train-test resolution discrepancy: Fi... | 2020-03-18 | Code
9 | MambaVision-L3 | 489.1 | No | MambaVision: A Hybrid Mamba-Transformer Vision B... | 2024-07-10 | Code
10 | ELSA-VOLO-D5 (512*512) | 437 | No | ELSA: Enhanced Local Self-Attention for Vision T... | 2021-12-23 | Code
11 | XCiT-L24 | 417.9 | No | XCiT: Cross-Covariance Image Transformers | 2021-06-17 | Code
12 | VOLO-D5+HAT | 412 | No | Improving Vision Transformers by Revisiting High... | 2022-04-03 | Code
13 | VOLO-D5 | 412 | No | VOLO: Vision Outlooker for Visual Recognition | 2021-06-24 | Code
14 | CaiT-M-48-448 | 377.3 | No | Going deeper with Image Transformers | 2021-03-31 | Code
15 | NFNet-F6 w/ SAM | 377.28 | No | High-Performance Large-Scale Image Recognition W... | 2021-02-11 | Code
16 | NFNet-F4+ | 367 | No | High-Performance Large-Scale Image Recognition W... | 2021-02-11 | Code
17 | DaViT-H | 334 | No | DaViT: Dual Attention Vision Transformers | 2022-04-07 | Code
18 | ResNeXt-101 32x48d | 306 | No | Exploring the Limits of Weakly Supervised Pretra... | 2018-05-02 | Code
19 | NFNet-F5 w/ SAM | 289.76 | No | High-Performance Large-Scale Image Recognition W... | 2021-02-11 | Code
20 | NFNet-F5 | 289.76 | No | High-Performance Large-Scale Image Recognition W... | 2021-02-11 | Code
21 | MOAT-3 1K only | 271 | No | MOAT: Alternating Mobile Convolution and Attenti... | 2022-10-04 | Code
22 | CAIT-M36-448 | 247.8 | No | Going deeper with Image Transformers | 2021-03-31 | Code
23 | NFNet-F4 | 215.24 | No | High-Performance Large-Scale Image Recognition W... | 2021-02-11 | Code
24 | LV-ViT-L | 214.8 | No | All Tokens Matter: Token Labeling for Training B... | 2021-04-22 | Code
25 | AmoebaNet-A | 208 | No | Regularized Evolution for Image Classifier Archi... | 2018-02-05 | Code
26 | VOLO-D4 | 197 | No | VOLO: Vision Outlooker for Visual Recognition | 2021-06-24 | Code
27 | ViT-L | 191.2 | No | DeiT III: Revenge of the ViT | 2022-04-14 | Code
28 | XCiT-M24 | 188 | No | XCiT: Cross-Covariance Image Transformers | 2021-06-17 | Code
29 | ConvNeXt-XL (ImageNet-22k) | 179 | No | A ConvNet for the 2020s | 2022-01-10 | Code
30 | ResNeXt-101 32x32d | 174 | No | Exploring the Limits of Weakly Supervised Pretra... | 2018-05-02 | Code
31 | CAIT-M-36 | 173.3 | No | Going deeper with Image Transformers | 2021-03-31 | Code
32 | InternImage-XL | 163 | No | InternImage: Exploring Large-Scale Vision Founda... | 2022-11-10 | Code
33 | FasterViT-6 | 142 | No | FasterViT: Fast Vision Transformers with Hierarc... | 2023-06-09 | Code
34 | MViTv2-L (384 res, ImageNet-21k pretrain) | 140.7 | No | MViTv2: Improved Multiscale Vision Transformers ... | 2021-12-02 | Code
35 | MViTv2-L (384 res) | 140.2 | No | MViTv2: Improved Multiscale Vision Transformers ... | 2021-12-02 | Code
36 | RepLKNet-XL | 128.7 | No | Scaling Up Your Kernels to 31x31: Revisiting Lar... | 2022-03-13 | Code
37 | MViTv2-H (ImageNet-21k pretrain) | 120.6 | No | MViTv2: Improved Multiscale Vision Transformers ... | 2021-12-02 | Code
38 | CAIT-M-24 | 116.1 | No | Going deeper with Image Transformers | 2021-03-31 | Code
39 | NFNet-F3 | 114.76 | No | High-Performance Large-Scale Image Recognition W... | 2021-02-11 | Code
40 | VAN-B6 (22K, 384res) | 114.3 | No | Visual Attention Network | 2022-02-20 | Code
41 | CoAtNet-3 @384 | 114 | No | CoAtNet: Marrying Convolution and Attention for ... | 2021-06-09 | Code
42 | FasterViT-5 | 113 | No | FasterViT: Fast Vision Transformers with Hierarc... | 2023-06-09 | Code
43 | InternImage-L | 108 | No | InternImage: Exploring Large-Scale Vision Founda... | 2022-11-10 | Code
44 | XCiT-S24 | 106 | No | XCiT: Cross-Covariance Image Transformers | 2021-06-17 | Code
45 | Swin-L | 103.9 | No | Swin Transformer: Hierarchical Vision Transforme... | 2021-03-25 | Code
46 | DaViT-L (ImageNet-22k) | 103 | No | DaViT: Dual Attention Vision Transformers | 2022-04-07 | Code
47 | MogaNet-XL (384res) | 102 | No | MogaNet: Multi-order Gated Aggregation Network | 2022-11-07 | Code
48 | HorNet-L (GF) | 101.8 | No | HorNet: Efficient High-Order Spatial Interaction... | 2022-07-28 | Code
49 | DiNAT_s-Large (384res; Pretrained on IN22K@224) | 101.5 | No | Dilated Neighborhood Attention Transformer | 2022-09-29 | Code
50 | ConvNeXt-L (384 res) | 101 | No | A ConvNet for the 2020s | 2022-01-10 | Code
51 | Mini-Swin-B@384 | 98.8 | No | MiniViT: Compressing Vision Transformers with We... | 2022-04-14 | Code
52 | CSWin-L (384 res,ImageNet-22k pretrain) | 96.8 | No | CSWin Transformer: A General Vision Transformer ... | 2021-07-01 | Code
53 | EfficientNetV2-XL (21k) | 94 | Yes | EfficientNetV2: Smaller Models and Faster Training | 2021-04-01 | Code
54 | DiNAT-Large (11x11ks; 384res; Pretrained on IN22K@224) | 92.4 | No | Dilated Neighborhood Attention Transformer | 2022-09-29 | Code
55 | DiNAT-Large (384x384; Pretrained on ImageNet-22K @ 224x224) | 89.7 | No | Dilated Neighborhood Attention Transformer | 2022-09-29 | Code
56 | FixEfficientNet-B7 | 82 | No | Fixing the train-test resolution discrepancy: Fi... | 2020-03-18 | Code
57 | CAFormer-B36 (384 res, 21K) | 72.2 | No | MetaFormer Baselines for Vision | 2022-10-24 | Code
58 | CAFormer-B36 (384 res) | 72.2 | No | MetaFormer Baselines for Vision | 2022-10-24 | Code
59 | ResNeXt-101 32×16d | 72 | No | Exploring the Limits of Weakly Supervised Pretra... | 2018-05-02 | Code
60 | VOLO-D3 | 67.9 | No | VOLO: Vision Outlooker for Visual Recognition | 2021-06-24 | Code
61 | MIRL (ViT-B-48) | 67 | No | Masked Image Residual Learning for Scaling Deepe... | 2023-09-25 | Code
62 | ConvFormer-B36 (384 res, 21K) | 66.5 | No | MetaFormer Baselines for Vision | 2022-10-24 | Code
63 | ConvFormer-B36 (384 res) | 66.5 | No | MetaFormer Baselines for Vision | 2022-10-24 | Code
64 | CAIT-S-48 | 63.8 | No | Going deeper with Image Transformers | 2021-03-31 | Code
65 | NFNet-F2 | 62.59 | No | High-Performance Large-Scale Image Recognition W... | 2021-02-11 | Code
66 | SE-ResNeXt-101, 64x4d, S=2(416px) | 61.1 | No | Towards Better Accuracy-efficiency Trade-offs: D... | 2020-11-30 | Code
67 | CLCNet (S:ViT+D:VOLO-D3) (retrain) | 57.46 | No | CLCNet: Rethinking of Ensemble Modeling with Cla... | 2022-05-19 | Code
68 | TransNeXt-Base (IN-1K supervised, 384) | 56.3 | No | TransNeXt: Robust Foveal Visual Perception for V... | 2023-11-28 | Code
69 | XCiT-S12 | 55.6 | No | XCiT: Cross-Covariance Image Transformers | 2021-06-17 | Code
70 | ResNet-RS-270 (256 image res) | 54 | No | Revisiting ResNets: Improved Training and Scalin... | 2021-03-13 | Code
71 | EfficientNetV2-L (21k) | 53 | No | EfficientNetV2: Smaller Models and Faster Training | 2021-04-01 | Code
72 | EfficientNetV2-L | 53 | No | EfficientNetV2: Smaller Models and Faster Training | 2021-04-01 | Code
73 | CLCNet (S:ViT+D:EffNet-B7) (retrain) | 51.93 | No | CLCNet: Rethinking of Ensemble Modeling with Cla... | 2022-05-19 | Code
74 | UniNet-B6 | 51 | No | UniNet: Unified Architecture Search with Convolu... | 2022-07-12 | Code
75 | Sequencer2D-L↑392 | 50.7 | No | Sequencer: Deep LSTM for Image Classification | 2022-05-04 | Code
76 | VAN-B5 (22K, 384res) | 50.6 | No | Visual Attention Network | 2022-02-20 | Code
77 | PNASNet-5 | 50 | No | Progressive Neural Architecture Search | 2017-12-02 | Code
78 | DAT-B (384 res, IN-1K only) | 49.8 | No | Vision Transformer with Deformable Attention | 2022-01-03 | Code
79 | DAT-B++ (384x384) | 49.7 | No | DAT++: Spatially Dynamic Vision Transformer with... | 2023-09-04 | Code
80 | CAIT-S-36 | 48 | No | Going deeper with Image Transformers | 2021-03-31 | Code
81 | CLCNet (S:D1+D:D5) | 47.43 | No | CLCNet: Rethinking of Ensemble Modeling with Cla... | 2022-05-19 | Code
82 | Swin-B | 47 | No | Swin Transformer: Hierarchical Vision Transforme... | 2021-03-25 | Code
83 | Conformer-B | 46.6 | No | Conformer: Local Features Coupling Global Repres... | 2021-05-09 | Code
84 | DaViT-B (ImageNet-22k) | 46.4 | No | DaViT: Dual Attention Vision Transformers | 2022-04-07 | Code
85 | CLCNet (S:ConvNeXt-L+D:EffNet-B7) (retrain) | 45.43 | No | CLCNet: Rethinking of Ensemble Modeling with Cla... | 2022-05-19 | Code
86 | MaxViT-L (224res) | 43.9 | No | MaxViT: Multi-Axis Vision Transformer | 2022-04-04 | Code
87 | SReT-S (512 res, ImageNet-1K only) | 42.8 | No | Sliced Recursive Transformer | 2021-11-09 | Code
88 | CAFormer-M36 (384 res, 21K) | 42 | No | MetaFormer Baselines for Vision | 2022-10-24 | Code
89 | CAFormer-M36 (384 res) | 42 | No | MetaFormer Baselines for Vision | 2022-10-24 | Code
90 | LITv2-B|384 | 39.7 | No | Fast Vision Transformers with HiLo Attention | 2022-05-26 | Code
91 | UniFormer-L (384 res) | 39.2 | No | UniFormer: Unifying Convolution and Self-attenti... | 2022-01-24 | Code
92 | VAN-B6 (22K) | 38.9 | No | Visual Attention Network | 2022-02-20 | Code
93 | SE-ResNeXt-101, 64x4d, S=2(320px) | 38.2 | No | Towards Better Accuracy-efficiency Trade-offs: D... | 2020-11-30 | Code
94 | RevBiFPN-S6 | 38.1 | No | RevBiFPN: The Fully Reversible Bidirectional Fea... | 2022-06-28 | Code
95 | ConvFormer-M36 (384 res, 21K) | 37.7 | No | MetaFormer Baselines for Vision | 2022-10-24 | Code
96 | ConvFormer-M36 (384 res) | 37.7 | No | MetaFormer Baselines for Vision | 2022-10-24 | Code
97 | NoisyStudent (EfficientNet-B7) | 37 | No | Self-training with Noisy Student improves ImageN... | 2019-11-11 | Code
98 | EfficientNet-B7 | 37 | No | EfficientNet: Rethinking Model Scaling for Convo... | 2019-05-28 | Code
99 | FasterViT-4 | 36.6 | No | FasterViT: Fast Vision Transformers with Hierarc... | 2023-06-09 | Code
100 | ActiveMLP-L | 36.4 | No | Active Token Mixer | 2022-03-11 | Code
101 | VAN-B4 (22K, 384res) | 35.9 | No | Visual Attention Network | 2022-02-20 | Code
102 | NFNet-F1 | 35.54 | No | High-Performance Large-Scale Image Recognition W... | 2021-02-11 | Code
103 | DeiT-B with iRPE-K | 35.368 | No | Rethinking and Improving Relative Position Encod... | 2021-07-29 | Code
104 | MambaVision-L | 34.9 | No | MambaVision: A Hybrid Mamba-Transformer Vision B... | 2024-07-10 | Code
105 | RDNet-L (384 res) | 34.7 | No | DenseNets Reloaded: Paradigm Shift Beyond ResNet... | 2024-03-28 | Code
106 | RDNet-L | 34.7 | No | DenseNets Reloaded: Paradigm Shift Beyond ResNet... | 2024-03-28 | Code
107 | CoAtNet-3 | 34.7 | No | CoAtNet: Marrying Convolution and Attention for ... | 2021-06-09 | Code
108 | DiNAT_s-Large (224x224; Pretrained on ImageNet-22K @ 224x224) | 34.5 | No | Dilated Neighborhood Attention Transformer | 2022-09-29 | Code
109 | T2T-ViT-14|384 | 34.2 | No | Tokens-to-Token ViT: Training Vision Transformer... | 2021-01-28 | Code
110 | MViT-B-24 | 32.7 | No | Multiscale Vision Transformers | 2021-04-22 | Code
111 | CAIT-S-24 | 32.2 | No | Going deeper with Image Transformers | 2021-03-31 | Code
112 | TransNeXt-Small (IN-1K supervised, 384) | 32.1 | No | TransNeXt: Robust Foveal Visual Perception for V... | 2023-11-28 | Code
113 | Next-ViT-L @384 | 32 | No | Next-ViT: Next Generation Vision Transformer for... | 2022-07-12 | Code
114 | VVT-L (384 res) | 31.8 | No | Vicinity Vision Transformer | 2022-06-21 | Code
115 | gMLP-B | 31.6 | No | Pay Attention to MLPs | 2021-05-17 | Code
116 | ResNeXt-101 64x4 | 31.5 | No | Aggregated Residual Transformations for Deep Neu... | 2016-11-16 | Code
117 | Harm-SE-RNX-101 64x4d (320x320, Mean-Max Pooling) | 31.4 | No | Harmonic Convolutional Networks based on Discret... | 2020-01-18 | Code
118 | TinySaver(ConvNeXtV2_h, 0.01 Acc drop) | 31.17 | No | Tiny Models are the Computational Saver for Larg... | 2024-03-26 | Code
119 | T2T-ViTt-24 | 30 | No | Tokens-to-Token ViT: Training Vision Transformer... | 2021-01-28 | Code
120 | ConViT-B+ | 30 | No | ConViT: Improving Vision Transformers with Soft ... | 2021-03-19 | Code
121 | CAIT-XS-36 | 28.8 | No | Going deeper with Image Transformers | 2021-03-31 | Code
122 | ViTAE-B-Stage | 27.6 | No | ViTAE: Vision Transformer Advanced by Exploring ... | 2021-06-07 | Code
123 | T2T-ViT-24 | 27.6 | No | Tokens-to-Token ViT: Training Vision Transformer... | 2021-01-28 | Code
124 | TinyViT-21M-512-distill (512 res, 21k) | 27 | No | TinyViT: Fast Pretraining Distillation for Small... | 2022-07-21 | Code
125 | SE-CoTNetD-152 | 26.5 | No | Contextual Transformer Networks for Visual Recog... | 2021-07-26 | Code
126 | CAFormer-S36 (384 res, 21K) | 26 | No | MetaFormer Baselines for Vision | 2022-10-24 | Code
127 | CAFormer-S36 (384 res) | 26 | No | MetaFormer Baselines for Vision | 2022-10-24 | Code
128 | CvT-21 (384 res, ImageNet-22k pretrain) | 25 | No | CvT: Introducing Convolutions to Vision Transfor... | 2021-03-29 | Code
129 | CvT-21 (384 res) | 24.9 | No | CvT: Introducing Convolutions to Vision Transfor... | 2021-03-29 | Code
130 | ResMLP-B24 + STD | 24.1 | No | - | - | Code
131 | EfficientNetV2-M (21k) | 24 | No | EfficientNetV2: Smaller Models and Faster Training | 2021-04-01 | Code
132 | NASNET-A(6) | 23.8 | No | Learning Transferable Architectures for Scalable... | 2017-07-21 | Code
133 | MaxViT-B (224res) | 23.4 | No | MaxViT: Multi-Axis Vision Transformer | 2022-04-04 | Code
134 | CAFormer-B36 (224 res, 21K) | 23.2 | No | MetaFormer Baselines for Vision | 2022-10-24 | Code
135 | CAFormer-B36 (224 res) | 23.2 | No | MetaFormer Baselines for Vision | 2022-10-24 | Code
136 | UniNet-B5 | 23.2 | No | UniNet: Unified Architecture Search with Convolu... | 2021-10-08 | -
137 | MetaFormer PoolFormer-M48 | 23.2 | No | MetaFormer Is Actually What You Need for Vision | 2021-11-22 | Code
138 | ConvFormer-B36 (224 res, 21K) | 22.6 | No | MetaFormer Baselines for Vision | 2022-10-24 | Code
139 | ConvFormer-B36 (224 res) | 22.6 | No | MetaFormer Baselines for Vision | 2022-10-24 | Code
140 | ConvFormer-S36 (384 res, 21K) | 22.4 | No | MetaFormer Baselines for Vision | 2022-10-24 | Code
141 | ConvFormer-S36 (384 res) | 22.4 | No | MetaFormer Baselines for Vision | 2022-10-24 | Code
142 | Oct-ResNet-152 (SE) | 22.2 | No | Drop an Octave: Reducing Spatial Redundancy in C... | 2019-04-10 | Code
143 | RevBiFPN-S5 | 21.8 | No | RevBiFPN: The Fully Reversible Bidirectional Fea... | 2022-06-28 | Code
144 | UniNet-B5 | 20.4 | No | UniNet: Unified Architecture Search with Convolu... | 2022-07-12 | Code
145 | EfficientViT-L2 (r384) | 20 | No | EfficientViT: Multi-Scale Linear Attention for H... | 2022-05-29 | Code
146 | T2T-ViTt-19 | 19.6 | No | Tokens-to-Token ViT: Training Vision Transformer... | 2021-01-28 | Code
147 | TinySaver(ConvNeXtV2_h, 0.5 Acc drop) | 19.41 | No | Tiny Models are the Computational Saver for Larg... | 2024-03-26 | Code
148 | CAIT-XS-24 | 19.3 | No | Going deeper with Image Transformers | 2021-03-31 | Code
149 | BoTNet T5 | 19.3 | No | Bottleneck Transformers for Visual Recognition | 2021-01-27 | Code
150 | EfficientNet-B6 | 19 | No | EfficientNet: Rethinking Model Scaling for Convo... | 2019-05-28 | Code
151 | MIRL(ViT-S-54) | 18.8 | No | Masked Image Residual Learning for Scaling Deepe... | 2023-09-25 | Code
152 | ResNeXt-101, 64x4d, S=2(224px) | 18.8 | No | Towards Better Accuracy-efficiency Trade-offs: D... | 2020-11-30 | Code
153 | CLCNet (S:B4+D:B7) | 18.58 | No | CLCNet: Rethinking of Ensemble Modeling with Cla... | 2022-05-19 | Code
154 | SReT-S (384 res, ImageNet-1K only) | 18.5 | No | Sliced Recursive Transformer | 2021-11-09 | Code
155 | RepVGG-B2 | 18.4 | No | RepVGG: Making VGG-style ConvNets Great Again | 2021-01-11 | Code
156 | FasterViT-3 | 18.2 | No | FasterViT: Fast Vision Transformers with Hierarc... | 2023-06-09 | Code
157 | Transformer local-attention (NesT-B) | 17.9 | No | Nested Hierarchical Transformer: Towards Accurat... | 2021-05-26 | Code
158 | RVT-B* | 17.7 | No | Towards Robust Vision Transformer | 2021-05-17 | Code
159 | VAN-B5 (22K) | 17.2 | No | Visual Attention Network | 2022-02-20 | Code
160 | KAT-B* | 17.06 | No | Kolmogorov-Arnold Transformer | 2024-09-16 | Code
161 | ConViT-B | 17 | No | ConViT: Improving Vision Transformers with Soft ... | 2021-03-19 | Code
162 | GLiT-Bases | 17 | No | GLiT: Neural Architecture Search for Global and ... | 2021-07-07 | Code
163 | T2T-ViT-19 | 17 | No | Tokens-to-Token ViT: Training Vision Transformer... | 2021-01-28 | Code
164 | DeiT-B | 16.87 | No | Kolmogorov-Arnold Transformer | 2024-09-16 | Code
165 | ViT-B/16 | 16.87 | No | Kolmogorov-Arnold Transformer | 2024-09-16 | Code
166 | Pyramid ViG-B | 16.8 | No | Vision GNN: An Image is Worth Graph of Nodes | 2022-06-01 | Code
167 | DAT-B++ (224x224) | 16.6 | No | DAT++: Spatially Dynamic Vision Transformer with... | 2023-09-04 | Code
168 | Sequencer2D-L | 16.6 | No | Sequencer: Deep LSTM for Image Classification | 2022-05-04 | Code
169 | MixMIM-B | 16.3 | No | MixMAE: Mixed and Masked Autoencoder for Efficie... | 2022-05-26 | Code
170 | CvT-13 (384 res) | 16.3 | No | CvT: Introducing Convolutions to Vision Transfor... | 2021-03-29 | Code
171 | InternImage-B | 16 | No | InternImage: Exploring Large-Scale Vision Founda... | 2022-11-10 | Code
172 | LV-ViT-M | 16 | No | All Tokens Matter: Token Labeling for Training B... | 2021-04-22 | Code
173 | MogaNet-L | 15.9 | No | MogaNet: Multi-order Gated Aggregation Network | 2022-11-07 | Code
174 | Assemble-ResNet152 | 15.8 | No | Compounding the Performance Improvements of Asse... | 2020-01-17 | Code
175 | BossNet-T1 | 15.8 | No | BossNAS: Exploring Hybrid CNN-transformers with ... | 2021-03-23 | Code
176 | CoAtNet-2 | 15.7 | No | CoAtNet: Marrying Convolution and Attention for ... | 2021-06-09 | Code
177 | DaViT-B | 15.5 | No | DaViT: Dual Attention Vision Transformers | 2022-04-07 | Code
178 | ViT-S @384 (DeiT III) | 15.5 | No | DeiT III: Revenge of the ViT | 2022-04-14 | Code
179 | RDNet-B | 15.4 | No | DenseNets Reloaded: Paradigm Shift Beyond ResNet... | 2024-03-28 | Code
180 | DeepMAD-89M | 15.4 | No | DeepMAD: Mathematical Architecture Design for De... | 2023-03-05 | Code
181 | Shift-B | 15.2 | No | When Shift Operation Meets Vision Transformer: A... | 2022-01-26 | Code
182 | Twins-SVT-L | 15.1 | No | Twins: Revisiting the Design of Spatial Attentio... | 2021-04-28 | Code
183 | MambaVision-B | 15 | No | MambaVision: A Hybrid Mamba-Transformer Vision B... | 2024-07-10 | Code
184 | Wave-ViT-L | 14.8 | No | Wave-ViT: Unifying Wavelet and Transformers for ... | 2022-07-11 | Code
185 | GC ViT-B | 14.8 | No | Global Context Vision Transformers | 2022-06-20 | Code
186 | CAIT-XXS-36 | 14.3 | No | Going deeper with Image Transformers | 2021-03-31 | Code
187 | ZenNAS (0.8ms) | 13.9 | No | Zen-NAS: A Zero-Shot NAS for High-Performance De... | 2021-02-01 | Code
188 | TinyViT-21M-384-distill (384 res, 21k) | 13.8 | No | TinyViT: Fast Pretraining Distillation for Small... | 2022-07-21 | Code
189 | DiNAT-Base | 13.7 | No | Dilated Neighborhood Attention Transformer | 2022-09-29 | Code
190 | NAT-Base | 13.7 | No | Neighborhood Attention Transformer | 2022-04-14 | Code
191 | HRFormer-B | 13.7 | No | HRFormer: High-Resolution Transformer for Dense ... | 2021-10-18 | Code
192 | CAFormer-S18 (384 res, 21K) | 13.4 | No | MetaFormer Baselines for Vision | 2022-10-24 | Code
193 | CAFormer-S18 (384 res) | 13.4 | No | MetaFormer Baselines for Vision | 2022-10-24 | Code
194 | ViL-Base-D | 13.4 | No | Multi-Scale Vision Longformer: A New Vision Tran... | 2021-03-29 | Code
195 | CAFormer-M36 (224 res, 21K) | 13.2 | No | MetaFormer Baselines for Vision | 2022-10-24 | Code
196 | CAFormer-M36 (224 res) | 13.2 | No | MetaFormer Baselines for Vision | 2022-10-24 | Code
197 | LITv2-B | 13.2 | No | Fast Vision Transformers with HiLo Attention | 2022-05-26 | Code
198 | GTP-DeiT-B/P8 | 13.1 | No | GTP-ViT: Efficient Vision Transformers via Graph... | 2023-11-06 | Code
199 | CeiT-S (384 finetune res) | 12.9 | No | Incorporating Convolution Designs into Visual Tr... | 2021-03-22 | Code
200 | ConvFormer-M36 (224 res, 21K) | 12.8 | No | MetaFormer Baselines for Vision | 2022-10-24 | Code
201 | ConvFormer-M36 (224 res) | 12.8 | No | MetaFormer Baselines for Vision | 2022-10-24 | Code
202 | UniFormer-L | 12.6 | No | UniFormer: Unifying Convolution and Self-attenti... | 2022-01-24 | Code
203 | PiT-B | 12.5 | No | Rethinking Spatial Dimensions of Vision Transfor... | 2021-03-30 | Code
204 | NFNet-F0 | 12.38 | No | High-Performance Large-Scale Image Recognition W... | 2021-02-11 | Code
205 | CycleMLP-B5 | 12.3 | No | CycleMLP: A MLP-like Architecture for Dense Pred... | 2021-07-21 | Code
206 | VAN-B4 (22K) | 12.2 | No | Visual Attention Network | 2022-02-20 | Code
207 | ViTAE-S-Stage | 12 | No | ViTAE: Vision Transformer Advanced by Exploring ... | 2021-06-07 | Code
208 | PVTv2-B4 | 11.8 | No | PVT v2: Improved Baselines with Pyramid Vision T... | 2021-06-25 | Code
209 | MaxViT-S (224res) | 11.7 | No | MaxViT: Multi-Axis Vision Transformer | 2022-04-04 | Code
210 | ConvFormer-S18 (384 res, 21K) | 11.6 | No | MetaFormer Baselines for Vision | 2022-10-24 | Code
211 | ConvFormer-S18 (384 res) | 11.6 | No | MetaFormer Baselines for Vision | 2022-10-24 | Code
212 | ResNet-152 | 11.3 | No | Deep Residual Learning for Image Recognition | 2015-12-10 | Code
213 | RepVGG-B2g4 | 11.3 | No | RepVGG: Making VGG-style ConvNets Great Again | 2021-01-11 | Code
214 | ScaleNet-152 | 11.2 | No | Data-Driven Neuron Allocation for Scale Aggregat... | 2019-04-20 | Code
215 | Sequencer2D-M | 11.1 | No | Sequencer: Deep LSTM for Image Classification | 2022-05-04 | Code
216 | CCT-14/7x2 | 11.06 | No | Escaping the Big Data Paradigm with Compact Tran... | 2021-04-12 | Code
217 | EfficientViT-L2 (r288) | 11 | No | EfficientViT: Multi-Scale Linear Attention for H... | 2022-05-29 | Code
218 | AutoFormer-base | 11 | No | AutoFormer: Searching Transformers for Visual Re... | 2021-07-01 | Code
219 | BoTNet T4 | 10.9 | No | Bottleneck Transformers for Visual Recognition | 2021-01-27 | Code
220 | ECA-Net (ResNet-152) | 10.83 | No | ECA-Net: Efficient Channel Attention for Deep Co... | 2019-10-08 | Code
221 | VVT-L (224 res) | 10.8 | No | Vicinity Vision Transformer | 2022-06-21 | Code
222 | RevBiFPN-S4 | 10.6 | No | RevBiFPN: The Fully Reversible Bidirectional Fea... | 2022-06-28 | Code
223 | Transformer local-attention (NesT-S) | 10.4 | No | Nested Hierarchical Transformer: Towards Accurat... | 2021-05-26 | Code
224 | TransNeXt-Small (IN-1K supervised, 224) | 10.3 | No | TransNeXt: Robust Foveal Visual Perception for V... | 2023-11-28 | Code
225 | ConViT-S+ | 10 | No | ConViT: Improving Vision Transformers with Soft ... | 2021-03-19 | Code
226 | MogaNet-B | 9.9 | No | MogaNet: Multi-order Gated Aggregation Network | 2022-11-07 | Code
227 | UniNet-B4 | 9.9 | No | UniNet: Unified Architecture Search with Convolu... | 2021-10-08 | -
228 | EfficientNet-B5 | 9.9 | No | EfficientNet: Rethinking Model Scaling for Convo... | 2019-05-28 | Code
229 | DeiT-S with iRPE-QKV | 9.77 | No | Rethinking and Improving Relative Position Encod... | 2021-07-29 | Code
230 | QnA-ViT-Base | 9.7 | No | Learned Queries for Efficient Local Attention | 2021-12-21 | Code
231 | T2T-ViT-14 | 9.6 | No | Tokens-to-Token ViT: Training Vision Transformer... | 2021-01-28 | Code
232 | CAIT-XXS-24 | 9.6 | No | Going deeper with Image Transformers | 2021-03-31 | Code
233 | CrossViT-18+ | 9.5 | No | CrossViT: Cross-Attention Multi-Scale Vision Tra... | 2021-03-27 | Code
234 | DeiT-S with iRPE-QK | 9.412 | No | Rethinking and Improving Relative Position Encod... | 2021-07-29 | Code
235 | DAT-S++ | 9.4 | No | DAT++: Spatially Dynamic Vision Transformer with... | 2023-09-04 | Code
236 | CentroidViT-S (arXiv, 2021-02) | 9.4 | No | Centroid Transformers: Learning to Abstract with... | 2021-02-17 | -
237 | DeiT-S with iRPE-K | 9.318 | No | Rethinking and Improving Relative Position Encod... | 2021-07-29 | Code
238 | SpineNet-143 | 9.1 | No | SpineNet: Learning Scale-Permuted Backbone for R... | 2019-12-10 | Code
239 | DAT-S | 9 | No | Vision Transformer with Deformable Attention | 2022-01-03 | Code
240 | CrossViT-18 | 9 | No | CrossViT: Cross-Attention Multi-Scale Vision Tra... | 2021-03-27 | Code
241 | Pyramid ViG-M | 8.9 | No | Vision GNN: An Image is Worth Graph of Nodes | 2022-06-01 | Code
242 | EfficientNetV2-S (21k) | 8.8 | No | EfficientNetV2: Smaller Models and Faster Training | 2021-04-01 | Code
243 | FasterViT-2 | 8.7 | No | FasterViT: Fast Vision Transformers with Hierarc... | 2023-06-09 | Code
244 | RDNet-S | 8.7 | No | DenseNets Reloaded: Paradigm Shift Beyond ResNet... | 2024-03-28 | Code
245 | ViL-Medium-D | 8.7 | No | Multi-Scale Vision Longformer: A New Vision Tran... | 2021-03-29 | Code
246 | GFNet-H-B | 8.6 | No | Global Filter Networks for Image Classification | 2021-07-01 | Code
247 | GC ViT-S | 8.5 | No | Global Context Vision Transformers | 2022-06-20 | Code
248 | SE-CoTNetD-101 | 8.5 | No | Contextual Transformer Networks for Visual Recog... | 2021-07-26 | Code
249 | Shift-S | 8.5 | No | When Shift Operation Meets Vision Transformer: A... | 2022-01-26 | Code
250 | SKNet-101 | 8.46 | No | Selective Kernel Networks | 2019-03-15 | Code
251 | CoAtNet-1 | 8.4 | No | CoAtNet: Marrying Convolution and Attention for ... | 2021-06-09 | Code
252 | SCARLET-A4 | 8.4 | No | SCARLET-NAS: Bridging the Gap between Stability ... | 2019-08-16 | Code
253 | Sequencer2D-S | 8.4 | No | Sequencer: Deep LSTM for Image Classification | 2022-05-04 | Code
254 | Next-ViT-B | 8.3 | No | Next-ViT: Next Generation Vision Transformer for... | 2022-07-12 | Code
255 | Container Container | 8.1 | No | Container: Context Aggregation Network | 2021-06-02 | Code
256 | CAFormer-S36 (224 res, 21K) | 8 | No | MetaFormer Baselines for Vision | 2022-10-24 | Code
257 | ELSA-VOLO-D1 | 8 | No | ELSA: Enhanced Local Self-Attention for Vision T... | 2021-12-23 | Code
258 | CAFormer-S36 (224 res) | 8 | No | MetaFormer Baselines for Vision | 2022-10-24 | Code
259 | InternImage-S | 8 | No | InternImage: Exploring Large-Scale Vision Founda... | 2022-11-10 | Code
260 | GTP-LV-ViT-M/P8 | 8 | No | GTP-ViT: Efficient Vision Transformers via Graph... | 2023-11-06 | Code
261 | RegNetY-8.0GF | 8 | No | Designing Network Design Spaces | 2020-03-30 | Code
262 | ResT-Large | 7.9 | No | ResT: An Efficient Transformer for Visual Recogn... | 2021-05-28 | Code
263 | RandWire-WS | 7.9 | No | Exploring Randomly Wired Neural Networks for Ima... | 2019-04-02 | Code
264 | SGE-ResNet101 | 7.858 | No | Spatial Group-wise Enhance: Improving Semantic F... | 2019-05-23 | Code
265 | DiNAT-Small | 7.8 | No | Dilated Neighborhood Attention Transformer | 2022-09-29 | Code
266 | NAT-Small | 7.8 | No | Neighborhood Attention Transformer | 2022-04-14 | Code
267 | IPT-B | 7.8 | No | IncepFormer: Efficient Inception Transformer wit... | 2022-12-06 | Code
268 | MViT-B-16 | 7.8 | No | Multiscale Vision Transformers | 2021-04-22 | Code
269 | ConvFormer-S36 (224 res, 21K) | 7.6 | No | MetaFormer Baselines for Vision | 2022-10-24 | Code
270 | ConvFormer-S36 (224 res) | 7.6 | No | MetaFormer Baselines for Vision | 2022-10-24 | Code
271 | ResNet-101 | 7.6 | No | Deep Residual Learning for Image Recognition | 2015-12-10 | Code
272 | AOGNet-40M-AN | 7.51 | No | Attentive Normalization | 2019-08-04 | Code
273 | LITv2-M | 7.5 | No | Fast Vision Transformers with HiLo Attention | 2022-05-26 | Code
274 | MambaVision-S | 7.5 | No | MambaVision: A Hybrid Mamba-Transformer Vision B... | 2024-07-10 | Code
275 | ScaleNet-101 | 7.5 | No | Data-Driven Neuron Allocation for Scale Aggregat... | 2019-04-20 | Code
276 | ECA-Net (ResNet-101) | 7.35 | No | ECA-Net: Efficient Channel Attention for Deep Co... | 2019-10-08 | Code
277 | BoTNet T3 | 7.3 | No | Bottleneck Transformers for Visual Recognition | 2021-01-27 | Code
278 | Wave-ViT-B | 7.2 | No | Wave-ViT: Unifying Wavelet and Transformers for ... | 2022-07-11 | Code
279 | CvT-21 | 7.1 | No | CvT: Introducing Convolutions to Vision Transfor... | 2021-03-29 | Code
280 | HCGNet-C | 7.1 | No | Gated Convolutional Networks with Hybrid Connect... | 2019-08-26 | Code
281 | gSwin-S | 7 | No | gSwin: Gated MLP Vision Model with Hierarchical ... | 2022-08-24 | -
282 | PVTv2-B3 | 6.9 | No | PVT v2: Improved Baselines with Pyramid Vision T... | 2021-06-25 | Code
283 | ViTAE-13M | 6.8 | No | ViTAE: Vision Transformer Advanced by Exploring ... | 2021-06-07 | Code
284 | RedNet-152 | 6.8 | No | Involution: Inverting the Inherence of Convoluti... | 2021-03-10 | Code
285 | ViL-Base-W | 6.74 | No | Multi-Scale Vision Longformer: A New Vision Tran... | 2021-03-29 | Code
286 | LV-ViT-S | 6.6 | No | All Tokens Matter: Token Labeling for Training B... | 2021-04-22 | Code
287 | EfficientViT-B3 (r288) | 6.5 | No | EfficientViT: Multi-Scale Linear Attention for H... | 2022-05-29 | Code
288 | CI2P-ViT | 6.442 | No | Compress image to patches for Vision Transformer | 2025-02-14 | Code
289 | CrossViT-15+ | 6.1 | No | CrossViT: Cross-Attention Multi-Scale Vision Tra... | 2021-03-27 | Code
290 | ResMLP-S24 | 6 | No | ResMLP: Feedforward networks for image classific... | 2021-05-07 | Code
291 | Next-ViT-S | 5.8 | No | Next-ViT: Next Generation Vision Transformer for... | 2022-07-12 | Code
292 | Transformer local-attention (NesT-T) | 5.8 | No | Nested Hierarchical Transformer: Towards Accurat... | 2021-05-26 | Code
293 | CrossViT-15 | 5.8 | No | CrossViT: Cross-Attention Multi-Scale Vision Tra... | 2021-03-27 | Code
294 | TransNeXt-Tiny (IN-1K supervised, 224) | 5.7 | No | TransNeXt: Robust Foveal Visual Perception for V... | 2023-11-28 | Code
295 | MOAT-0 1K only | 5.7 | No | MOAT: Alternating Mobile Convolution and Attenti... | 2022-10-04 | Code
296 | MaxViT-T (224res) | 5.6 | No | MaxViT: Multi-Axis Vision Transformer | 2022-04-04 | Code
297 | ConViT-S | 5.4 | No | ConViT: Improving Vision Transformers with Soft ... | 2021-03-19 | Code
298 | ResNeSt-50 | 5.39 | No | ResNeSt: Split-Attention Networks | 2020-04-19 | Code
299 | EfficientViT-L1 (r224) | 5.3 | No | EfficientViT: Multi-Scale Linear Attention for H... | 2022-05-29 | Code
300 | FasterViT-1 | 5.3 | No | FasterViT: Fast Vision Transformers with Hierarc... | 2023-06-09 | Code
301 | MambaVision-T2 | 5.1 | No | MambaVision: A Hybrid Mamba-Transformer Vision B... | 2024-07-10 | Code
302 | AutoFormer-small | 5.1 | No | AutoFormer: Searching Transformers for Visual Re... | 2021-07-01 | Code
303 | MogaNet-S | 5 | No | MogaNet: Multi-order Gated Aggregation Network | 2022-11-07 | Code
304 | RDNet-T | 5 | No | DenseNets Reloaded: Paradigm Shift Beyond ResNet... | 2024-03-28 | Code
305 | VAN-B2 | 5 | No | Visual Attention Network | 2022-02-20 | Code
306 | Visformer-S | 4.9 | No | Visformer: The Vision-friendly Transformer | 2021-04-26 | Code
307 | ViL-Small | 4.86 | No | Multi-Scale Vision Longformer: A New Vision Tran... | 2021-03-29 | Code
308 | ELSA-Swin-T | 4.8 | No | ELSA: Enhanced Local Self-Attention for Vision T... | 2021-12-23 | Code
309 | GTP-LV-ViT-S/P8 | 4.8 | No | GTP-ViT: Efficient Vision Transformers via Graph... | 2023-11-06 | Code
310 | LocalViT-PVT | 4.8 | No | LocalViT: Bringing Locality to Vision Transformers | 2021-04-12 | Code
311 | Wave-ViT-S | 4.7 | No | Wave-ViT: Unifying Wavelet and Transformers for ... | 2022-07-11 | Code
312 | GC ViT-T | 4.7 | No | Global Context Vision Transformers | 2022-06-20 | Code
313 | IPT-S | 4.7 | No | IncepFormer: Efficient Inception Transformer wit... | 2022-12-06 | Code
314 | MViTv2-T | 4.7 | No | MViTv2: Improved Multiscale Vision Transformers ... | 2021-12-02 | Code
315 | RVT-S* | 4.7 | No | Towards Robust Vision Transformer | 2021-05-17 | Code
316 | RedNet-101 | 4.7 | No | Involution: Inverting the Inherence of Convoluti... | 2021-03-10 | Code
317 | ResNet-RS-50 (160 image res) | 4.6 | No | Revisiting ResNets: Improved Training and Scalin... | 2021-03-13 | Code
318 | Pyramid ViG-S | 4.6 | No | Vision GNN: An Image is Worth Graph of Nodes | 2022-06-01 | Code
319 | DAT-T | 4.6 | No | Vision Transformer with Deformable Attention | 2022-01-03 | Code
320 | LocalViT-S | 4.6 | No | LocalViT: Bringing Locality to Vision Transformers | 2021-04-12 | Code
321 | ViTAE-T-Stage | 4.6 | No | ViTAE: Vision Transformer Advanced by Exploring ... | 2021-06-07 | Code
322 | ConvNeXt-T | 4.5 | No | A ConvNet for the 2020s | 2022-01-10 | Code
323 | CeiT-S | 4.5 | No | Incorporating Convolution Designs into Visual Tr... | 2021-03-22 | Code
324 | CvT-13 | 4.5 | No | CvT: Introducing Convolutions to Vision Transfor... | 2021-03-29 | Code
325 | Swin-T | 4.5 | No | Swin Transformer: Hierarchical Vision Transforme... | 2021-03-25 | Code
326 | QnA-ViT-Small | 4.4 | No | Learned Queries for Efficient Local Attention | 2021-12-21 | Code
327 | MambaVision-T | 4.4 | No | MambaVision: A Hybrid Mamba-Transformer Vision B... | 2024-07-10 | Code
328 | Shift-T | 4.4 | No | When Shift Operation Meets Vision Transformer: A... | 2022-01-26 | Code
329 | GLiT-Smalls | 4.4 | No | GLiT: Neural Architecture Search for Global and ... | 2021-07-07 | Code
330 | ResNeSt-50-fast | 4.34 | No | ResNeSt: Split-Attention Networks | 2020-04-19 | Code
331 | TinyViT-21M-distill (21k) | 4.3 | No | TinyViT: Fast Pretraining Distillation for Small... | 2022-07-21 | Code
332 | DAT-T++ | 4.3 | No | DAT++: Spatially Dynamic Vision Transformer with... | 2023-09-04 | Code
333 | NAT-Tiny | 4.3 | No | Neighborhood Attention Transformer | 2022-04-14 | Code
334 | TinyViT-21M | 4.3 | No | TinyViT: Fast Pretraining Distillation for Small... | 2022-07-21 | Code
335 | DiNAT-Tiny | 4.3 | No | Dilated Neighborhood Attention Transformer | 2022-09-29 | Code
336 | Mixer-S16 + STD | 4.3 | No | - | - | Code
337 | EfficientNet-B4 | 4.2 | No | EfficientNet: Rethinking Model Scaling for Convo... | 2019-05-28 | Code
338 | CoAtNet-0 | 4.2 | No | CoAtNet: Marrying Convolution and Attention for ... | 2021-06-09 | Code
339 | SGE-ResNet50 | 4.127 | No | Spatial Group-wise Enhance: Improving Semantic F... | 2019-05-23 | Code
340 | CAFormer-S18 (224 res, 21K) | 4.1 | No | MetaFormer Baselines for Vision | 2022-10-24 | Code
341 | CAFormer-S18 (224 res) | 4.1 | No | MetaFormer Baselines for Vision | 2022-10-24 | Code
342 | CvT-13-NAS | 4.1 | No | CvT: Introducing Convolutions to Vision Transfor... | 2021-03-29 | Code
343 | SE-CoTNetD-50 | 4.1 | No | Contextual Transformer Networks for Visual Recog... | 2021-07-26 | Code
344 | EfficientViT-B3 (r224) | 4 | No | EfficientViT: Multi-Scale Linear Attention for H... | 2022-05-29 | Code
345 | CycleMLP-B2 + STD | 4 | No | - | - | Code
346 | PVTv2-B2 | 4 | No | PVT v2: Improved Baselines with Pyramid Vision T... | 2021-06-25 | Code
347 | ActiveMLP-T | 4 | No | Active Token Mixer | 2022-03-11 | Code
348 | RegNetY-4.0GF | 4 | No | Designing Network Design Spaces | 2020-03-30 | Code
349 | ViTAE-6M | 4 | No | ViTAE: Vision Transformer Advanced by Exploring ... | 2021-06-07 | Code
350 | ConvFormer-S18 (224 res, 21K) | 3.9 | No | MetaFormer Baselines for Vision | 2022-10-24 | Code
351ConvFormer-S18 (224 res)3.9NoMetaFormer Baselines for Vision2022-10-24Code
352ECA-Net (ResNet-50)3.86NoECA-Net: Efficient Channel Attention for Deep Co...2019-10-08Code
353ScaleNet-503.8NoData-Driven Neuron Allocation for Scale Aggregat...2019-04-20Code
354 | ResNet-50 | 3.8 | No | Deep Residual Learning for Image Recognition | 2015-12-10 | Code
355 | LITv2-S | 3.7 | No | Fast Vision Transformers with HiLo Attention | 2022-05-26 | Code
356 | DY-ResNet-18 | 3.7 | No | Dynamic Convolution: Attention over Convolution ... | 2019-12-07 | Code
357 | UniFormer-S | 3.6 | No | UniFormer: Unifying Convolution and Self-attenti... | 2022-01-24 | Code
358 | gSwin-T | 3.6 | No | gSwin: Gated MLP Vision Model with Hierarchical ... | 2022-08-24 | -
359 | CeiT-T (384 finetune res) | 3.6 | No | Incorporating Convolution Designs into Visual Tr... | 2021-03-22 | Code
360 | CAS-ViT-T | 3.597 | No | CAS-ViT: Convolutional Additive Self-attention V... | 2024-08-07 | Code
361 | EdgeFormer-S | 3.48 | No | ParC-Net: Position Aware Circular Convolution wi... | 2022-03-08 | Code
362 | ReXNet_3.0 | 3.4 | No | Rethinking Channel Dimensions for Efficient Mode... | 2020-07-02 | Code
363 | GTP-DeiT-S/P8 | 3.4 | No | GTP-ViT: Efficient Vision Transformers via Graph... | 2023-11-06 | Code
364 | RevBiFPN-S3 | 3.33 | No | RevBiFPN: The Fully Reversible Bidirectional Fea... | 2022-06-28 | Code
365 | FasterViT-0 | 3.3 | No | FasterViT: Fast Vision Transformers with Hierarc... | 2023-06-09 | Code
366 | Container-Light | 3.2 | No | Container: Context Aggregation Network | 2021-06-02 | Code
367 | ResMLP-12 (distilled, class-MLP) | 3 | No | ResMLP: Feedforward networks for image classific... | 2021-05-07 | Code
368 | ViTAE-T | 3 | No | ViTAE: Vision Transformer Advanced by Exploring ... | 2021-06-07 | Code
369 | MobileOne-S4 | 2.978 | No | MobileOne: An Improved One millisecond Mobile Ba... | 2022-06-08 | Code
370 | PiT-S | 2.9 | No | Rethinking Spatial Dimensions of Vision Transfor... | 2021-03-30 | Code
371 | MobileOne-S4 (distill) | 2.9 | No | MobileOne: An Improved One millisecond Mobile Ba... | 2022-06-08 | Code
372 | TransNeXt-Micro (IN-1K supervised, 224) | 2.7 | No | TransNeXt: Robust Foveal Visual Perception for V... | 2023-11-28 | Code
373 | NAT-Mini | 2.7 | No | Neighborhood Attention Transformer | 2022-04-14 | Code
374 | DiNAT-Mini | 2.7 | No | Dilated Neighborhood Attention Transformer | 2022-09-29 | Code
375 | RedNet-50 | 2.7 | No | Involution: Inverting the Inherence of Convoluti... | 2021-03-10 | Code
376 | GC ViT-XT | 2.6 | No | Global Context Vision Transformers | 2022-06-20 | Code
377 | EdgeNeXt-S | 2.6 | No | EdgeNeXt: Efficiently Amalgamated CNN-Transforme... | 2022-06-21 | Code
378 | LR-Net-26 | 2.6 | No | Local Relation Networks for Image Recognition | 2019-04-25 | Code
379 | DeiT-Ti with iRPE-K | 2.568 | No | Rethinking and Improving Relative Position Encod... | 2021-07-29 | Code
380 | QnA-ViT-Tiny | 2.5 | No | Learned Queries for Efficient Local Attention | 2021-12-21 | Code
381 | VAN-B1 | 2.5 | No | Visual Attention Network | 2022-02-20 | Code
382 | UniNet-B2 | 2.4 | No | UniNet: Unified Architecture Search with Convolu... | 2021-10-08 | -
383 | HVT-S-1 | 2.4 | No | Scalable Vision Transformers with Hierarchical P... | 2021-03-19 | Code
384 | LeViT-384 | 2.334 | No | LeViT: a Vision Transformer in ConvNet's Clothin... | 2021-04-02 | Code
385 | IPT-T | 2.3 | No | IncepFormer: Efficient Inception Transformer wit... | 2022-12-06 | Code
386 | gSwin-VT | 2.3 | No | gSwin: Gated MLP Vision Model with Hierarchical ... | 2022-08-24 | -
387 | RedNet-38 | 2.2 | No | Involution: Inverting the Inherence of Convoluti... | 2021-03-10 | Code
388 | Ghost-ResNet-50 (s=2) | 2.2 | No | GhostNet: More Features from Cheap Operations | 2019-11-27 | Code
389 | FBNetV5-F-CLS | 2.1 | No | FBNetV5: Neural Architecture Search for Multiple... | 2021-11-19 | -
390 | EfficientViT-B2 (r256) | 2.1 | No | EfficientViT: Multi-Scale Linear Attention for H... | 2022-05-29 | Code
391 | GC ViT-XXT | 2.1 | No | Global Context Vision Transformers | 2022-06-20 | Code
392 | PVTv2-B1 | 2.1 | No | PVT v2: Improved Baselines with Pyramid Vision T... | 2021-06-25 | Code
393 | TinyViT-11M-distill (21k) | 2 | No | TinyViT: Fast Pretraining Distillation for Small... | 2022-07-21 | Code
394 | CloFormer-S | 2 | No | Rethinking Local Perception in Lightweight Visio... | 2023-03-31 | Code
395 | TinyViT-11M | 2 | No | TinyViT: Fast Pretraining Distillation for Small... | 2022-07-21 | Code
396 | HCGNet-B | 2 | No | Gated Convolutional Networks with Hybrid Connect... | 2019-08-26 | Code
397 | ConViT-Ti+ | 2 | No | ConViT: Improving Vision Transformers with Soft ... | 2021-03-19 | Code
398 | ResT-Small | 1.9 | No | ResT: An Efficient Transformer for Visual Recogn... | 2021-05-28 | Code
399 | MobileOne-S3 | 1.896 | No | MobileOne: An Improved One millisecond Mobile Ba... | 2022-06-08 | Code
400 | CAS-ViT-M | 1.887 | No | CAS-ViT: Convolutional Additive Self-attention V... | 2024-08-07 | Code
401 | NASViT (supernet) | 1.881 | No | - | - | Code
402 | MobileViTv3-1.0 | 1.876 | No | MobileViTv3: Mobile-Friendly Vision Transformer ... | 2022-09-30 | Code
403 | MobileViTv3-S | 1.841 | No | MobileViTv3: Mobile-Friendly Vision Transformer ... | 2022-09-30 | Code
404 | DY-ResNet-10 | 1.82 | No | Dynamic Convolution: Attention over Convolution ... | 2019-12-07 | Code
405 | HRFormer-T | 1.8 | No | HRFormer: High-Resolution Transformer for Dense ... | 2021-10-18 | Code
406 | MobileViTv2-1.0 | 1.8 | No | Separable Self-attention for Mobile Vision Trans... | 2022-06-06 | Code
407 | DVT (T2T-ViT-12) | 1.7 | No | Not All Images are Worth 16x16 Words: Dynamic Tr... | 2021-05-31 | Code
408 | Pyramid ViG-Ti | 1.7 | No | Vision GNN: An Image is Worth Graph of Nodes | 2022-06-01 | Code
409 | RedNet-26 | 1.7 | No | Involution: Inverting the Inherence of Convoluti... | 2021-03-10 | Code
410 | FixEfficientNet-B0 | 1.6 | No | Fixing the train-test resolution discrepancy: Fi... | 2020-03-18 | Code
411 | RegNetY-1.6GF | 1.6 | No | Designing Network Design Spaces | 2020-03-30 | Code
412 | ReXNet_2.0 | 1.5 | No | Rethinking Channel Dimensions for Efficient Mode... | 2020-07-02 | Code
413 | MogaNet-T (256res) | 1.44 | No | MogaNet: Multi-order Gated Aggregation Network | 2022-11-07 | Code
414 | PiT-XS | 1.4 | No | Rethinking Spatial Dimensions of Vision Transfor... | 2021-03-30 | Code
415 | GLiT-Tinys | 1.4 | No | GLiT: Neural Architecture Search for Global and ... | 2021-07-07 | Code
416 | LocalViT-TNT | 1.4 | No | LocalViT: Bringing Locality to Vision Transformers | 2021-04-12 | Code
417 | RevBiFPN-S2 | 1.37 | No | RevBiFPN: The Fully Reversible Bidirectional Fea... | 2022-06-28 | Code
418 | TinyViT-5M-distill (21k) | 1.3 | No | TinyViT: Fast Pretraining Distillation for Small... | 2022-07-21 | Code
419 | RVT-Ti* | 1.3 | No | Towards Robust Vision Transformer | 2021-05-17 | Code
420 | TinyViT-5M | 1.3 | No | TinyViT: Fast Pretraining Distillation for Small... | 2022-07-21 | Code
421 | Visformer-Ti | 1.3 | No | Visformer: The Vision-friendly Transformer | 2021-04-26 | Code
422 | ViL-Tiny-RPB | 1.3 | No | Multi-Scale Vision Longformer: A New Vision Tran... | 2021-03-29 | Code
423 | LocalViT-T | 1.3 | No | LocalViT: Bringing Locality to Vision Transformers | 2021-04-12 | Code
424 | AutoFormer-tiny | 1.3 | No | AutoFormer: Searching Transformers for Visual Re... | 2021-07-01 | Code
425 | MobileOne-S2 | 1.299 | No | MobileOne: An Improved One millisecond Mobile Ba... | 2022-06-08 | Code
426 | SReT-LT (Fast Knowledge Distillation) | 1.2 | No | A Fast Knowledge Distillation Framework for Visu... | 2021-12-02 | Code
427 | CeiT-T | 1.2 | No | Incorporating Convolution Designs into Visual Tr... | 2021-03-22 | Code
428 | Ghost-ResNet-50 (s=4) | 1.2 | No | GhostNet: More Features from Cheap Operations | 2019-11-27 | Code
429 | LocalViT-T2T | 1.2 | No | LocalViT: Bringing Locality to Vision Transformers | 2021-04-12 | Code
430 | MobileNet-224 (CGD) | 1.198 | No | Compact Global Descriptor for Neural Networks | 2019-07-23 | Code
431 | MobileNetV2 (1.4) | 1.17 | No | MobileNetV2: Inverted Residuals and Linear Bottl... | 2018-01-13 | Code
432 | MobileNet-224 ×1.25 | 1.138 | No | MobileNets: Efficient Convolutional Neural Netwo... | 2017-04-17 | Code
433 | CloFormer-XS | 1.1 | No | Rethinking Local Perception in Lightweight Visio... | 2023-03-31 | Code
434 | SReT-T | 1.1 | No | Sliced Recursive Transformer | 2021-11-09 | Code
435 | LeViT-256 | 1.066 | No | LeViT: a Vision Transformer in ConvNet's Clothin... | 2021-04-02 | Code
436 | MobileViTv3-0.75 | 1.064 | No | MobileViTv3: Mobile-Friendly Vision Transformer ... | 2022-09-30 | Code
437 | MogaNet-XT (256res) | 1.04 | No | MogaNet: Multi-order Gated Aggregation Network | 2022-11-07 | Code
438 | FBNetV5-C-CLS | 1 | No | FBNetV5: Neural Architecture Search for Multiple... | 2021-11-19 | -
439 | EfficientNet-B2 | 1 | No | EfficientNet: Rethinking Model Scaling for Convo... | 2019-05-28 | Code
440 | MobileViTv2-0.75 | 1 | No | Separable Self-attention for Mobile Vision Trans... | 2022-06-06 | Code
441 | ConViT-Ti | 1 | No | ConViT: Improving Vision Transformers with Soft ... | 2021-03-19 | Code
442 | UniNet-B1 | 0.99 | No | UniNet: Unified Architecture Search with Convolu... | 2021-10-08 | -
443 | CAS-ViT-S | 0.932 | No | CAS-ViT: Convolutional Additive Self-attention V... | 2024-08-07 | Code
444 | MobileViTv3-XS | 0.927 | No | MobileViTv3: Mobile-Friendly Vision Transformer ... | 2022-09-30 | Code
445 | VAN-B0 | 0.9 | No | Visual Attention Network | 2022-02-20 | Code
446 | ReXNet_1.5 | 0.86 | No | Rethinking Channel Dimensions for Efficient Mode... | 2020-07-02 | Code
447 | EfficientNet-B0 (CondConv) | 0.826 | No | CondConv: Conditionally Parameterized Convolutio... | 2019-04-10 | Code
448 | MobileOne-S1 | 0.825 | No | MobileOne: An Improved One millisecond Mobile Ba... | 2022-06-08 | Code
449 | ZenNet-400M-SE | 0.82 | No | Zen-NAS: A Zero-Shot NAS for High-Performance De... | 2021-02-01 | Code
450 | MnasNet-A3 | 0.806 | No | MnasNet: Platform-Aware Neural Architecture Sear... | 2018-07-31 | Code
451 | RegNetY-800MF | 0.8 | No | Designing Network Design Spaces | 2020-03-30 | Code
452 | FairNAS-A | 0.776 | No | FairNAS: Rethinking Evaluation Fairness of Weigh... | 2019-07-03 | Code
453 | NASViT-A5 | 0.757 | No | - | - | Code
454 | SCARLET-A | 0.73 | No | SCARLET-NAS: Bridging the Gap between Stability ... | 2019-08-16 | Code
455 | FBNetV5 | 0.726 | No | FBNetV5: Neural Architecture Search for Multiple... | 2021-11-19 | -
456 | AlphaNet-A6 | 0.709 | No | AlphaNet: Improved Training of Supernets with Al... | 2021-02-16 | Code
457 | DVT (T2T-ViT-10) | 0.7 | No | Not All Images are Worth 16x16 Words: Dynamic Tr... | 2021-05-31 | Code
458 | EfficientNet-B1 | 0.7 | No | EfficientNet: Rethinking Model Scaling for Convo... | 2019-05-28 | Code
459 | MobileViT-XS | 0.7 | No | MobileViT: Light-weight, General-purpose, and Mo... | 2021-10-05 | Code
460 | PiT-Ti | 0.7 | No | Rethinking Spatial Dimensions of Vision Transfor... | 2021-03-30 | Code
461 | SReT-ExT | 0.7 | No | Sliced Recursive Transformer | 2021-11-09 | Code
462 | FairNAS-B | 0.69 | No | FairNAS: Rethinking Evaluation Fairness of Weigh... | 2019-07-03 | Code
463 | FBNetV5-A-CLS | 0.685 | No | FBNetV5: Neural Architecture Search for Multiple... | 2021-11-19 | -
464 | MnasNet-A2 | 0.68 | No | MnasNet: Platform-Aware Neural Architecture Sear... | 2018-07-31 | Code
465 | ReXNet_1.3 | 0.66 | No | Rethinking Channel Dimensions for Efficient Mode... | 2020-07-02 | Code
466 | SCARLET-B | 0.658 | No | SCARLET-NAS: Bridging the Gap between Stability ... | 2019-08-16 | Code
467 | FairNAS-C | 0.642 | No | FairNAS: Rethinking Evaluation Fairness of Weigh... | 2019-07-03 | Code
468 | HVT-Ti-1 | 0.64 | No | Scalable Vision Transformers with Hierarchical P... | 2021-03-19 | Code
469 | MUXNet-l | 0.636 | No | MUXConv: Information Multiplexing in Convolution... | 2020-03-31 | Code
470 | LeViT-192 | 0.624 | No | LeViT: a Vision Transformer in ConvNet's Clothin... | 2021-04-02 | Code
471 | MnasNet-A1 | 0.624 | No | MnasNet: Platform-Aware Neural Architecture Sear... | 2018-07-31 | Code
472 | RevBiFPN-S1 | 0.62 | No | RevBiFPN: The Fully Reversible Bidirectional Fea... | 2022-06-28 | Code
473 | MoGA-A | 0.608 | No | MoGA: Searching Beyond MobileNetV3 | 2019-08-04 | Code
474 | ESPNetv2 | 0.602 | No | ESPNetv2: A Light-weight, Power Efficient, and G... | 2018-11-28 | Code
475 | DVT (T2T-ViT-7) | 0.6 | No | Not All Images are Worth 16x16 Words: Dynamic Tr... | 2021-05-31 | Code
476 | CloFormer-XXS | 0.6 | No | Rethinking Local Perception in Lightweight Visio... | 2023-03-31 | Code
477 | RegNetY-600MF | 0.6 | No | Designing Network Design Spaces | 2020-03-30 | Code
478 | MobileNetV2 | 0.6 | Yes | MobileNetV2: Inverted Residuals and Linear Bottl... | 2018-01-13 | Code
479 | PVTv2-B0 | 0.6 | No | PVT v2: Improved Baselines with Pyramid Vision T... | 2021-06-25 | Code
480 | ShuffleNet V2 | 0.597 | No | ShuffleNet V2: Practical Guidelines for Efficien... | 2018-07-30 | Code
481 | NASViT-A4 | 0.591 | No | - | - | Code
482 | TinyNet (GhostNet-A) | 0.591 | No | Model Rubik's Cube: Twisting Resolution, Depth a... | 2020-10-28 | Code
483 | RandWire-WS (small) | 0.583 | No | Exploring Randomly Wired Neural Networks for Ima... | 2019-04-02 | Code
484 | MixNet-L | 0.565 | No | MixConv: Mixed Depthwise Convolutional Kernels | 2019-07-22 | Code
485 | UniNet-B0 | 0.56 | No | UniNet: Unified Architecture Search with Convolu... | 2021-10-08 | -
486 | CAS-ViT-XS | 0.56 | No | CAS-ViT: Convolutional Additive Self-attention V... | 2024-08-07 | Code
487 | SCARLET-C | 0.56 | No | SCARLET-NAS: Bridging the Gap between Stability ... | 2019-08-16 | Code
488 | UniNet-B0 | 0.555 | No | UniNet: Unified Architecture Search with Convolu... | 2022-07-12 | Code
489 | DiCENet | 0.553 | No | DiCENet: Dimension-wise Convolutions for Efficie... | 2019-06-08 | Code
490 | NASViT-A3 | 0.528 | No | - | - | Code
491 | EdgeNeXt-XXS | 0.522 | No | EdgeNeXt: Efficiently Amalgamated CNN-Transforme... | 2022-06-21 | Code
492 | MobileViTv2-0.5 | 0.5 | No | Separable Self-attention for Mobile Vision Trans... | 2022-06-06 | Code
493 | AlphaNet-A5 | 0.491 | No | AlphaNet: Improved Training of Supernets with Al... | 2021-02-16 | Code
494 | MobileViTv3-0.5 | 0.481 | No | MobileViTv3: Mobile-Friendly Vision Transformer ... | 2022-09-30 | Code
495 | AlphaNet-A4 | 0.444 | No | AlphaNet: Improved Training of Supernets with Al... | 2021-02-16 | Code
496 | MobileNet V3-Large 1.0 | 0.438 | No | Searching for MobileNetV3 | 2019-05-06 | Code
497 | MUXNet-m | 0.436 | No | MUXConv: Information Multiplexing in Convolution... | 2020-03-31 | Code
498 | DY-MobileNetV2 ×0.75 | 0.435 | No | Dynamic Convolution: Attention over Convolution ... | 2019-12-07 | Code
499 | AsymmNet-Large ×1.0 | 0.4338 | No | AsymmNet: Towards ultralight convolution neural ... | 2021-04-15 | Code
500 | NASViT-A2 | 0.421 | No | - | - | Code
501 | ReXNet_1.0 | 0.4 | No | Rethinking Channel Dimensions for Efficient Mode... | 2020-07-02 | Code
502 | RegNetY-400MF | 0.4 | No | Designing Network Design Spaces | 2020-03-30 | Code
503 | DGPPF-ResNet50 | 0.4 | No | - | - | Code
504 | EfficientNet-B0 | 0.39 | No | EfficientNet: Rethinking Model Scaling for Convo... | 2019-05-28 | Code
505 | LeViT-128 | 0.376 | No | LeViT: a Vision Transformer in ConvNet's Clothin... | 2021-04-02 | Code
506 | FBNet-C | 0.375 | No | FBNet: Hardware-Aware Efficient ConvNet Design v... | 2018-12-09 | Code
507 | GreedyNAS-A | 0.366 | No | GreedyNAS: Towards Fast One-Shot NAS with Greedy... | 2020-03-25 | -
508 | SkipblockNet-L | 0.364 | No | Bias Loss for Mobile Neural Networks | 2021-07-23 | Code
509 | MixNet-M | 0.36 | No | MixConv: Mixed Depthwise Convolutional Kernels | 2019-07-22 | Code
510 | AlphaNet-A3 | 0.357 | No | AlphaNet: Improved Training of Supernets with Al... | 2021-02-16 | Code
511 | ReXNet_0.9 | 0.35 | No | Rethinking Channel Dimensions for Efficient Mode... | 2020-07-02 | Code
512 | TinyNet-A + RA | 0.339 | No | Model Rubik's Cube: Twisting Resolution, Depth a... | 2020-10-28 | Code
513 | GreedyNAS-B | 0.324 | No | GreedyNAS: Towards Fast One-Shot NAS with Greedy... | 2020-03-25 | -
514 | ECA-Net (MobileNetV2) | 0.32 | No | ECA-Net: Efficient Channel Attention for Deep Co... | 2019-10-08 | Code
515 | AlphaNet-A2 | 0.317 | No | AlphaNet: Improved Training of Supernets with Al... | 2021-02-16 | Code
516 | RevBiFPN-S0 | 0.31 | No | RevBiFPN: The Fully Reversible Bidirectional Fea... | 2022-06-28 | Code
517 | NASViT-A1 | 0.309 | No | - | - | Code
518 | MobileViTv3-XXS | 0.289 | No | MobileViTv3: Mobile-Friendly Vision Transformer ... | 2022-09-30 | Code
519 | LeViT-128S | 0.288 | No | LeViT: a Vision Transformer in ConvNet's Clothin... | 2021-04-02 | Code
520 | GreedyNAS-C | 0.284 | No | GreedyNAS: Towards Fast One-Shot NAS with Greedy... | 2020-03-25 | -
521 | FBNetV5-AC-CLS | 0.28 | No | FBNetV5: Neural Architecture Search for Multiple... | 2021-11-19 | -
522 | AlphaNet-A1 | 0.279 | No | AlphaNet: Improved Training of Supernets with Al... | 2021-02-16 | Code
523 | MobileOne-S0 (distill) | 0.275 | No | MobileOne: An Improved One millisecond Mobile Ba... | 2022-06-08 | Code
524 | MixNet-S | 0.256 | No | MixConv: Mixed Depthwise Convolutional Kernels | 2019-07-22 | Code
525 | SkipblockNet-M | 0.246 | No | Bias Loss for Mobile Neural Networks | 2021-07-23 | Code
526 | MUXNet-s | 0.234 | No | MUXConv: Information Multiplexing in Convolution... | 2020-03-31 | Code
527 | GhostNet ×1.3 | 0.226 | No | GhostNet: More Features from Cheap Operations | 2019-11-27 | Code
528 | FBNetV5-AR-CLS | 0.215 | No | FBNetV5: Neural Architecture Search for Multiple... | 2021-11-19 | -
529 | CoE-Large + CondConv | 0.214 | No | Collaboration of Experts: Achieving 80% Top-1 Ac... | 2021-07-08 | -
530 | NASViT-A0 | 0.208 | No | - | - | Code
531 | AlphaNet-A0 | 0.203 | No | AlphaNet: Improved Training of Supernets with Al... | 2021-02-16 | Code
532 | DY-MobileNetV2 ×0.5 | 0.203 | No | Dynamic Convolution: Attention over Convolution ... | 2019-12-07 | Code
533 | DGPPF-ResNet18 | 0.2 | No | - | - | Code
534 | BasisNet-MV3 | 0.198 | No | BasisNet: Two-stage Model Synthesis for Efficien... | 2021-05-07 | -
535 | CoE-Large | 0.194 | No | Collaboration of Experts: Achieving 80% Top-1 Ac... | 2021-07-08 | -
536 | GhostNet ×1.0 | 0.141 | No | GhostNet: More Features from Cheap Operations | 2019-11-27 | Code
537 | DY-MobileNetV3-Small | 0.137 | No | Dynamic Convolution: Attention over Convolution ... | 2019-12-07 | Code
538 | AsymmNet-Large ×0.5 | 0.1344 | No | AsymmNet: Towards ultralight convolution neural ... | 2021-04-15 | Code
539 | MUXNet-xs | 0.132 | No | MUXConv: Information Multiplexing in Convolution... | 2020-03-31 | Code
540 | DY-MobileNetV2 ×0.35 | 0.124 | No | Dynamic Convolution: Attention over Convolution ... | 2019-12-07 | Code
541 | AsymmNet-Small ×1.0 | 0.1154 | No | AsymmNet: Towards ultralight convolution neural ... | 2021-04-15 | Code
542 | CoE-Small + CondConv + PWLU | 0.1 | No | Collaboration of Experts: Achieving 80% Top-1 Ac... | 2021-07-08 | -
543 | DGPPF-MobileNetV2 | 0.1 | No | - | - | Code
544 | GhostNet ×0.5 | 0.042 | No | GhostNet: More Features from Cheap Operations | 2019-11-27 | Code