Data sourced from the PWC Archive (CC-BY-SA 4.0). Built by the community, for the community.


EfficientNetV2: Smaller Models and Faster Training

Mingxing Tan, Quoc V. Le

Date: 2021-04-01
Tasks: Image Classification, AutoML, Data Augmentation, Neural Architecture Search, Classification

Abstract

This paper introduces EfficientNetV2, a new family of convolutional networks that have faster training speed and better parameter efficiency than previous models. To develop this family of models, we use a combination of training-aware neural architecture search and scaling, to jointly optimize training speed and parameter efficiency. The models were searched from the search space enriched with new ops such as Fused-MBConv. Our experiments show that EfficientNetV2 models train much faster than state-of-the-art models while being up to 6.8x smaller. Our training can be further sped up by progressively increasing the image size during training, but it often causes a drop in accuracy. To compensate for this accuracy drop, we propose to adaptively adjust regularization (e.g., dropout and data augmentation) as well, such that we can achieve both fast training and good accuracy. With progressive learning, our EfficientNetV2 significantly outperforms previous models on ImageNet and CIFAR/Cars/Flowers datasets. By pretraining on the same ImageNet21k, our EfficientNetV2 achieves 87.3% top-1 accuracy on ImageNet ILSVRC2012, outperforming the recent ViT by 2.0% accuracy while training 5x-11x faster using the same computing resources. Code will be available at https://github.com/google/automl/tree/master/efficientnetv2.
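The progressive learning idea in the abstract (start with small images and weak regularization, then increase both together) can be sketched as a per-stage schedule. This is a minimal illustration rather than the authors' implementation: the linear interpolation, the function and field names (`progressive_schedule`, `Stage`, `augment_magnitude`), and the numeric ranges in the usage example are all assumptions chosen for clarity.

```python
from dataclasses import dataclass


@dataclass
class Stage:
    """Settings for one training stage (hypothetical container)."""
    image_size: int
    dropout: float
    augment_magnitude: float


def lerp(start, end, t):
    """Linear interpolation; returns start at t=0 and end at t=1."""
    return (1.0 - t) * start + t * end


def progressive_schedule(num_stages, size_range, dropout_range, magnitude_range):
    """Build a schedule where image size and regularization grow together.

    Assumption: every quantity is interpolated linearly from its minimum
    (first stage) to its maximum (last stage).
    """
    stages = []
    for i in range(num_stages):
        # Fraction of the way through training, from 0.0 to 1.0.
        t = i / (num_stages - 1) if num_stages > 1 else 1.0
        stages.append(
            Stage(
                image_size=int(round(lerp(size_range[0], size_range[1], t))),
                dropout=lerp(dropout_range[0], dropout_range[1], t),
                augment_magnitude=lerp(magnitude_range[0], magnitude_range[1], t),
            )
        )
    return stages


if __name__ == "__main__":
    # Illustrative ranges only; they are not taken from the paper.
    for idx, stage in enumerate(
        progressive_schedule(4, (128, 300), (0.1, 0.3), (5, 15))
    ):
        print(idx, stage)
```

In a training loop, each stage's settings would be applied for an equal share of the epochs, so early epochs run quickly on small, lightly regularized inputs and later epochs use larger images with stronger dropout and augmentation.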

Results

| Task | Dataset | Metric | Value | Model |
|---|---|---|---|---|
| Image Classification | Stanford Cars | Accuracy | 95.1 | EfficientNetV2-L |
| Image Classification | Stanford Cars | Accuracy | 94.6 | EfficientNetV2-M |
| Image Classification | Stanford Cars | Accuracy | 93.8 | EfficientNetV2-S |
| Image Classification | CIFAR-10 | Percentage correct | 99.1 | EfficientNetV2-L |
| Image Classification | CIFAR-10 | Percentage correct | 99 | EfficientNetV2-M |
| Image Classification | CIFAR-10 | Percentage correct | 98.7 | EfficientNetV2-S |
| Image Classification | Flowers-102 | Accuracy | 98.8 | EfficientNetV2-L |
| Image Classification | Flowers-102 | Accuracy | 98.5 | EfficientNetV2-M |
| Image Classification | Flowers-102 | Accuracy | 97.9 | EfficientNetV2-S |
| Image Classification | CIFAR-100 | Percentage correct | 92.3 | EfficientNetV2-L |
| Image Classification | CIFAR-100 | Percentage correct | 92.2 | EfficientNetV2-M |
| Image Classification | CIFAR-100 | Percentage correct | 91.5 | EfficientNetV2-S |
| Image Classification | ImageNet | GFLOPs | 94 | EfficientNetV2-XL (21k) |
| Image Classification | ImageNet | GFLOPs | 53 | EfficientNetV2-L (21k) |
| Image Classification | ImageNet | GFLOPs | 24 | EfficientNetV2-M (21k) |
| Image Classification | ImageNet | GFLOPs | 53 | EfficientNetV2-L |
| Image Classification | ImageNet | GFLOPs | 8.8 | EfficientNetV2-S (21k) |

Related Papers

Automatic Classification and Segmentation of Tunnel Cracks Based on Deep Learning and Visual Explanations (2025-07-18)
Adversarial attacks to image classification systems using evolutionary algorithms (2025-07-17)
Efficient Adaptation of Pre-trained Vision Transformer underpinned by Approximately Orthogonal Fine-Tuning Strategy (2025-07-17)
Federated Learning for Commercial Image Sources (2025-07-17)
MUPAX: Multidimensional Problem Agnostic eXplainable AI (2025-07-17)
Overview of the TalentCLEF 2025: Skill and Job Title Intelligence for Human Capital Management (2025-07-17)
Pixel Perfect MegaMed: A Megapixel-Scale Vision-Language Foundation Model for Generating High Resolution Medical Images (2025-07-17)
DASViT: Differentiable Architecture Search for Vision Transformer (2025-07-17)