Papers With Code 2

A community resource for machine learning research: papers, code, benchmarks, and state-of-the-art results.


Data sourced from the PWC Archive (CC-BY-SA 4.0). Built by the community, for the community.


sharpDARTS: Faster and More Accurate Differentiable Architecture Search

Andrew Hundt, Varun Jain, Gregory D. Hager

2019-03-23 · Image Classification · Hyperparameter Optimization · Neural Architecture Search

Paper · PDF · Code · Code (official)

Abstract

Neural Architecture Search (NAS) has been a source of dramatic improvements in neural network design, with recent results meeting or exceeding the performance of hand-tuned architectures. However, our understanding of how to represent the search space for neural net architectures and how to search that space efficiently are both still in their infancy. We have performed an in-depth analysis to identify limitations in a widely used search space and a recent architecture search method, Differentiable Architecture Search (DARTS). These findings led us to introduce novel network blocks with a more general, balanced, and consistent design; a better-optimized Cosine Power Annealing learning rate schedule; and other improvements. Our resulting sharpDARTS search is 50% faster with a 20-30% relative improvement in final model error on CIFAR-10 when compared to DARTS. Our best single model run has 1.93% (1.98±0.07) validation error on CIFAR-10 and 5.5% error (5.8±0.3) on the recently released CIFAR-10.1 test set. To our knowledge, both are state of the art for models of similar size. This model also generalizes competitively to ImageNet at 25.1% top-1 (7.8% top-5) error. We found improvements for existing search spaces, but does DARTS generalize to new domains? We propose Differentiable Hyperparameter Grid Search and the HyperCuboid search space, which are representations designed to leverage DARTS for more general parameter optimization. Here we find that DARTS fails to generalize when compared against a human's one-shot choice of models. We look back to the DARTS and sharpDARTS search spaces to understand why, and an ablation study reveals an unusual generalization gap. We finally propose Max-W regularization to solve this problem, which proves significantly better than the handmade design. Code will be made available.
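Two mechanisms the abstract references can be made concrete: the DARTS continuous relaxation, which blends candidate operations via a softmax over learnable architecture weights, and a cosine learning rate schedule warped by a power hyperparameter. The sketch below is illustrative only — the function names and the exact form of the power warp are assumptions, not the paper's implementation (with `p == 1` the schedule reduces to plain cosine annealing):

```python
import math

def mixed_op(x, ops, alphas):
    """DARTS-style continuous relaxation: return a softmax-weighted
    sum of every candidate operation applied to the same input."""
    exps = [math.exp(a) for a in alphas]
    z = sum(exps)
    return sum((e / z) * op(x) for e, op in zip(exps, ops))

def cosine_power_annealing(t, T, lr_max, lr_min=0.0, p=2.0):
    """Illustrative power-warped cosine schedule (the paper's exact
    parameterization may differ). cos_term falls from 1 at t=0 to 0
    at t=T; raising it to p > 1 decays the learning rate faster in
    mid-training, mimicking exponential decay."""
    cos_term = (1 + math.cos(math.pi * t / T)) / 2
    return lr_min + (lr_max - lr_min) * cos_term ** p
```

For example, `mixed_op(1.0, [lambda v: v, lambda v: 2 * v], [0.0, 0.0])` weights both candidate operations equally, and during search the `alphas` are trained by gradient descent so the softmax concentrates on the preferred operation.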

Results

Task | Dataset | Metric | Value | Model
--- | --- | --- | --- | ---
Neural Architecture Search | CIFAR-10 Image Classification | Percentage error | 1.98 | SharpSepConvDARTS
Neural Architecture Search | CIFAR-10 | Search Time (GPU days) | 0.8 | SharpSepConvDARTS
Neural Architecture Search | CIFAR-10 | Search Time (GPU days) | 1.8 | sharpDARTS
Neural Architecture Search | ImageNet | Accuracy | 76 | sharpDARTS
Neural Architecture Search | ImageNet | Top-1 Error Rate | 24 | sharpDARTS
Neural Architecture Search | ImageNet | Accuracy | 74.1 | SharpSepConvDARTS
Neural Architecture Search | ImageNet | Top-1 Error Rate | 25.1 | SharpSepConvDARTS
AutoML | CIFAR-10 Image Classification | Percentage error | 1.98 | SharpSepConvDARTS
AutoML | CIFAR-10 | Search Time (GPU days) | 0.8 | SharpSepConvDARTS
AutoML | CIFAR-10 | Search Time (GPU days) | 1.8 | sharpDARTS
AutoML | ImageNet | Accuracy | 76 | sharpDARTS
AutoML | ImageNet | Top-1 Error Rate | 24 | sharpDARTS
AutoML | ImageNet | Accuracy | 74.1 | SharpSepConvDARTS
AutoML | ImageNet | Top-1 Error Rate | 25.1 | SharpSepConvDARTS

Related Papers

Automatic Classification and Segmentation of Tunnel Cracks Based on Deep Learning and Visual Explanations (2025-07-18)
Adversarial attacks to image classification systems using evolutionary algorithms (2025-07-17)
Efficient Adaptation of Pre-trained Vision Transformer underpinned by Approximately Orthogonal Fine-Tuning Strategy (2025-07-17)
Federated Learning for Commercial Image Sources (2025-07-17)
MUPAX: Multidimensional Problem Agnostic eXplainable AI (2025-07-17)
DASViT: Differentiable Architecture Search for Vision Transformer (2025-07-17)
Are encoders able to learn landmarkers for warm-starting of Hyperparameter Optimization? (2025-07-16)
Hashed Watermark as a Filter: Defeating Forging and Overwriting Attacks in Weight-based Neural Network Watermarking (2025-07-15)