iDARTS: Improving DARTS by Node Normalization and Decorrelation Discretization

Huiqun Wang, Ruijie Yang, Di Huang, Yunhong Wang

2021-08-25 | Neural Architecture Search

Abstract

Differentiable ARchiTecture Search (DARTS) uses a continuous relaxation of the network representation and reduces the cost of Neural Architecture Search (NAS) by roughly three orders of magnitude in GPU-days. However, the search process of DARTS is unstable and degrades severely as the number of training epochs grows, which limits its application. In this paper, we argue that this degradation is caused by imbalanced norms across nodes and by highly correlated outputs from the candidate operations. We then propose an improved version of DARTS, namely iDARTS, to address both problems. In the training phase, it introduces node normalization to keep the norms balanced. In the discretization phase, the continuous architecture is approximated according to the similarity between each node's output and the decorrelated operation outputs, rather than the values of the architecture parameters. Extensive evaluation on CIFAR-10 and ImageNet yields error rates of 2.25% and 24.7% with architecture search costs of 0.2 and 1.9 GPU-days, respectively, which demonstrates its effectiveness. Additional analysis shows that iDARTS is more robust and generalizes better than other DARTS-based counterparts.
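
To make the contrast between the two discretization criteria concrete, here is a minimal sketch in plain Python. The first two functions follow the standard DARTS formulation (a softmax-weighted sum of candidate operation outputs, and selection of the operation with the largest architecture parameter). The function `select_by_similarity` is only an illustrative stand-in for the similarity-based idea: the abstract does not specify the similarity measure or the decorrelation procedure, so cosine similarity is an assumption and the decorrelation step is omitted entirely. All function names and the toy numbers are hypothetical.

```python
import math


def softmax(values):
    """Numerically stable softmax over a list of floats."""
    m = max(values)
    exps = [math.exp(v - m) for v in values]
    total = sum(exps)
    return [e / total for e in exps]


def mixed_output(alpha, op_outputs):
    """Continuous relaxation: weight each candidate op output by softmax(alpha)."""
    weights = softmax(alpha)
    dim = len(op_outputs[0])
    return [
        sum(w * out[i] for w, out in zip(weights, op_outputs))
        for i in range(dim)
    ]


def select_by_alpha(alpha):
    """Standard DARTS discretization: keep the op with the largest alpha."""
    return max(range(len(alpha)), key=lambda k: alpha[k])


def cosine(u, v):
    """Cosine similarity; returns 0.0 if either vector has zero norm."""
    dot = sum(a * b for a, b in zip(u, v))
    nu = math.sqrt(sum(a * a for a in u))
    nv = math.sqrt(sum(b * b for b in v))
    return dot / (nu * nv) if nu > 0 and nv > 0 else 0.0


def select_by_similarity(node_output, op_outputs):
    """ASSUMPTION: pick the op whose output is most cosine-similar to the
    node output. The decorrelation step described in the abstract is omitted."""
    return max(
        range(len(op_outputs)),
        key=lambda k: cosine(node_output, op_outputs[k]),
    )


# Toy edge with three candidate operations and 2-dimensional outputs.
alpha = [0.1, 0.5, 0.2]
op_outputs = [[4.0, 0.0], [0.0, 0.1], [0.1, 0.1]]
node_output = mixed_output(alpha, op_outputs)

by_alpha = select_by_alpha(alpha)
by_similarity = select_by_similarity(node_output, op_outputs)
```

In this toy case the two rules disagree: the largest architecture parameter belongs to operation 1, while the mixed output is dominated by operation 0, which has a much larger output norm. This only illustrates that the parameter values and the actual contributions to a node's output need not agree; it does not reproduce the method evaluated in the paper.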

Results

Task	Dataset	Metric	Value	Model
Neural Architecture Search	CIFAR-10	Search Time (GPU days)	0.4	iDARTS +ME
Neural Architecture Search	ImageNet	Top-1 Error Rate	24.7	iDARTS (ImageNet)
Neural Architecture Search	ImageNet	Top-1 Error Rate	25.2	iDARTS (CIFAR-10)
AutoML	CIFAR-10	Search Time (GPU days)	0.4	iDARTS +ME
AutoML	ImageNet	Top-1 Error Rate	24.7	iDARTS (ImageNet)
AutoML	ImageNet	Top-1 Error Rate	25.2	iDARTS (CIFAR-10)
