Papers With Code 2

A community resource for machine learning research: papers, code, benchmarks, and state-of-the-art results.

Data sourced from the PWC Archive (CC-BY-SA 4.0). Built by the community, for the community.

AttentiveNAS: Improving Neural Architecture Search via Attentive Sampling

Dilin Wang, Meng Li, Chengyue Gong, Vikas Chandra

2020-11-18 · CVPR 2021 · Neural Architecture Search
Paper · PDF · Code (official)

Abstract

Neural architecture search (NAS) has shown great promise in designing state-of-the-art (SOTA) models that are both accurate and efficient. Recently, two-stage NAS, e.g. BigNAS, decouples the model training and searching process and achieves remarkable search efficiency and accuracy. Two-stage NAS requires sampling from the search space during training, which directly impacts the accuracy of the final searched models. While uniform sampling has been widely used for its simplicity, it is agnostic of the model performance Pareto front, which is the main focus in the search process, and thus, misses opportunities to further improve the model accuracy. In this work, we propose AttentiveNAS that focuses on improving the sampling strategy to achieve better performance Pareto. We also propose algorithms to efficiently and effectively identify the networks on the Pareto during training. Without extra re-training or post-processing, we can simultaneously obtain a large number of networks across a wide range of FLOPs. Our discovered model family, AttentiveNAS models, achieves top-1 accuracy from 77.3% to 80.7% on ImageNet, and outperforms SOTA models, including BigNAS and Once-for-All networks. We also achieve ImageNet accuracy of 80.1% with only 491 MFLOPs. Our training code and pretrained models are available at https://github.com/facebookresearch/AttentiveNAS.
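The abstract's core idea, sampling sub-networks with attention to the performance Pareto front rather than uniformly, can be illustrated with a minimal sketch. This is not the paper's implementation (see the linked repository for that); it assumes a hypothetical `predictor` that scores candidate configurations, standing in for the accuracy estimators the paper uses to identify Pareto-best or Pareto-worst networks during supernet training.

```python
import random

def attentive_sample(search_space, predictor, k=5, mode="best"):
    """Pareto-aware candidate selection: draw k random sub-network
    configs and keep the one the predictor ranks best (or worst).
    `search_space` and `predictor` are hypothetical stand-ins."""
    candidates = random.sample(search_space, k)
    pick = max if mode == "best" else min
    return pick(candidates, key=predictor)

# Toy example: configs scored by a stand-in "predictor" that
# favors wider networks.
space = [{"width": w} for w in range(8, 65, 8)]
best = attentive_sample(space, lambda c: c["width"], k=3, mode="best")
```

Training the supernet on predicted-best candidates pushes capacity toward the Pareto front; training on predicted-worst ones instead strengthens the weakest sub-networks, two strategies the paper contrasts.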

Results

Task | Dataset | Metric | Value | Model
Neural Architecture Search | ImageNet | Accuracy | 80.1 | AttentiveNAS-A5
Neural Architecture Search | ImageNet | Top-1 Error Rate | 19.9 | AttentiveNAS-A5
Neural Architecture Search | ImageNet | Accuracy | 79.8 | AttentiveNAS-A4
Neural Architecture Search | ImageNet | Top-1 Error Rate | 20.2 | AttentiveNAS-A4
Neural Architecture Search | ImageNet | Accuracy | 79.1 | AttentiveNAS-A3
Neural Architecture Search | ImageNet | Top-1 Error Rate | 20.9 | AttentiveNAS-A3
Neural Architecture Search | ImageNet | Accuracy | 78.8 | AttentiveNAS-A2
Neural Architecture Search | ImageNet | Top-1 Error Rate | 21.2 | AttentiveNAS-A2
Neural Architecture Search | ImageNet | Accuracy | 78.4 | AttentiveNAS-A1
Neural Architecture Search | ImageNet | Top-1 Error Rate | 21.6 | AttentiveNAS-A1
Neural Architecture Search | ImageNet | Accuracy | 77.3 | AttentiveNAS-A0
Neural Architecture Search | ImageNet | Top-1 Error Rate | 22.7 | AttentiveNAS-A0
AutoML | ImageNet | Accuracy | 80.1 | AttentiveNAS-A5
AutoML | ImageNet | Top-1 Error Rate | 19.9 | AttentiveNAS-A5
AutoML | ImageNet | Accuracy | 79.8 | AttentiveNAS-A4
AutoML | ImageNet | Top-1 Error Rate | 20.2 | AttentiveNAS-A4
AutoML | ImageNet | Accuracy | 79.1 | AttentiveNAS-A3
AutoML | ImageNet | Top-1 Error Rate | 20.9 | AttentiveNAS-A3
AutoML | ImageNet | Accuracy | 78.8 | AttentiveNAS-A2
AutoML | ImageNet | Top-1 Error Rate | 21.2 | AttentiveNAS-A2
AutoML | ImageNet | Accuracy | 78.4 | AttentiveNAS-A1
AutoML | ImageNet | Top-1 Error Rate | 21.6 | AttentiveNAS-A1
AutoML | ImageNet | Accuracy | 77.3 | AttentiveNAS-A0
AutoML | ImageNet | Top-1 Error Rate | 22.7 | AttentiveNAS-A0
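Each model's Top-1 Error Rate is simply the complement of its Accuracy (100 − accuracy), so the two metric rows per model are redundant but should agree. A quick consistency check over the table values:

```python
# (accuracy %, top-1 error %) pairs taken from the results table above
results = {
    "AttentiveNAS-A0": (77.3, 22.7),
    "AttentiveNAS-A1": (78.4, 21.6),
    "AttentiveNAS-A2": (78.8, 21.2),
    "AttentiveNAS-A3": (79.1, 20.9),
    "AttentiveNAS-A4": (79.8, 20.2),
    "AttentiveNAS-A5": (80.1, 19.9),
}

# accuracy and error for each model must sum to 100 (within rounding)
for model, (acc, err) in results.items():
    assert round(acc + err, 1) == 100.0
```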

Related Papers

DASViT: Differentiable Architecture Search for Vision Transformer (2025-07-17)
AnalogNAS-Bench: A NAS Benchmark for Analog In-Memory Computing (2025-06-23)
From Tiny Machine Learning to Tiny Deep Learning: A Survey (2025-06-21)
One-Shot Neural Architecture Search with Network Similarity Directed Initialization for Pathological Image Classification (2025-06-17)
DDS-NAS: Dynamic Data Selection within Neural Architecture Search via On-line Hard Example Mining applied to Image Classification (2025-06-17)
MARCO: Hardware-Aware Neural Architecture Search for Edge Devices with Multi-Agent Reinforcement Learning and Conformal Prediction Filtering (2025-06-16)
Finding Optimal Kernel Size and Dimension in Convolutional Neural Networks: An Architecture Optimization Approach (2025-06-16)
Directed Acyclic Graph Convolutional Networks (2025-06-13)