Papers With Code 2

A community resource for machine learning research: papers, code, benchmarks, and state-of-the-art results.


Data sourced from the PWC Archive (CC-BY-SA 4.0). Built by the community, for the community.


AlphaNet: Improved Training of Supernets with Alpha-Divergence

Dilin Wang, Chengyue Gong, Meng Li, Qiang Liu, Vikas Chandra

Published: 2021-02-16 · Tasks: Image Classification, Neural Architecture Search
Links: Paper · PDF · Code (official) · Code

Abstract

Weight-sharing neural architecture search (NAS) is an effective technique for automating efficient neural architecture design. Weight-sharing NAS builds a supernet that assembles all the architectures as its sub-networks and jointly trains the supernet with the sub-networks. The success of weight-sharing NAS heavily relies on distilling the knowledge of the supernet to the sub-networks. However, we find that the widely used distillation divergence, i.e., KL divergence, may lead to student sub-networks that over-estimate or under-estimate the uncertainty of the teacher supernet, leading to inferior performance of the sub-networks. In this work, we propose to improve the supernet training with a more generalized alpha-divergence. By adaptively selecting the alpha-divergence, we simultaneously prevent the over-estimation or under-estimation of the uncertainty of the teacher model. We apply the proposed alpha-divergence based supernets training to both slimmable neural networks and weight-sharing NAS, and demonstrate significant improvements. Specifically, our discovered model family, AlphaNet, outperforms prior-art models on a wide range of FLOPs regimes, including BigNAS, Once-for-All networks, and AttentiveNAS. We achieve ImageNet top-1 accuracy of 80.0% with only 444M FLOPs. Our code and pretrained models are available at https://github.com/facebookresearch/AlphaNet.
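The adaptive alpha-divergence idea from the abstract can be sketched numerically. The block below is a minimal NumPy illustration, not the paper's implementation (see the official repository for that): it computes the generalized alpha-divergence, which recovers KL divergence as alpha approaches 1, and then takes the larger of a negative-alpha and a positive-alpha term so the student is penalized for both over- and under-estimating the teacher's uncertainty. The function names and the simple max-based selection rule are assumptions made for illustration.

```python
import numpy as np

def alpha_divergence(p, q, alpha):
    """Generalized alpha-divergence D_alpha(p || q) between two
    discrete distributions. Reduces to KL(p || q) as alpha -> 1
    and to reverse KL(q || p) as alpha -> 0."""
    p = np.asarray(p, dtype=float)
    q = np.asarray(q, dtype=float)
    if np.isclose(alpha, 1.0):
        return float(np.sum(p * np.log(p / q)))
    if np.isclose(alpha, 0.0):
        return float(np.sum(q * np.log(q / p)))
    return float((np.sum(p**alpha * q**(1.0 - alpha)) - 1.0)
                 / (alpha * (alpha - 1.0)))

def adaptive_alpha_loss(teacher, student, alpha_neg=-1.0, alpha_pos=1.0):
    """Illustrative adaptive scheme: evaluate the divergence at a
    negative and a positive alpha and keep the larger value, so the
    student cannot cheaply over- or under-estimate the teacher's
    uncertainty. A simplification of the paper's clipped scheme."""
    return max(alpha_divergence(teacher, student, alpha_neg),
               alpha_divergence(teacher, student, alpha_pos))
```

In supernet training, `teacher` would be the softmax output of the full supernet and `student` the output of a sampled sub-network, with this loss replacing the usual KL distillation term.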

Results

| Task | Dataset | Metric | Value | Model |
| Neural Architecture Search | ImageNet | Accuracy | 80.8 | AlphaNet-A6 |
| Neural Architecture Search | ImageNet | Top-1 Error Rate | 19.2 | AlphaNet-A6 |
| Neural Architecture Search | ImageNet | Accuracy | 80.6 | AlphaNet-A5 (base) |
| Neural Architecture Search | ImageNet | Top-1 Error Rate | 19.4 | AlphaNet-A5 (base) |
| Neural Architecture Search | ImageNet | Accuracy | 80.3 | AlphaNet-A5 (small) |
| Neural Architecture Search | ImageNet | Top-1 Error Rate | 19.7 | AlphaNet-A5 (small) |
| Neural Architecture Search | ImageNet | Accuracy | 80.0 | AlphaNet-A4 |
| Neural Architecture Search | ImageNet | Top-1 Error Rate | 20.0 | AlphaNet-A4 |
| Neural Architecture Search | ImageNet | Accuracy | 79.4 | AlphaNet-A3 |
| Neural Architecture Search | ImageNet | Top-1 Error Rate | 20.6 | AlphaNet-A3 |
| Neural Architecture Search | ImageNet | Accuracy | 79.2 | AlphaNet-A2 |
| Neural Architecture Search | ImageNet | Top-1 Error Rate | 20.8 | AlphaNet-A2 |
| Neural Architecture Search | ImageNet | Accuracy | 79.0 | AlphaNet-A1 |
| Neural Architecture Search | ImageNet | Top-1 Error Rate | 21.0 | AlphaNet-A1 |
| Neural Architecture Search | ImageNet | Accuracy | 77.9 | AlphaNet-A0 |
| Neural Architecture Search | ImageNet | Top-1 Error Rate | 22.1 | AlphaNet-A0 |
| Image Classification | ImageNet | GFLOPs | 0.709 | AlphaNet-A6 |
| Image Classification | ImageNet | GFLOPs | 0.491 | AlphaNet-A5 |
| Image Classification | ImageNet | GFLOPs | 0.444 | AlphaNet-A4 |
| Image Classification | ImageNet | GFLOPs | 0.357 | AlphaNet-A3 |
| Image Classification | ImageNet | GFLOPs | 0.317 | AlphaNet-A2 |
| Image Classification | ImageNet | GFLOPs | 0.279 | AlphaNet-A1 |
| Image Classification | ImageNet | GFLOPs | 0.203 | AlphaNet-A0 |

The AutoML task lists the same Accuracy and Top-1 Error Rate results as the Neural Architecture Search rows above, for the same models.

Related Papers

- Automatic Classification and Segmentation of Tunnel Cracks Based on Deep Learning and Visual Explanations (2025-07-18)
- Adversarial attacks to image classification systems using evolutionary algorithms (2025-07-17)
- Efficient Adaptation of Pre-trained Vision Transformer underpinned by Approximately Orthogonal Fine-Tuning Strategy (2025-07-17)
- Federated Learning for Commercial Image Sources (2025-07-17)
- MUPAX: Multidimensional Problem Agnostic eXplainable AI (2025-07-17)
- DASViT: Differentiable Architecture Search for Vision Transformer (2025-07-17)
- Hashed Watermark as a Filter: Defeating Forging and Overwriting Attacks in Weight-based Neural Network Watermarking (2025-07-15)
- Transferring Styles for Reduced Texture Bias and Improved Robustness in Semantic Segmentation Networks (2025-07-14)