Generalized Latency Performance Estimation for Once-For-All Neural Architecture Search

Muhtadyuzzaman Syed, Arvind Akpuram Srinivasan

2021-01-04 | Neural Architecture Search | All

Abstract

Neural Architecture Search (NAS) has enabled automated machine learning by streamlining the manual development of deep neural network architectures through a defined search space, search strategy, and performance estimation strategy. To address the need for multi-platform deployment of Convolutional Neural Network (CNN) models, Once-For-All (OFA) proposed decoupling training and search to deliver a one-shot model whose sub-networks are constrained to various accuracy-latency tradeoffs. We find that the performance estimation strategy for OFA's search generalizes poorly across hardware deployment platforms, because it relies on single-hardware latency lookup tables that require a significant amount of time and manual effort to build beforehand. In this work, we demonstrate a framework for building latency predictors for neural network architectures that addresses the need for heterogeneous hardware support and removes the overhead of lookup tables altogether. We introduce two generalizability strategies: fine-tuning, which starts from a base model trained on a specific hardware platform and NAS search space, and GPU-generalization, which trains a model on GPU hardware parameters such as Number of Cores, RAM Size, and Memory Bandwidth. With this, we provide a family of latency prediction models that achieve over 50% lower RMSE loss compared with ProxylessNAS. We also show that using these latency predictors matches the NAS performance of the lookup table baseline and exceeds it in certain cases.
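To make the contrast in the abstract concrete, the following is a minimal sketch of the two estimation styles: a per-device lookup table that sums pre-measured layer latencies, and a predictor that takes architecture and GPU parameters as input features and is scored by RMSE. It is not the authors' implementation. The function names, the feature columns, and the use of ordinary least squares as the predictor are all assumptions made for illustration, and the data is synthetic.

```python
import numpy as np


def rmse(predicted, measured):
    """Root-mean-square error between predicted and measured latencies."""
    predicted = np.asarray(predicted, dtype=float)
    measured = np.asarray(measured, dtype=float)
    return float(np.sqrt(np.mean((predicted - measured) ** 2)))


def lookup_table_latency(layer_keys, table):
    """Baseline style: sum per-layer latencies measured on ONE device.

    `table` is a hypothetical dict mapping a layer configuration key to a
    latency measured beforehand on a single hardware platform.
    """
    return sum(table[key] for key in layer_keys)


def _with_bias(features):
    """Append a constant column so the linear model can learn an offset."""
    features = np.asarray(features, dtype=float)
    return np.hstack([features, np.ones((features.shape[0], 1))])


def fit_linear_predictor(features, latencies):
    """Least-squares fit of latency ~ features @ w + b.

    A linear model is an assumption used here as a stand-in for a learned
    latency predictor; the source does not specify the model family.
    """
    weights, *_ = np.linalg.lstsq(_with_bias(features), latencies, rcond=None)
    return weights


def predict_latency(weights, features):
    """Apply the fitted linear predictor to new feature rows."""
    return _with_bias(features) @ weights


# Synthetic demonstration data (NOT measurements from the paper).
# Assumed columns: [num_layers, width_multiplier,
#                   gpu_cores, ram_gb, memory_bandwidth_gbps]
rng = np.random.default_rng(0)
arch = rng.uniform([10, 0.5], [40, 1.5], size=(200, 2))
gpu = rng.uniform([512, 4, 100], [5120, 32, 900], size=(200, 3))
features = np.hstack([arch, gpu])
made_up_weights = np.array([0.8, 5.0, -0.002, -0.05, -0.01])
latencies = features @ made_up_weights + 20.0 + rng.normal(0, 0.5, size=200)

# Fit on the first 150 rows and report RMSE on the 50 held-out rows.
weights = fit_linear_predictor(features[:150], latencies[:150])
held_out_rmse = rmse(predict_latency(weights, features[150:]), latencies[150:])
print(f"held-out RMSE on synthetic data: {held_out_rmse:.3f}")
```

Because the GPU parameters are ordinary input columns, a predictor of this shape can in principle be queried for a device that has no lookup table, which is the motivation the abstract gives for GPU-generalization.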
