FBNetV5: Neural Architecture Search for Multiple Tasks in One Run

Bichen Wu, Chaojian Li, Hang Zhang, Xiaoliang Dai, Peizhao Zhang, Matthew Yu, Jialiang Wang, Yingyan Celine Lin, Peter Vajda

2021-11-19 | Image Classification, Semantic Segmentation, Neural Architecture Search, Classification, Object Detection

Abstract

Neural Architecture Search (NAS) has been widely adopted to design accurate and efficient image classification models. However, applying NAS to a new computer vision task still requires a huge amount of effort. This is because 1) previous NAS research has been over-prioritized on image classification while largely ignoring other tasks; 2) many NAS works focus on optimizing task-specific components that cannot be favorably transferred to other tasks; and 3) existing NAS methods are typically designed to be "proxyless" and require significant effort to be integrated with each new task's training pipelines. To tackle these challenges, we propose FBNetV5, a NAS framework that can search for neural architectures for a variety of vision tasks with much reduced computational cost and human effort. Specifically, we design 1) a search space that is simple yet inclusive and transferable; 2) a multitask search process that is disentangled with target tasks' training pipeline; and 3) an algorithm to simultaneously search for architectures for multiple tasks with a computational cost agnostic to the number of tasks. We evaluate the proposed FBNetV5 targeting three fundamental vision tasks -- image classification, object detection, and semantic segmentation. Models searched by FBNetV5 in a single run of search have outperformed the previous state-of-the-art in all the three tasks: image classification (e.g., +1.3% ImageNet top-1 accuracy under the same FLOPs as compared to FBNetV3), semantic segmentation (e.g., +1.8% higher ADE20K val. mIoU than SegFormer with 3.6x fewer FLOPs), and object detection (e.g., +1.1% COCO val. mAP with 1.2x fewer FLOPs as compared to YOLOX).

Results

| Task | Dataset | Metric | Value | Model |
|---|---|---|---|---|
| Semantic Segmentation | ADE20K | Validation mIoU | 40.4 | FBNetV5 |
| Neural Architecture Search | ImageNet | Top-1 Error Rate | 18.2 | FBNetV5 |
| Neural Architecture Search | ImageNet | Accuracy | 81.7 | FBNetV5-A-CLS |
| Neural Architecture Search | ImageNet | Top-1 Error Rate | 18.3 | FBNetV5-A-CLS |
| Neural Architecture Search | ImageNet | Accuracy | 77.2 | FBNetV5-AR-CLS |
| Neural Architecture Search | ImageNet | Top-1 Error Rate | 22.8 | FBNetV5-AR-CLS |
| Image Classification | ImageNet | GFLOPs | 2.1 | FBNetV5-F-CLS |
| Image Classification | ImageNet | GFLOPs | 1 | FBNetV5-C-CLS |
| Image Classification | ImageNet | GFLOPs | 0.726 | FBNetV5 |
| Image Classification | ImageNet | GFLOPs | 0.685 | FBNetV5-A-CLS |
| Image Classification | ImageNet | GFLOPs | 0.28 | FBNetV5-AC-CLS |
| Image Classification | ImageNet | GFLOPs | 0.215 | FBNetV5-AR-CLS |
| AutoML | ImageNet | Top-1 Error Rate | 18.2 | FBNetV5 |
| AutoML | ImageNet | Accuracy | 81.7 | FBNetV5-A-CLS |
| AutoML | ImageNet | Top-1 Error Rate | 18.3 | FBNetV5-A-CLS |
| AutoML | ImageNet | Accuracy | 77.2 | FBNetV5-AR-CLS |
| AutoML | ImageNet | Top-1 Error Rate | 22.8 | FBNetV5-AR-CLS |
| 10-shot image generation | ADE20K | Validation mIoU | 40.4 | FBNetV5 |
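
The results above list both an Accuracy and a Top-1 Error Rate for some models. Assuming both are ImageNet top-1 percentages, the two should sum to 100. The sketch below checks that for the two models that have both values; the dictionary layout and the function names are illustrative, not part of the listing.

```python
# Values (percent) copied from the results rows above.
# Assumption: "Accuracy" means ImageNet top-1 accuracy.
reported = {
    "FBNetV5-A-CLS": {"accuracy": 81.7, "error": 18.3},
    "FBNetV5-AR-CLS": {"accuracy": 77.2, "error": 22.8},
}


def error_from_accuracy(accuracy_pct: float) -> float:
    """Top-1 error implied by a top-1 accuracy, rounded to one decimal."""
    return round(100.0 - accuracy_pct, 1)


def check_consistency(rows: dict) -> dict:
    """Map each model name to True if its reported error matches 100 - accuracy."""
    return {
        name: error_from_accuracy(values["accuracy"]) == values["error"]
        for name, values in rows.items()
    }


if __name__ == "__main__":
    for model, ok in check_consistency(reported).items():
        print(model, "consistent" if ok else "mismatch")
```

Under that assumption both pairs are consistent. The FBNetV5 row reports only an error rate (18.2), so it has no accuracy entry to compare against.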

Related Papers

2025-07-21: SeC: Advancing Complex Video Object Segmentation via Progressive Concept Construction
2025-07-18: Automatic Classification and Segmentation of Tunnel Cracks Based on Deep Learning and Visual Explanations
2025-07-17: Adversarial attacks to image classification systems using evolutionary algorithms
2025-07-17: Efficient Adaptation of Pre-trained Vision Transformer underpinned by Approximately Orthogonal Fine-Tuning Strategy
2025-07-17: Federated Learning for Commercial Image Sources
2025-07-17: MUPAX: Multidimensional Problem Agnostic eXplainable AI
2025-07-17: DiffOSeg: Omni Medical Image Segmentation via Multi-Expert Collaboration Diffusion Model
2025-07-17: SCORE: Scene Context Matters in Open-Vocabulary Remote Sensing Instance Segmentation