Papers With Code 2

A community resource for machine learning research: papers, code, benchmarks, and state-of-the-art results.


Data sourced from the PWC Archive (CC-BY-SA 4.0). Built by the community, for the community.


BossNAS: Exploring Hybrid CNN-transformers with Block-wisely Self-supervised Neural Architecture Search

Changlin Li, Tao Tang, Guangrun Wang, Jiefeng Peng, Bing Wang, Xiaodan Liang, Xiaojun Chang

Published 2021-03-23 · ICCV 2021
Tasks: Image Classification · Open-Ended Question Answering · Neural Architecture Search
Links: Paper · PDF · Code (official)

Abstract

A myriad of recent breakthroughs in hand-crafted neural architectures for visual recognition have highlighted the urgent need to explore hybrid architectures consisting of diversified building blocks. Meanwhile, neural architecture search methods are surging with an expectation to reduce human efforts. However, whether NAS methods can efficiently and effectively handle diversified search spaces with disparate candidates (e.g. CNNs and transformers) is still an open question. In this work, we present Block-wisely Self-supervised Neural Architecture Search (BossNAS), an unsupervised NAS method that addresses the problem of inaccurate architecture rating caused by large weight-sharing space and biased supervision in previous methods. More specifically, we factorize the search space into blocks and utilize a novel self-supervised training scheme, named ensemble bootstrapping, to train each block separately before searching them as a whole towards the population center. Additionally, we present HyTra search space, a fabric-like hybrid CNN-transformer search space with searchable down-sampling positions. On this challenging search space, our searched model, BossNet-T, achieves up to 82.5% accuracy on ImageNet, surpassing EfficientNet by 2.4% with comparable compute time. Moreover, our method achieves superior architecture rating accuracy with 0.78 and 0.76 Spearman correlation on the canonical MBConv search space with ImageNet and on NATS-Bench size search space with CIFAR-100, respectively, surpassing state-of-the-art NAS methods. Code: https://github.com/changlin31/BossNAS
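The abstract measures architecture rating accuracy as the Spearman correlation between a NAS method's predicted architecture scores and the architectures' ground-truth accuracies. As a minimal illustrative sketch (not the paper's code, and with made-up numbers), Spearman's rho can be computed by ranking both lists and applying the no-ties formula:

```python
def rank(values):
    """1-based rank of each value in sorted order (assumes no ties)."""
    order = sorted(range(len(values)), key=lambda i: values[i])
    ranks = [0] * len(values)
    for r, i in enumerate(order, start=1):
        ranks[i] = r
    return ranks

def spearman_rho(xs, ys):
    """Spearman's rho via the no-ties formula: 1 - 6*sum(d^2) / (n*(n^2-1))."""
    rx, ry = rank(xs), rank(ys)
    n = len(xs)
    d2 = sum((a - b) ** 2 for a, b in zip(rx, ry))
    return 1 - 6 * d2 / (n * (n * n - 1))

# Hypothetical predicted scores for six candidate architectures and their
# ground-truth accuracies (illustrative numbers only, not from the paper).
predicted = [0.61, 0.55, 0.72, 0.48, 0.66, 0.59]
ground_truth = [71.2, 70.4, 73.5, 68.1, 69.8, 72.0]

print(round(spearman_rho(predicted, ground_truth), 2))  # → 0.6
```

A rho of 1.0 would mean the search method ranks architectures exactly as their true accuracies do; the paper's reported 0.76–0.78 indicates a strong but imperfect ranking.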

Results

| Task | Dataset | Metric | Value | Model |
|---|---|---|---|---|
| Neural Architecture Search | NATS-Bench Size, CIFAR-10 | Acc. (test) | 93.29 | BossNAS |
| Neural Architecture Search | NATS-Bench Size, CIFAR-10 | Kendall's Tau | 0.53 | BossNAS |
| Neural Architecture Search | NATS-Bench Size, CIFAR-10 | Pearson R | 0.72 | BossNAS |
| Neural Architecture Search | NATS-Bench Size, CIFAR-10 | Spearman's Rho | 0.73 | BossNAS |
| Neural Architecture Search | NATS-Bench Size, CIFAR-100 | Acc. (test) | 70.86 | BossNAS |
| Neural Architecture Search | NATS-Bench Size, CIFAR-100 | Kendall's Tau | 0.59 | BossNAS |
| Neural Architecture Search | NATS-Bench Size, CIFAR-100 | Pearson R | 0.79 | BossNAS |
| Neural Architecture Search | NATS-Bench Size, CIFAR-100 | Spearman's Rho | 0.76 | BossNAS |
| Neural Architecture Search | ImageNet | Accuracy | 82.2 | BossNet-T1+ |
| Neural Architecture Search | ImageNet | Top-1 Error Rate | 17.8 | BossNet-T1+ |
| Image Classification | ImageNet | GFLOPs | 15.8 | BossNet-T1 |
| AutoML | NATS-Bench Size, CIFAR-10 | Acc. (test) | 93.29 | BossNAS |
| AutoML | NATS-Bench Size, CIFAR-10 | Kendall's Tau | 0.53 | BossNAS |
| AutoML | NATS-Bench Size, CIFAR-10 | Pearson R | 0.72 | BossNAS |
| AutoML | NATS-Bench Size, CIFAR-10 | Spearman's Rho | 0.73 | BossNAS |
| AutoML | NATS-Bench Size, CIFAR-100 | Acc. (test) | 70.86 | BossNAS |
| AutoML | NATS-Bench Size, CIFAR-100 | Kendall's Tau | 0.59 | BossNAS |
| AutoML | NATS-Bench Size, CIFAR-100 | Pearson R | 0.79 | BossNAS |
| AutoML | NATS-Bench Size, CIFAR-100 | Spearman's Rho | 0.76 | BossNAS |
| AutoML | ImageNet | Accuracy | 82.2 | BossNet-T1+ |
| AutoML | ImageNet | Top-1 Error Rate | 17.8 | BossNet-T1+ |
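Alongside Spearman's Rho, the table reports Kendall's Tau, which scores a ranking by counting concordant versus discordant pairs rather than rank differences. A minimal pure-Python sketch of the tau-a variant (no ties assumed; the input numbers are hypothetical, not from the paper):

```python
def kendall_tau(xs, ys):
    """Kendall's tau-a: (concordant - discordant) / total pairs, no ties assumed."""
    n = len(xs)
    concordant = discordant = 0
    for i in range(n):
        for j in range(i + 1, n):
            # A pair is concordant if both lists order items i and j the same way.
            s = (xs[i] - xs[j]) * (ys[i] - ys[j])
            if s > 0:
                concordant += 1
            elif s < 0:
                discordant += 1
    return (concordant - discordant) / (n * (n - 1) / 2)

# Hypothetical predicted scores vs. ground-truth accuracies for four
# candidate architectures (illustrative numbers only).
pred = [0.52, 0.47, 0.68, 0.61]
true_acc = [70.1, 71.3, 74.0, 72.5]

print(round(kendall_tau(pred, true_acc), 2))  # → 0.67
```

Tau is typically somewhat lower than Spearman's Rho on the same data (as in the table: 0.59 vs. 0.76 on CIFAR-100), since each discordant pair is penalized equally regardless of how far apart the ranks are.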

Related Papers

- Automatic Classification and Segmentation of Tunnel Cracks Based on Deep Learning and Visual Explanations (2025-07-18)
- Adversarial attacks to image classification systems using evolutionary algorithms (2025-07-17)
- Efficient Adaptation of Pre-trained Vision Transformer underpinned by Approximately Orthogonal Fine-Tuning Strategy (2025-07-17)
- Federated Learning for Commercial Image Sources (2025-07-17)
- MUPAX: Multidimensional Problem Agnostic eXplainable AI (2025-07-17)
- DASViT: Differentiable Architecture Search for Vision Transformer (2025-07-17)
- Hashed Watermark as a Filter: Defeating Forging and Overwriting Attacks in Weight-based Neural Network Watermarking (2025-07-15)
- Transferring Styles for Reduced Texture Bias and Improved Robustness in Semantic Segmentation Networks (2025-07-14)