
Data sourced from the PWC Archive (CC-BY-SA 4.0). Built by the community, for the community.

Parameter Prediction for Unseen Deep Architectures

Boris Knyazev, Michal Drozdzal, Graham W. Taylor, Adriana Romero-Soriano

2021-10-25 | NeurIPS 2021 12 | Parameter Prediction, Prediction

Abstract

Deep learning has been successful in automating the design of features in machine learning pipelines. However, the algorithms optimizing neural network parameters remain largely hand-designed and computationally inefficient. We study if we can use deep learning to directly predict these parameters by exploiting the past knowledge of training other networks. We introduce a large-scale dataset of diverse computational graphs of neural architectures - DeepNets-1M - and use it to explore parameter prediction on CIFAR-10 and ImageNet. By leveraging advances in graph neural networks, we propose a hypernetwork that can predict performant parameters in a single forward pass taking a fraction of a second, even on a CPU. The proposed model achieves surprisingly good performance on unseen and diverse networks. For example, it is able to predict all 24 million parameters of a ResNet-50 achieving a 60% accuracy on CIFAR-10. On ImageNet, top-5 accuracy of some of our networks approaches 50%. Our task along with the model and results can potentially lead to a new, more computationally efficient paradigm of training networks. Our model also learns a strong representation of neural architectures enabling their analysis.
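To make the idea of "predicting parameters in a single forward pass" concrete, here is a minimal toy sketch. It is not the paper's GHN-2 model: there is no graph neural network, nothing is trained, and the hypernetwork is a single random linear decoder. The names `ToyHyperNet`, `embed`, `predict`, and `forward`, the hand-made layer descriptor, and the fixed maximum shape are all assumptions made for illustration. The only point it shows is that one shared module can emit weights for several differently shaped target networks without running an optimizer on any of them.

```python
import math
import random


class ToyHyperNet:
    """Toy hypernetwork: one shared decoder emits weights for any small MLP.

    Hypothetical illustration only; not the GHN-2 architecture.
    """

    MAX_OUT, MAX_IN = 8, 8  # largest weight matrix the decoder can emit (assumed)
    EMBED_DIM = 4           # size of the hand-made layer descriptor (assumed)

    def __init__(self, seed=0):
        rng = random.Random(seed)
        # Shared linear decoder: EMBED_DIM -> MAX_OUT * MAX_IN values.
        # It is random here; in a real system it would be learned.
        self.decoder = [
            [rng.gauss(0.0, 0.1) for _ in range(self.EMBED_DIM)]
            for _ in range(self.MAX_OUT * self.MAX_IN)
        ]

    def embed(self, index, fan_in, fan_out):
        # Hand-made descriptor of one layer. A model that reads the
        # architecture's computational graph would replace this step.
        return [math.sin(index + 1), math.log(fan_in), math.log(fan_out), 1.0]

    def predict(self, layer_sizes):
        """Return one weight matrix per layer of an MLP with the given sizes."""
        weights = []
        for i, (fan_in, fan_out) in enumerate(zip(layer_sizes[:-1], layer_sizes[1:])):
            if fan_in > self.MAX_IN or fan_out > self.MAX_OUT:
                raise ValueError("layer is larger than the decoder's maximum shape")
            h = self.embed(i, fan_in, fan_out)
            flat = [sum(w * x for w, x in zip(row, h)) for row in self.decoder]
            full = [flat[r * self.MAX_IN:(r + 1) * self.MAX_IN] for r in range(self.MAX_OUT)]
            # Slice the maximum-shape tensor down to the shape this layer needs.
            weights.append([row[:fan_in] for row in full[:fan_out]])
        return weights


def forward(weights, x):
    """Run the target MLP (ReLU between layers) using the predicted weights."""
    for k, w in enumerate(weights):
        x = [sum(wi * xi for wi, xi in zip(row, x)) for row in w]
        if k < len(weights) - 1:
            x = [max(0.0, v) for v in x]
    return x


hypernet = ToyHyperNet(seed=0)
for sizes in ([3, 5, 2], [4, 8, 8, 1]):  # two different target architectures
    predicted = hypernet.predict(sizes)
    print(sizes, [(len(m), len(m[0])) for m in predicted])
```

Because the decoder is untrained, the predicted weights carry no useful signal; the sketch only shows the data flow from an architecture description to a full set of parameters.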

Results

Task                  Dataset  Metric                               Value  Model
Parameter Prediction  CIFAR10  Classification Accuracy (BN-free)    36.8   GHN-2
Parameter Prediction  CIFAR10  Classification Accuracy (Deep)       60.5   GHN-2
Parameter Prediction  CIFAR10  Classification Accuracy (Dense)      65.8   GHN-2
Parameter Prediction  CIFAR10  Classification Accuracy (ID-test)    66.9   GHN-2
Parameter Prediction  CIFAR10  Classification Accuracy (ResNet-50)  58.6   GHN-2
Parameter Prediction  CIFAR10  Classification Accuracy (ViT)        11.4   GHN-2
Parameter Prediction  CIFAR10  Classification Accuracy (Wide)       64     GHN-2

Related Papers

Multi-Strategy Improved Snake Optimizer Accelerated CNN-LSTM-Attention-Adaboost for Trajectory Prediction (2025-07-21)
Generative Click-through Rate Prediction with Applications to Search Advertising (2025-07-15)
Conformation-Aware Structure Prediction of Antigen-Recognizing Immune Proteins (2025-07-11)
Foundation models for time series forecasting: Application in conformal prediction (2025-07-09)
Predicting Graph Structure via Adapted Flux Balance Analysis (2025-07-08)
Speech Quality Assessment Model Based on Mixture of Experts: System-Level Performance Enhancement and Utterance-Level Challenge Analysis (2025-07-08)
A Wireless Foundation Model for Multi-Task Prediction (2025-07-08)
High Order Collaboration-Oriented Federated Graph Neural Network for Accurate QoS Prediction (2025-07-07)