Papers With Code 2

A community resource for machine learning research: papers, code, benchmarks, and state-of-the-art results.


Data sourced from the PWC Archive (CC-BY-SA 4.0). Built by the community, for the community.


Training Graph Neural Networks with 1000 Layers

Guohao Li, Matthias Müller, Bernard Ghanem, Vladlen Koltun

2021-06-14 · Graph Sampling · Node Property Prediction

Abstract

Deep graph neural networks (GNNs) have achieved excellent results on various tasks on increasingly large graph datasets with millions of nodes and edges. However, memory complexity has become a major obstacle when training deep GNNs for practical applications due to the immense number of nodes, edges, and intermediate activations. To improve the scalability of GNNs, prior works propose smart graph sampling or partitioning strategies to train GNNs with a smaller set of nodes or sub-graphs. In this work, we study reversible connections, group convolutions, weight tying, and equilibrium models to advance the memory and parameter efficiency of GNNs. We find that reversible connections in combination with deep network architectures enable the training of overparameterized GNNs that significantly outperform existing methods on multiple datasets. Our models RevGNN-Deep (1001 layers with 80 channels each) and RevGNN-Wide (448 layers with 224 channels each) were both trained on a single commodity GPU and achieve an ROC-AUC of $87.74 \pm 0.13$ and $88.24 \pm 0.15$ on the ogbn-proteins dataset. To the best of our knowledge, RevGNN-Deep is the deepest GNN in the literature by one order of magnitude. Please visit our project website https://www.deepgcns.org/arch/gnn1000 for more information.
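The memory savings described in the abstract come from reversible connections: the input to each block can be reconstructed exactly from its output, so intermediate activations need not be stored during training. The following is a minimal sketch of that idea in NumPy, not the authors' implementation; the channel groups are split in two, and `F` and `G` are toy stand-ins for the paper's GNN message-passing functions.

```python
import numpy as np

def reversible_forward(x1, x2, F, G):
    """Forward pass of a reversible block: y1 = x1 + F(x2), y2 = x2 + G(y1)."""
    y1 = x1 + F(x2)
    y2 = x2 + G(y1)
    return y1, y2

def reversible_inverse(y1, y2, F, G):
    """Reconstruct the block's inputs from its outputs alone,
    so no intermediate activations need to be kept in memory."""
    x2 = y2 - G(y1)
    x1 = y1 - F(x2)
    return x1, x2

if __name__ == "__main__":
    rng = np.random.default_rng(0)
    # Node features split into two channel groups (8 nodes, 16 channels each).
    x1 = rng.normal(size=(8, 16))
    x2 = rng.normal(size=(8, 16))
    # Toy stand-ins for the learned message-passing functions.
    F = lambda h: np.tanh(h)
    G = lambda h: 0.5 * h
    y1, y2 = reversible_forward(x1, x2, F, G)
    r1, r2 = reversible_inverse(y1, y2, F, G)
    assert np.allclose(r1, x1) and np.allclose(r2, x2)
```

Because the inverse is exact regardless of depth, activation memory stays constant in the number of layers, which is what makes 1000-layer models trainable on a single commodity GPU.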

Results

| Task | Dataset | Metric | Value | Model |
| --- | --- | --- | --- | --- |
| Node Property Prediction | ogbn-arxiv | Number of params | 2,098,256 | RevGAT+N.Adj+LabelReuse+SelfKD |
| Node Property Prediction | ogbn-arxiv | Number of params | 2,098,256 | RevGAT+NormAdj+LabelReuse |
| Node Property Prediction | ogbn-products | Number of params | 2,945,007 | RevGNN-112 |
| Node Property Prediction | ogbn-proteins | Number of params | 68,471,608 | RevGNN-Wide |
| Node Property Prediction | ogbn-proteins | Number of params | 20,031,384 | RevGNN-Deep |

Related Papers

Equivariance Everywhere All At Once: A Recipe for Graph Foundation Models (2025-06-17)
Keyed Chaotic Dynamics for Privacy-Preserving Neural Inference (2025-05-29)
Simple yet Effective Graph Distillation via Clustering (2025-05-27)
Beyond Self-Repellent Kernels: History-Driven Target Towards Efficient Nonlinear MCMC on General Graphs (2025-05-23)
The Limits of Graph Samplers for Training Inductive Recommender Systems: Extended results (2025-05-20)
Graph Learning at Scale: Characterizing and Optimizing Pre-Propagation GNNs (2025-04-17)
Distributed Graph Neural Network Inference With Just-In-Time Compilation For Industry-Scale Graphs (2025-03-08)
Hierarchical graph sampling based minibatch learning with chain preservation and variance reduction (2025-03-02)