Papers With Code 2

A community resource for machine learning research: papers, code, benchmarks, and state-of-the-art results.


Data sourced from the PWC Archive (CC-BY-SA 4.0). Built by the community, for the community.

Graph Inductive Biases in Transformers without Message Passing

Liheng Ma, Chen Lin, Derek Lim, Adriana Romero-Soriano, Puneet K. Dokania, Mark Coates, Philip Torr, Ser-Nam Lim

2023-05-27 · Graph Regression · Graph Classification · Node Classification

Paper · PDF · Code (official)

Abstract

Transformers for graph data are increasingly widely studied and successful in numerous learning tasks. Graph inductive biases are crucial for Graph Transformers, and previous works incorporate them using message-passing modules and/or positional encodings. However, Graph Transformers that use message-passing inherit known issues of message-passing, and differ significantly from Transformers used in other domains, thus making transfer of research advances more difficult. On the other hand, Graph Transformers without message-passing often perform poorly on smaller datasets, where inductive biases are more crucial. To bridge this gap, we propose the Graph Inductive bias Transformer (GRIT) -- a new Graph Transformer that incorporates graph inductive biases without using message passing. GRIT is based on several architectural changes that are each theoretically and empirically justified, including: learned relative positional encodings initialized with random walk probabilities, a flexible attention mechanism that updates node and node-pair representations, and injection of degree information in each layer. We prove that GRIT is expressive -- it can express shortest path distances and various graph propagation matrices. GRIT achieves state-of-the-art empirical performance across a variety of graph datasets, thus showing the power that Graph Transformers without message-passing can deliver.
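The relative positional encodings mentioned above are initialized from random walk probabilities: stacking the powers of the row-normalized adjacency matrix gives, for each node pair (i, j), the probability that a k-step random walk from i lands on j. Below is a minimal sketch of that initialization using NumPy; the function name `random_walk_pe` and the `num_steps` parameter are illustrative, not the paper's API, and the learned projection that GRIT applies on top is omitted.

```python
import numpy as np

def random_walk_pe(adj, num_steps=8):
    """Stack random-walk probability matrices M^0 .. M^(K-1).

    adj: (n, n) adjacency matrix.
    Returns an (n, n, num_steps) tensor whose entry [i, j, k] is the
    probability that a k-step random walk starting at node i ends at j.
    """
    deg = adj.sum(axis=1, keepdims=True)
    M = adj / np.maximum(deg, 1)           # row-normalized transition matrix
    powers = [np.eye(adj.shape[0])]        # M^0 = identity
    for _ in range(num_steps - 1):
        powers.append(powers[-1] @ M)      # M^k = M^(k-1) @ M
    return np.stack(powers, axis=-1)

# Toy example: a path graph 0 - 1 - 2
A = np.array([[0., 1., 0.],
              [1., 0., 1.],
              [0., 1., 0.]])
P = random_walk_pe(A, num_steps=3)
```

On this toy graph, `P[0, 1, 1]` is 1.0 (from the endpoint, one step always reaches the center), while `P[1, 0, 1]` is 0.5. Each node pair's vector of walk probabilities then serves as the initial relative positional encoding between those nodes.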

Results

Task                 | Dataset      | Metric         | Value  | Model
---------------------|--------------|----------------|--------|------
Graph Regression     | ZINC-full    | Test MAE       | 0.023  | GRIT
Graph Regression     | ZINC         | MAE            | 0.059  | GRIT
Graph Regression     | PCQM4Mv2-LSC | Validation MAE | 0.0859 | GRIT
Graph Regression     | ZINC-500k    | MAE            | 0.059  | GRIT
Graph Classification | MNIST        | Accuracy (%)   | 98.108 | GRIT
Graph Classification | CIFAR10 100k | Accuracy (%)   | 76.468 | GRIT
Node Classification  | PATTERN      | Accuracy (%)   | 87.196 | GRIT
Node Classification  | CLUSTER      | Accuracy (%)   | 80.026 | GRIT

Related Papers

- Demystifying Distributed Training of Graph Neural Networks for Link Prediction (2025-06-25)
- Equivariance Everywhere All At Once: A Recipe for Graph Foundation Models (2025-06-17)
- Density-aware Walks for Coordinated Campaign Detection (2025-06-16)
- Delving into Instance-Dependent Label Noise in Graph Data: A Comprehensive Study and Benchmark (2025-06-14)
- Graph Semi-Supervised Learning for Point Classification on Data Manifolds (2025-06-13)
- Devil's Hand: Data Poisoning Attacks to Locally Private Graph Learning Protocols (2025-06-11)
- Wasserstein Hypergraph Neural Network (2025-06-11)
- Positional Encoding meets Persistent Homology on Graphs (2025-06-06)