Papers With Code 2

A community resource for machine learning research: papers, code, benchmarks, and state-of-the-art results.


Data sourced from the PWC Archive (CC-BY-SA 4.0). Built by the community, for the community.


Global Self-Attention as a Replacement for Graph Convolution

Md Shamim Hussain, Mohammed J. Zaki, Dharmashankar Subramanian

2021-08-07 · Transfer Learning · Graph Regression · Graph Classification · Graph Learning · Node Classification · Edge Classification · Graph Property Prediction · Link Prediction

Paper · PDF · Code (official)

Abstract

We propose an extension to the transformer neural network architecture for general-purpose graph learning by adding a dedicated pathway for pairwise structural information, called edge channels. The resultant framework - which we call Edge-augmented Graph Transformer (EGT) - can directly accept, process and output structural information of arbitrary form, which is important for effective learning on graph-structured data. Our model exclusively uses global self-attention as an aggregation mechanism rather than static localized convolutional aggregation. This allows for unconstrained long-range dynamic interactions between nodes. Moreover, the edge channels allow the structural information to evolve from layer to layer, and prediction tasks on edges/links can be performed directly from the output embeddings of these channels. We verify the performance of EGT in a wide range of graph-learning experiments on benchmark datasets, in which it outperforms Convolutional/Message-Passing Graph Neural Networks. EGT sets a new state-of-the-art for the quantum-chemical regression task on the OGB-LSC PCQM4Mv2 dataset containing 3.8 million molecular graphs. Our findings indicate that global self-attention based aggregation can serve as a flexible, adaptive and effective replacement of graph convolution for general-purpose graph learning. Therefore, convolutional local neighborhood aggregation is not an essential inductive bias.
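To make the abstract's two ideas concrete — global self-attention as the sole aggregation mechanism, plus dedicated edge channels that bias attention and are themselves updated layer by layer — here is a minimal single-head sketch. This is an illustrative simplification, not the authors' implementation: all weight names (`Wq`, `w_bias`, `w_edge`) are hypothetical stand-ins for learned parameters, and residual connections, normalization, gating, and multi-head structure from the actual EGT model are omitted.

```python
import numpy as np

def softmax(x, axis=-1):
    x = x - x.max(axis=axis, keepdims=True)
    e = np.exp(x)
    return e / e.sum(axis=axis, keepdims=True)

def egt_layer(H, E, rng):
    """One simplified EGT-style layer (single head, no residuals/norm).

    H: (n, d)     node embeddings
    E: (n, n, de) edge-channel embeddings (pairwise structural information)
    Returns updated (H_new, E_new) of the same shapes.
    """
    n, d = H.shape
    de = E.shape[-1]
    # Random projections stand in for learned parameters (hypothetical names).
    Wq, Wk, Wv = (rng.standard_normal((d, d)) / np.sqrt(d) for _ in range(3))
    w_bias = rng.standard_normal(de) / np.sqrt(de)  # edge channels -> attention bias
    w_edge = rng.standard_normal((1, de))           # attention logits -> edge update

    Q, K, V = H @ Wq, H @ Wk, H @ Wv
    # Global attention over ALL node pairs, biased by the edge channels:
    logits = Q @ K.T / np.sqrt(d) + E @ w_bias      # (n, n)
    A = softmax(logits, axis=-1)
    H_new = A @ V                                   # dynamic long-range aggregation
    # Edge channels evolve from layer to layer; their final embeddings can
    # feed edge/link-level prediction heads directly.
    E_new = E + logits[..., None] @ w_edge          # (n, n, de)
    return H_new, E_new

# Usage on a toy 5-node graph with 4 edge channels:
rng = np.random.default_rng(0)
H = rng.standard_normal((5, 8))
E = rng.standard_normal((5, 5, 4))
H2, E2 = egt_layer(H, E, rng)
```

Note the contrast with message passing: every node attends to every other node regardless of graph adjacency, and structural information enters only through the learned bias from the edge channels.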

Results

Task                 | Dataset               | Metric         | Value  | Model
---------------------|-----------------------|----------------|--------|---------------------------
Link Prediction      | TSP/HCP Benchmark set | F1             | 0.853  | EGT
Graph Regression     | PCQM4Mv2-LSC          | Test MAE       | 0.0683 | EGT + Triangular Attention
Graph Regression     | PCQM4Mv2-LSC          | Validation MAE | 0.0671 | EGT + Triangular Attention
Graph Regression     | PCQM4Mv2-LSC          | Test MAE       | 0.0862 | EGT
Graph Regression     | PCQM4Mv2-LSC          | Validation MAE | 0.0857 | EGT
Graph Regression     | ZINC-500k             | MAE            | 0.108  | EGT
Graph Regression     | ZINC 100k             | MAE            | 0.143  | EGT
Graph Regression     | PCQM4M-LSC            | Validation MAE | 0.1224 | EGT
Graph Classification | MNIST                 | Accuracy       | 98.173 | EGT
Graph Classification | CIFAR10 100k          | Accuracy (%)   | 68.702 | EGT
Node Classification  | PATTERN               | Accuracy       | 86.821 | EGT
Node Classification  | CLUSTER               | Accuracy       | 79.232 | EGT
Node Classification  | PATTERN 100k          | Accuracy (%)   | 86.816 | EGT

Related Papers

RaMen: Multi-Strategy Multi-Modal Learning for Bundle Construction (2025-07-18)
Disentangling coincident cell events using deep transfer learning and compressive sensing (2025-07-17)
SGCL: Unifying Self-Supervised and Supervised Learning for Graph Recommendation (2025-07-17)
Best Practices for Large-Scale, Pixel-Wise Crop Mapping and Transfer Learning Workflows (2025-07-16)
Robust-Multi-Task Gradient Boosting (2025-07-15)
A Graph-in-Graph Learning Framework for Drug-Target Interaction Prediction (2025-07-15)
Graph World Model (2025-07-14)
Federated Learning with Graph-Based Aggregation for Traffic Forecasting (2025-07-13)