Papers With Code 2

A community resource for machine learning research: papers, code, benchmarks, and state-of-the-art results.


Data sourced from the PWC Archive (CC-BY-SA 4.0). Built by the community, for the community.

An end-to-end attention-based approach for learning on graphs

David Buterez, Jon Paul Janet, Dino Oglic, Pietro Lio

2024-02-16 · Molecular Property Prediction · Transfer Learning · Graph Regression · Graph Classification · Node Classification

Paper · PDF · Code

Abstract

There has been a recent surge in transformer-based architectures for learning on graphs, mainly motivated by attention as an effective learning mechanism and the desire to supersede handcrafted operators characteristic of message passing schemes. However, concerns over their empirical effectiveness, scalability, and the complexity of their pre-processing steps have been raised, especially in relation to much simpler graph neural networks that typically perform on par with them across a wide range of benchmarks. To tackle these shortcomings, we consider graphs as sets of edges and propose a purely attention-based approach consisting of an encoder and an attention pooling mechanism. The encoder vertically interleaves masked and vanilla self-attention modules to learn effective representations of edges, while allowing for tackling possible misspecifications in input graphs. Despite its simplicity, the approach outperforms fine-tuned message passing baselines and recently proposed transformer-based methods on more than 70 node- and graph-level tasks, including challenging long-range benchmarks. Moreover, we demonstrate state-of-the-art performance across different tasks, ranging from molecular to vision graphs, and heterophilous node classification. The approach also outperforms graph neural networks and transformers in transfer learning settings, and scales much better than alternatives with a similar performance level or expressive power.
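The core idea above — treat the graph as a set of edge tokens, interleave masked self-attention (restricted by graph structure) with vanilla self-attention (global), then read out a graph vector via attention pooling — can be illustrated with a toy NumPy sketch. This is not the authors' implementation: single-head attention without learned projections, the `adj_mask` construction, and the helper names (`self_attention`, `attention_pool`) are all illustrative assumptions.

```python
import numpy as np

def softmax(x, axis=-1):
    # Numerically stable softmax.
    x = x - x.max(axis=axis, keepdims=True)
    e = np.exp(x)
    return e / e.sum(axis=axis, keepdims=True)

def self_attention(X, mask=None):
    # Single-head scaled dot-product self-attention over a set of edge tokens.
    # With a mask, each edge attends only where mask is True (the "masked" module);
    # without one, attention is global (the "vanilla" module).
    d = X.shape[-1]
    scores = X @ X.T / np.sqrt(d)
    if mask is not None:
        scores = np.where(mask, scores, -1e9)
    return softmax(scores) @ X

def attention_pool(X, seed):
    # Pool the edge set into a single graph-level vector using a seed query
    # (a stand-in for a learned pooling query).
    scores = softmax(seed @ X.T / np.sqrt(X.shape[-1]))
    return scores @ X

# Toy graph: 4 edges embedded in 8 dims; the mask allows attention between
# edges that share an endpoint (an assumed structural mask, for illustration).
rng = np.random.default_rng(0)
E = rng.normal(size=(4, 8))
adj_mask = np.array([[1, 1, 0, 0],
                     [1, 1, 1, 0],
                     [0, 1, 1, 1],
                     [0, 0, 1, 1]], dtype=bool)

H = self_attention(E, mask=adj_mask)          # masked block: uses graph structure
H = self_attention(H)                          # vanilla block: global attention
g = attention_pool(H, rng.normal(size=(8,)))   # graph-level readout
print(g.shape)  # (8,)
```

The vertical interleaving described in the abstract would stack several such masked/vanilla pairs (with residual connections and learned projections) before the pooling step.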

Results

Task                 | Dataset        | Metric         | Value  | Model
Graph Regression     | ZINC           | MAE            | 0.051  | ESA + rings + NodeRWSE + EdgeRWSE
Graph Regression     | PCQM4Mv2-LSC   | Validation MAE | 0.0235 | ESA (Edge set attention, no positional encodings)
Graph Regression     | ZINC-500k      | MAE            | 0.051  | ESA + rings + NodeRWSE + EdgeRWSE
Graph Classification | Peptides-func  | AP             | 0.7479 | ESA + RWSE (Edge set attention, Random Walk Structural Encoding, + validation set)
Classification       | Peptides-func  | AP             | 0.7479 | ESA + RWSE (Edge set attention, Random Walk Structural Encoding, + validation set)

Related Papers

RaMen: Multi-Strategy Multi-Modal Learning for Bundle Construction (2025-07-18)
Disentangling coincident cell events using deep transfer learning and compressive sensing (2025-07-17)
Best Practices for Large-Scale, Pixel-Wise Crop Mapping and Transfer Learning Workflows (2025-07-16)
Robust-Multi-Task Gradient Boosting (2025-07-15)
Calibrated and Robust Foundation Models for Vision-Language and Medical Image Tasks Under Distribution Shift (2025-07-12)
The Bayesian Approach to Continual Learning: An Overview (2025-07-11)
Contrastive and Transfer Learning for Effective Audio Fingerprinting through a Real-World Evaluation Protocol (2025-07-08)
Agent KB: Leveraging Cross-Domain Experience for Agentic Problem Solving (2025-07-08)