Papers With Code 2

A community resource for machine learning research: papers, code, benchmarks, and state-of-the-art results.


Data sourced from the PWC Archive (CC-BY-SA 4.0). Built by the community, for the community.


Do Transformers Really Perform Bad for Graph Representation?

Chengxuan Ying, Tianle Cai, Shengjie Luo, Shuxin Zheng, Guolin Ke, Di He, Yanming Shen, Tie-Yan Liu

Published: 2021-06-09
Tasks: Molecular Property Prediction, Graph Representation Learning, Representation Learning, Graph Regression, Graph Classification, Graph Property Prediction
Links: Paper · PDF · Code (official)

Abstract

The Transformer architecture has become a dominant choice in many domains, such as natural language processing and computer vision. Yet, it has not achieved competitive performance on popular leaderboards of graph-level prediction compared to mainstream GNN variants. Therefore, it remains a mystery how Transformers could perform well for graph representation learning. In this paper, we solve this mystery by presenting Graphormer, which is built upon the standard Transformer architecture, and could attain excellent results on a broad range of graph representation learning tasks, especially on the recent OGB Large-Scale Challenge. Our key insight to utilizing Transformer in the graph is the necessity of effectively encoding the structural information of a graph into the model. To this end, we propose several simple yet effective structural encoding methods to help Graphormer better model graph-structured data. Besides, we mathematically characterize the expressive power of Graphormer and exhibit that with our ways of encoding the structural information of graphs, many popular GNN variants could be covered as the special cases of Graphormer.
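The structural encodings the abstract refers to can be illustrated with the paper's spatial encoding: a learned scalar bias, indexed by the shortest-path distance between two nodes, is added to the attention logits before the softmax. The following is a minimal NumPy sketch, not the official Graphormer implementation; the function names, the distance-bucket cap, and the single-head layout are illustrative assumptions.

```python
import numpy as np

def floyd_warshall(adj):
    # All-pairs shortest-path distances on an unweighted graph
    # (disconnected pairs stay at infinity).
    n = adj.shape[0]
    dist = np.where(adj > 0, 1.0, np.inf)
    np.fill_diagonal(dist, 0.0)
    for k in range(n):
        dist = np.minimum(dist, dist[:, k:k + 1] + dist[k:k + 1, :])
    return dist

def graphormer_attention(x, adj, w_q, w_k, spd_bias):
    # Single-head scaled dot-product attention with a spatial-encoding
    # bias b_{phi(i,j)} looked up by shortest-path distance phi(i, j).
    # `spd_bias` is a 1-D array of learnable scalars, one per distance bucket.
    d = w_q.shape[1]
    q, k = x @ w_q, x @ w_k
    scores = (q @ k.T) / np.sqrt(d)
    spd = floyd_warshall(adj)
    # Map unreachable pairs and long paths to the last bucket (hypothetical cap).
    spd = np.where(np.isfinite(spd), spd, len(spd_bias) - 1).astype(int)
    spd = np.clip(spd, 0, len(spd_bias) - 1)
    scores = scores + spd_bias[spd]
    # Row-wise softmax over all nodes.
    e = np.exp(scores - scores.max(axis=-1, keepdims=True))
    return e / e.sum(axis=-1, keepdims=True)
```

Because the bias depends only on graph structure, not on node features, every attention head can attend globally while still being steered toward (or away from) structurally close nodes, which is how the paper connects Graphormer back to message-passing GNNs as special cases.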

Results

Task                       Dataset        Metric            Value        Model
Graph Regression           PCQM4Mv2-LSC   Validation MAE    0.0864       Graphormer
Graph Regression           ZINC-500k      MAE               0.122        Graphormer-SLIM
Graph Regression           PCQM4M-LSC     Test MAE          0.1328       Graphormer
Graph Regression           PCQM4M-LSC     Validation MAE    0.1234       Graphormer
Graph Property Prediction  ogbg-molhiv    Number of params  47,085,378   Graphormer + FPs
Graph Property Prediction  ogbg-molhiv    Number of params  47,183,040   Graphormer
Graph Property Prediction  ogbg-molhiv    Number of params  47,183,040   Graphormer (pre-trained on PCQM4M)
Graph Property Prediction  ogbg-molpcba   Number of params  119,529,664  Graphormer
Graph Property Prediction  ogbg-molpcba   Number of params  119,529,664  Graphormer (pre-trained on PCQM4M)

Related Papers

Touch in the Wild: Learning Fine-Grained Manipulation with a Portable Visuo-Tactile Gripper (2025-07-20)
SMART: Relation-Aware Learning of Geometric Representations for Knowledge Graphs (2025-07-17)
Spectral Bellman Method: Unifying Representation and Exploration in RL (2025-07-17)
Boosting Team Modeling through Tempo-Relational Representation Learning (2025-07-17)
Similarity-Guided Diffusion for Contrastive Sequential Recommendation (2025-07-16)
Are encoders able to learn landmarkers for warm-starting of Hyperparameter Optimization? (2025-07-16)
Language-Guided Contrastive Audio-Visual Masked Autoencoder with Automatically Generated Audio-Visual-Text Triplets from Videos (2025-07-16)
A Mixed-Primitive-based Gaussian Splatting Method for Surface Reconstruction (2025-07-15)