Papers With Code 2

A community resource for machine learning research: papers, code, benchmarks, and state-of-the-art results.


Data sourced from the PWC Archive (CC-BY-SA 4.0). Built by the community, for the community.


Do Transformers Really Perform Bad for Graph Representation?

Chengxuan Ying, Tianle Cai, Shengjie Luo, Shuxin Zheng, Guolin Ke, Di He, Yanming Shen, Tie-Yan Liu

Published: 2021-06-09
Tasks: Molecular Property Prediction, Graph Representation Learning, Representation Learning, Graph Regression, Graph Classification, Graph Property Prediction
Links: Paper · PDF · Code (official)

Abstract

The Transformer architecture has become a dominant choice in many domains, such as natural language processing and computer vision. Yet, it has not achieved competitive performance on popular leaderboards of graph-level prediction compared to mainstream GNN variants. Therefore, it remains a mystery how Transformers could perform well for graph representation learning. In this paper, we solve this mystery by presenting Graphormer, which is built upon the standard Transformer architecture, and could attain excellent results on a broad range of graph representation learning tasks, especially on the recent OGB Large-Scale Challenge. Our key insight to utilizing Transformer in the graph is the necessity of effectively encoding the structural information of a graph into the model. To this end, we propose several simple yet effective structural encoding methods to help Graphormer better model graph-structured data. Besides, we mathematically characterize the expressive power of Graphormer and exhibit that with our ways of encoding the structural information of graphs, many popular GNN variants could be covered as the special cases of Graphormer.
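The structural encodings the abstract refers to can be illustrated with the paper's spatial encoding: a learned scalar bias, indexed by the shortest-path distance between two nodes, is added to the attention logits before the softmax. The following is a minimal NumPy sketch, not the official Graphormer implementation; the function names, the distance-bucket cap, and the single-head layout are illustrative assumptions.

```python
import numpy as np

def floyd_warshall(adj):
    # All-pairs shortest-path distances on an unweighted graph
    # (disconnected pairs stay at infinity).
    n = adj.shape[0]
    dist = np.where(adj > 0, 1.0, np.inf)
    np.fill_diagonal(dist, 0.0)
    for k in range(n):
        dist = np.minimum(dist, dist[:, k:k + 1] + dist[k:k + 1, :])
    return dist

def graphormer_attention(x, adj, w_q, w_k, spd_bias):
    # Single-head scaled dot-product attention with a spatial-encoding
    # bias b_{phi(i,j)} looked up by shortest-path distance phi(i, j).
    # `spd_bias` is a 1-D array of learnable scalars, one per distance bucket.
    d = w_q.shape[1]
    q, k = x @ w_q, x @ w_k
    scores = (q @ k.T) / np.sqrt(d)
    spd = floyd_warshall(adj)
    # Map unreachable pairs and long paths to the last bucket (hypothetical cap).
    spd = np.where(np.isfinite(spd), spd, len(spd_bias) - 1).astype(int)
    spd = np.clip(spd, 0, len(spd_bias) - 1)
    scores = scores + spd_bias[spd]
    # Row-wise softmax over all nodes.
    e = np.exp(scores - scores.max(axis=-1, keepdims=True))
    return e / e.sum(axis=-1, keepdims=True)
```

Because the bias depends only on graph structure, not on node features, every attention head can attend globally while still being steered toward (or away from) structurally close nodes, which is how the paper connects Graphormer back to message-passing GNNs as special cases.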

Results

Task                       Dataset        Metric            Value        Model
Graph Regression           PCQM4Mv2-LSC   Validation MAE    0.0864       Graphormer
Graph Regression           ZINC-500k      MAE               0.122        Graphormer-SLIM
Graph Regression           PCQM4M-LSC     Test MAE          0.1328       Graphormer
Graph Regression           PCQM4M-LSC     Validation MAE    0.1234       Graphormer
Graph Property Prediction  ogbg-molhiv    Number of params  47,085,378   Graphormer + FPs
Graph Property Prediction  ogbg-molhiv    Number of params  47,183,040   Graphormer
Graph Property Prediction  ogbg-molhiv    Number of params  47,183,040   Graphormer (pre-trained on PCQM4M)
Graph Property Prediction  ogbg-molpcba   Number of params  119,529,664  Graphormer
Graph Property Prediction  ogbg-molpcba   Number of params  119,529,664  Graphormer (pre-trained on PCQM4M)

Related Papers

Touch in the Wild: Learning Fine-Grained Manipulation with a Portable Visuo-Tactile Gripper (2025-07-20)
SMART: Relation-Aware Learning of Geometric Representations for Knowledge Graphs (2025-07-17)
Spectral Bellman Method: Unifying Representation and Exploration in RL (2025-07-17)
Boosting Team Modeling through Tempo-Relational Representation Learning (2025-07-17)
Similarity-Guided Diffusion for Contrastive Sequential Recommendation (2025-07-16)
Are encoders able to learn landmarkers for warm-starting of Hyperparameter Optimization? (2025-07-16)
Language-Guided Contrastive Audio-Visual Masked Autoencoder with Automatically Generated Audio-Visual-Text Triplets from Videos (2025-07-16)
A Mixed-Primitive-based Gaussian Splatting Method for Surface Reconstruction (2025-07-15)