Zhe Chen, Hao Tan, Tao Wang, Tianrun Shen, Tong Lu, Qiuying Peng, Cheng Cheng, Yue Qi
This paper presents a novel transformer architecture for graph representation learning. The core insight of our method is to fully consider the information propagation among nodes and edges in a graph when building the attention module in the transformer blocks. Specifically, we propose a new attention mechanism called Graph Propagation Attention (GPA). It explicitly passes information among nodes and edges in three ways, i.e., node-to-node, node-to-edge, and edge-to-node, which is essential for learning graph-structured data. On this basis, we design an effective transformer architecture named Graph Propagation Transformer (GPTrans) to further facilitate learning on graph data. We verify the performance of GPTrans in a wide range of graph learning experiments on several benchmark datasets. The results show that our method outperforms many state-of-the-art transformer-based graph models. The code will be released at https://github.com/czczup/GPTrans.
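As a rough illustration of how the three propagation paths described above could interact inside a single attention block, the PyTorch sketch below treats node-to-node propagation as standard multi-head attention over node features, edge-to-node propagation as a per-head additive bias derived from edge features, and node-to-edge propagation as an edge update driven by the attention maps. The module name, tensor shapes, and projection layers are assumptions made for illustration only; the authoritative implementation is in the repository linked above.

```python
# Hypothetical sketch of the three propagation paths named in the abstract.
# Not the official GPTrans code; shapes and mixing layers are assumptions.
import torch
import torch.nn as nn


class GraphPropagationAttentionSketch(nn.Module):
    def __init__(self, dim: int, num_heads: int = 8):
        super().__init__()
        self.num_heads = num_heads
        self.head_dim = dim // num_heads
        self.scale = self.head_dim ** -0.5
        self.qkv = nn.Linear(dim, dim * 3)            # node queries/keys/values
        self.edge_bias = nn.Linear(dim, num_heads)    # edge features -> per-head attention bias
        self.edge_update = nn.Linear(num_heads, dim)  # attention maps -> edge feature update
        self.node_proj = nn.Linear(dim, dim)

    def forward(self, x: torch.Tensor, e: torch.Tensor):
        # x: node features (B, N, C); e: edge features (B, N, N, C)
        B, N, C = x.shape
        q, k, v = (
            self.qkv(x)
            .reshape(B, N, 3, self.num_heads, self.head_dim)
            .permute(2, 0, 3, 1, 4)
        )

        # Node-to-node: scaled dot-product attention between node features.
        attn = (q @ k.transpose(-2, -1)) * self.scale            # (B, H, N, N)
        # Edge-to-node: edge features injected as an additive attention bias.
        attn = attn + self.edge_bias(e).permute(0, 3, 1, 2)
        attn = attn.softmax(dim=-1)

        # Node-to-edge: edge features refreshed from the attention maps.
        e = e + self.edge_update(attn.permute(0, 2, 3, 1))       # (B, N, N, C)

        # Aggregate values back to node features.
        x = (attn @ v).transpose(1, 2).reshape(B, N, C)
        return self.node_proj(x), e


if __name__ == "__main__":
    # Toy usage: 4 graphs padded to 16 nodes, 64-dimensional features.
    gpa = GraphPropagationAttentionSketch(dim=64, num_heads=8)
    x = torch.randn(4, 16, 64)
    e = torch.randn(4, 16, 16, 64)
    x_out, e_out = gpa(x, e)
    print(x_out.shape, e_out.shape)  # torch.Size([4, 16, 64]) torch.Size([4, 16, 16, 64])
```

The key point of the sketch is that node and edge representations are both read and both written within one attention block, rather than edges serving only as a static bias.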
| Task | Dataset | Metric | Value | Model |
|---|---|---|---|---|
| Graph Regression | PCQM4Mv2-LSC | Test MAE | 0.0821 | GPTrans-L |
| Graph Regression | PCQM4Mv2-LSC | Validation MAE | 0.0809 | GPTrans-L |
| Graph Regression | PCQM4Mv2-LSC | Test MAE | 0.0842 | GPTrans-T |
| Graph Regression | PCQM4Mv2-LSC | Validation MAE | 0.0833 | GPTrans-T |
| Graph Regression | ZINC-500k | MAE | 0.077 | GPTrans-Nano |
| Graph Regression | PCQM4M-LSC | Validation MAE | 0.1151 | GPTrans-L |
| Node Classification | CLUSTER | Accuracy | 78.07 | GPTrans-Nano |