Papers With Code 2

A community resource for machine learning research: papers, code, benchmarks, and state-of-the-art results.


Data sourced from the PWC Archive (CC-BY-SA 4.0). Built by the community, for the community.


Graph Attention Networks

Petar Veličković, Guillem Cucurull, Arantxa Casanova, Adriana Romero, Pietro Liò, Yoshua Bengio

2017-10-30 · ICLR 2018

Tasks: Molecular Property Prediction, Question Answering, Node Classification on Non-Homophilic (Heterophilic) Graphs, Skeleton Based Action Recognition, Heterogeneous Node Classification, Graph Regression, Graph Classification, Document Classification, Node Classification, Node Property Prediction, Graph Embedding, Graph Attention, Link Prediction

Links: Paper · PDF · Code (official implementation, plus community implementations)

Abstract

We present graph attention networks (GATs), novel neural network architectures that operate on graph-structured data, leveraging masked self-attentional layers to address the shortcomings of prior methods based on graph convolutions or their approximations. By stacking layers in which nodes are able to attend over their neighborhoods' features, we enable (implicitly) specifying different weights to different nodes in a neighborhood, without requiring any kind of costly matrix operation (such as inversion) or depending on knowing the graph structure upfront. In this way, we address several key challenges of spectral-based graph neural networks simultaneously, and make our model readily applicable to inductive as well as transductive problems. Our GAT models have achieved or matched state-of-the-art results across four established transductive and inductive graph benchmarks: the Cora, Citeseer and Pubmed citation network datasets, as well as a protein-protein interaction dataset (wherein test graphs remain unseen during training).
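The abstract describes the core mechanism: a masked self-attention layer computes a score for each edge, normalizes scores over each node's neighborhood with a softmax, and aggregates the transformed neighbor features. A minimal NumPy sketch of one attention head follows; the function and parameter names are illustrative, not taken from the authors' code, and multi-head attention, dropout, and the output nonlinearity are omitted for brevity.

```python
import numpy as np

def gat_layer(H, A, W, a, slope=0.2):
    """One masked self-attention head, as sketched in the abstract.

    H: (N, F) node features; A: (N, N) binary adjacency with self-loops;
    W: (F, F') shared linear transform; a: (2*F',) attention vector.
    All names here are illustrative placeholders, not the paper's code.
    """
    Wh = H @ W                                   # shared transform, (N, F')
    Fp = Wh.shape[1]
    # e_ij = LeakyReLU(a^T [Wh_i || Wh_j]), computed for all pairs (i, j)
    src = Wh @ a[:Fp]                            # contribution of node i
    dst = Wh @ a[Fp:]                            # contribution of node j
    e = src[:, None] + dst[None, :]              # (N, N) raw scores
    e = np.where(e > 0, e, slope * e)            # LeakyReLU
    # Masking: only edges present in A receive attention (softmax per row)
    e = np.where(A > 0, e, -1e9)
    att = np.exp(e - e.max(axis=1, keepdims=True))
    att = att / att.sum(axis=1, keepdims=True)
    return att @ Wh                              # weighted neighborhood aggregation
```

Because the mask is just the adjacency matrix, the layer never needs the full graph structure up front or any matrix inversion, which is what makes it applicable to inductive settings where test graphs are unseen during training.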

Results

Task | Dataset | Metric | Value | Model
Video | J-HMDB Early Action | 10% | 58.1 | GAT
Temporal Action Localization | J-HMDB Early Action | 10% | 58.1 | GAT
Zero-Shot Learning | J-HMDB Early Action | 10% | 58.1 | GAT
Activity Recognition | J-HMDB Early Action | 10% | 58.1 | GAT
Action Localization | J-HMDB Early Action | 10% | 58.1 | GAT
Action Detection | J-HMDB Early Action | 10% | 58.1 | GAT
3D Action Recognition | J-HMDB Early Action | 10% | 58.1 | GAT
Graph Regression | ZINC 100k | MAE | 0.463 | GAT
Graph Regression | Lipophilicity | RMSE | 0.95 | GAT
Action Recognition | J-HMDB Early Action | 10% | 58.1 | GAT
Graph Classification | CIFAR10 100k | Accuracy (%) | 65.48 | GAT
Node Classification | Brazil Air-Traffic | Accuracy | 0.382 | GAT (Velickovic et al., 2018)
Node Classification | PPI | F1 | 97.3 | GAT
Node Classification | Wiki-Vote | Accuracy | 59.4 | GAT (Velickovic et al., 2018)
Node Classification | Pubmed | F1-Score | 79 | GAT
Node Classification | Europe Air-Traffic | Accuracy | 42.4 | GAT (Velickovic et al., 2018)
Node Classification | Flickr | Accuracy | 0.359 | GAT (Velickovic et al., 2018)
Node Classification | USA Air-Traffic | Accuracy | 58.5 | GAT (Velickovic et al., 2018)
Node Classification | PATTERN 100k | Accuracy (%) | 75.824 | GAT
Node Classification | IMDB (Heterogeneous Node Classification) | Macro-F1 | 58.94 | GAT
Node Classification | IMDB (Heterogeneous Node Classification) | Micro-F1 | 64.86 | GAT
Node Classification | Freebase (Heterogeneous Node Classification) | Accuracy | 65.26 | GAT
Node Classification | Freebase (Heterogeneous Node Classification) | Macro-F1 | 40.74 | GAT
Node Classification | DBLP (Heterogeneous Node Classification) | Macro-F1 | 93.83 | GAT
Node Classification | DBLP (Heterogeneous Node Classification) | Micro-F1 | 93.39 | GAT
Node Classification | ACM (Heterogeneous Node Classification) | Macro-F1 | 92.26 | GAT
Node Classification | ACM (Heterogeneous Node Classification) | Micro-F1 | 92.19 | GAT
Graph Property Prediction | ogbg-code2 | Number of params | 11030210 | GAT
Classification | CIFAR10 100k | Accuracy (%) | 65.48 | GAT
Node Property Prediction | ogbn-arxiv | Number of params | 1441580 | GAT+label reuse+self KD
Node Property Prediction | ogbn-arxiv | Number of params | 1441580 | GAT+label reuse+topo loss
Node Property Prediction | ogbn-products | Number of params | 751574 | GAT with NeighborSampling
Node Property Prediction | ogbn-proteins | Number of params | 6360470 | GAT + labels + node2vec

Related Papers

From Roots to Rewards: Dynamic Tree Reasoning with RL (2025-07-17)
Enter the Mind Palace: Reasoning and Planning for Long-term Active Embodied Question Answering (2025-07-17)
Vision-and-Language Training Helps Deploy Taxonomic Knowledge but Does Not Fundamentally Alter It (2025-07-17)
City-VLM: Towards Multidomain Perception Scene Understanding via Multimodal Incomplete Learning (2025-07-17)
SMART: Relation-Aware Learning of Geometric Representations for Knowledge Graphs (2025-07-17)
Describe Anything Model for Visual Question Answering on Text-rich Images (2025-07-16)
Is This Just Fantasy? Language Model Representations Reflect Human Judgments of Event Plausibility (2025-07-16)
Catching Bid-rigging Cartels with Graph Attention Neural Networks (2025-07-16)