Inductive Representation Learning on Large Graphs

William L. Hamilton, Rex Ying, Jure Leskovec

2017-06-07 | NeurIPS 2017

Tasks: Node Classification on Non-Homophilic (Heterophilic) Graphs, Representation Learning, Graph Regression, Graph Classification, Node Classification, Link Property Prediction, Node Property Prediction, Link Prediction


Abstract

Low-dimensional embeddings of nodes in large graphs have proved extremely useful in a variety of prediction tasks, from content recommendation to identifying protein functions. However, most existing approaches require that all nodes in the graph are present during training of the embeddings; these previous approaches are inherently transductive and do not naturally generalize to unseen nodes. Here we present GraphSAGE, a general, inductive framework that leverages node feature information (e.g., text attributes) to efficiently generate node embeddings for previously unseen data. Instead of training individual embeddings for each node, we learn a function that generates embeddings by sampling and aggregating features from a node's local neighborhood. Our algorithm outperforms strong baselines on three inductive node-classification benchmarks: we classify the category of unseen nodes in evolving information graphs based on citation and Reddit post data, and we show that our algorithm generalizes to completely unseen graphs using a multi-graph dataset of protein-protein interactions.
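The sample-and-aggregate idea in the abstract can be illustrated with a small sketch. This is not the authors' implementation: the mean aggregator, the concatenation of self and neighborhood features, the ReLU, the L2 normalization, and all names here (`sample_neighbors`, `sage_embed`, the weight matrix `W`, the toy graph) are assumptions chosen for illustration. The weights are fixed by hand rather than trained. The point is only that the embedding is computed by a shared function of features, so it can be applied to a node that was absent when `W` was chosen.

```python
import math
import random


def sample_neighbors(adj, node, k, rng):
    """Draw k neighbors uniformly with replacement (hypothetical sampling rule)."""
    neighbors = adj.get(node, [])
    if not neighbors:
        return [node]  # assumption: an isolated node aggregates only itself
    return [rng.choice(neighbors) for _ in range(k)]


def mean_vector(vectors):
    """Element-wise mean of equal-length feature vectors."""
    n = len(vectors)
    return [sum(column) / n for column in zip(*vectors)]


def matvec(W, x):
    """Multiply matrix W (list of rows) by vector x."""
    return [sum(w * xi for w, xi in zip(row, x)) for row in W]


def sage_embed(node, features, adj, W, k, rng):
    """One sample-and-aggregate step for a single node."""
    sampled = sample_neighbors(adj, node, k, rng)
    h_neigh = mean_vector([features[u] for u in sampled])
    # Concatenate the node's own features with the aggregated neighborhood.
    h = matvec(W, features[node] + h_neigh)
    h = [max(0.0, x) for x in h]  # ReLU nonlinearity
    norm = math.sqrt(sum(x * x for x in h))
    return [x / norm for x in h] if norm > 0 else h


# Toy graph: node -> feature vector, node -> neighbor list.
features = {0: [1.0, 0.0], 1: [0.0, 1.0], 2: [1.0, 1.0]}
adj = {0: [1, 2], 1: [0], 2: [0]}
W = [[0.5, -0.2, 0.1, 0.3],
     [0.1, 0.4, -0.3, 0.2]]  # 2 outputs, 4 inputs (2 self + 2 neighborhood)

rng = random.Random(0)
emb_seen = sage_embed(0, features, adj, W, k=2, rng=rng)

# A node added later gets an embedding from the same function and weights.
features[3] = [0.5, 0.5]
adj[3] = [0, 2]
emb_unseen = sage_embed(3, features, adj, W, k=2, rng=rng)
```

A transductive method would instead store one embedding vector per training node and would have no entry for node 3; here nothing node-specific is stored, which is what makes the approach inductive.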

Results

Task	Dataset	Metric	Value	Model
Graph Regression	ZINC-500k	MAE	0.398	GraphSAGE
Graph Classification	CIFAR10 100k	Accuracy (%)	66.08	GraphSAGE
Node Classification	Facebook	Accuracy	38.9	GraphSAGE (Hamilton et al., [2017a])
Node Classification	Brazil Air-Traffic	Accuracy	0.404	GraphSAGE (Hamilton et al., [2017a])
Node Classification	PPI	F1	61.2	GraphSAGE
Node Classification	Wiki-Vote	Accuracy	24.5	GraphSAGE (Hamilton et al., [2017a])
Node Classification	CiteSeer with Public Split: fixed 20 nodes per class	Accuracy	67.2	GraphSAGE
Node Classification	Europe Air-Traffic	Accuracy	27.2	GraphSAGE (Hamilton et al., [2017a])
Node Classification	Flickr	Accuracy	0.641	GraphSAGE (Hamilton et al., [2017a])
Node Classification	USA Air-Traffic	Accuracy	31.6	GraphSAGE (Hamilton et al., [2017a])
Node Classification	PATTERN 100k	Accuracy (%)	50.516	GraphSAGE
Link Property Prediction	ogbl-ddi	Number of params	1421057	GraphSAGE
Link Property Prediction	ogbl-citation2	Number of params	460289	Full-batch GraphSAGE
Link Property Prediction	ogbl-citation2	Number of params	460289	NeighborSampling (SAGE aggr)
Link Property Prediction	ogbl-collab	Number of params	460289	GraphSAGE (val as input)
Link Property Prediction	ogbl-collab	Number of params	460289	GraphSAGE
Link Property Prediction	ogbl-ppa	Number of params	424449	GraphSAGE
Node Property Prediction	ogbn-arxiv	Number of params	218664	GraphSAGE
Node Property Prediction	ogbn-papers100M	Number of params	5755172	GraphSAGE_res_incep
Node Property Prediction	ogbn-products	Number of params	103983	GraphSAGE + C&S + node2vec
Node Property Prediction	ogbn-products	Number of params	206895	NeighborSampling (SAGE aggr)
Node Property Prediction	ogbn-products	Number of params	206895	Full-batch GraphSAGE
Node Property Prediction	ogbn-proteins	Number of params	193136	GraphSAGE
Node Property Prediction	ogbn-mag	Number of params	154366772	NeighborSampling (R-GCN aggr)
