Data sourced from the PWC Archive (CC-BY-SA 4.0). Built by the community, for the community.

OGB-LSC: A Large-Scale Challenge for Machine Learning on Graphs

Weihua Hu, Matthias Fey, Hongyu Ren, Maho Nakata, Yuxiao Dong, Jure Leskovec

Published: 2021-03-17
Tasks: Knowledge Graphs, Graph Regression, Graph Learning, Node Classification, BIG-bench Machine Learning, Link Prediction

Abstract

Enabling effective and efficient machine learning (ML) over large-scale graph data (e.g., graphs with billions of edges) can have a great impact on both industrial and scientific applications. However, existing efforts to advance large-scale graph ML have been largely limited by the lack of a suitable public benchmark. Here we present OGB Large-Scale Challenge (OGB-LSC), a collection of three real-world datasets for facilitating the advancements in large-scale graph ML. The OGB-LSC datasets are orders of magnitude larger than existing ones, covering three core graph learning tasks -- link prediction, graph regression, and node classification. Furthermore, we provide dedicated baseline experiments, scaling up expressive graph ML models to the massive datasets. We show that expressive models significantly outperform simple scalable baselines, indicating an opportunity for dedicated efforts to further improve graph ML at scale. Moreover, OGB-LSC datasets were deployed at ACM KDD Cup 2021 and attracted more than 500 team registrations globally, during which significant performance improvements were made by a variety of innovative techniques. We summarize the common techniques used by the winning solutions and highlight the current best practices in large-scale graph ML. Finally, we describe how we have updated the datasets after the KDD Cup to further facilitate research advances. The OGB-LSC datasets, baseline code, and all the information about the KDD Cup are available at https://ogb.stanford.edu/docs/lsc/.

Results

| Task | Dataset | Metric | Value | Model |
|---|---|---|---|---|
| Knowledge Graphs | WikiKG90M-LSC | Test MRR | 85.48 | TransE-Concat |
| Knowledge Graphs | WikiKG90M-LSC | Validation MRR | 0.8494 | TransE-Concat |
| Knowledge Graphs | WikiKG90M-LSC | Test MRR | 0.8637 | ComplEx-Concat |
| Knowledge Graphs | WikiKG90M-LSC | Validation MRR | 0.8425 | ComplEx-Concat |
| Knowledge Graphs | WikiKG90M-LSC | Test MRR | 0.7186 | ComplEx-RoBERTa |
| Knowledge Graphs | WikiKG90M-LSC | Validation MRR | 0.7052 | ComplEx-RoBERTa |
| Knowledge Graphs | WikiKG90M-LSC | Test MRR | 0.6288 | TransE-RoBERTa |
| Knowledge Graphs | WikiKG90M-LSC | Validation MRR | 0.6039 | TransE-RoBERTa |
| Graph Regression | PCQM4Mv2-LSC | Test MAE | 0.176 | MLP-Fingerprint |
| Graph Regression | PCQM4Mv2-LSC | Validation MAE | 0.1753 | MLP-Fingerprint |
| Graph Regression | PCQM4M-LSC | Test MAE | 14.87 | GIN-virtual |
| Graph Regression | PCQM4M-LSC | Validation MAE | 0.1396 | GIN-virtual |
| Graph Regression | PCQM4M-LSC | Test MAE | 15.79 | GCN-Virtual |
| Graph Regression | PCQM4M-LSC | Validation MAE | 0.1536 | GCN-Virtual |
| Graph Regression | PCQM4M-LSC | Test MAE | 16.78 | GIN |
| Graph Regression | PCQM4M-LSC | Test MAE | 18.38 | GCN |
| Graph Regression | PCQM4M-LSC | Validation MAE | 0.1684 | GCN |
| Graph Regression | PCQM4M-LSC | Test MAE | 20.68 | MLP-fingerprint |
| Graph Regression | PCQM4M-LSC | Validation MAE | 0.2044 | MLP-fingerprint |
| Node Classification | MAG240M-LSC | Test Accuracy | 68.94 | R-GraphSAGE (NS) |
| Node Classification | MAG240M-LSC | Test Accuracy | 66.63 | GAT (NS) |
| Node Classification | MAG240M-LSC | Test Accuracy | 66.25 | GraphSAGE (NS) |
| Node Classification | MAG240M-LSC | Test Accuracy | 66.09 | SIGN |
| Node Classification | MAG240M-LSC | Validation Accuracy | 66.64 | SIGN |
| Knowledge Graph Completion | WikiKG90M-LSC | Test MRR | 85.48 | TransE-Concat |
| Knowledge Graph Completion | WikiKG90M-LSC | Validation MRR | 0.8494 | TransE-Concat |
| Knowledge Graph Completion | WikiKG90M-LSC | Test MRR | 0.8637 | ComplEx-Concat |
| Knowledge Graph Completion | WikiKG90M-LSC | Validation MRR | 0.8425 | ComplEx-Concat |
| Knowledge Graph Completion | WikiKG90M-LSC | Test MRR | 0.7186 | ComplEx-RoBERTa |
| Knowledge Graph Completion | WikiKG90M-LSC | Validation MRR | 0.7052 | ComplEx-RoBERTa |
| Knowledge Graph Completion | WikiKG90M-LSC | Test MRR | 0.6288 | TransE-RoBERTa |
| Knowledge Graph Completion | WikiKG90M-LSC | Validation MRR | 0.6039 | TransE-RoBERTa |
| Large Language Model | WikiKG90M-LSC | Test MRR | 85.48 | TransE-Concat |
| Large Language Model | WikiKG90M-LSC | Validation MRR | 0.8494 | TransE-Concat |
| Large Language Model | WikiKG90M-LSC | Test MRR | 0.8637 | ComplEx-Concat |
| Large Language Model | WikiKG90M-LSC | Validation MRR | 0.8425 | ComplEx-Concat |
| Large Language Model | WikiKG90M-LSC | Test MRR | 0.7186 | ComplEx-RoBERTa |
| Large Language Model | WikiKG90M-LSC | Validation MRR | 0.7052 | ComplEx-RoBERTa |
| Large Language Model | WikiKG90M-LSC | Test MRR | 0.6288 | TransE-RoBERTa |
| Large Language Model | WikiKG90M-LSC | Validation MRR | 0.6039 | TransE-RoBERTa |
| Inductive knowledge graph completion | WikiKG90M-LSC | Test MRR | 85.48 | TransE-Concat |
| Inductive knowledge graph completion | WikiKG90M-LSC | Validation MRR | 0.8494 | TransE-Concat |
| Inductive knowledge graph completion | WikiKG90M-LSC | Test MRR | 0.8637 | ComplEx-Concat |
| Inductive knowledge graph completion | WikiKG90M-LSC | Validation MRR | 0.8425 | ComplEx-Concat |
| Inductive knowledge graph completion | WikiKG90M-LSC | Test MRR | 0.7186 | ComplEx-RoBERTa |
| Inductive knowledge graph completion | WikiKG90M-LSC | Validation MRR | 0.7052 | ComplEx-RoBERTa |
| Inductive knowledge graph completion | WikiKG90M-LSC | Test MRR | 0.6288 | TransE-RoBERTa |
| Inductive knowledge graph completion | WikiKG90M-LSC | Validation MRR | 0.6039 | TransE-RoBERTa |
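
The results above are reported in three metrics: MRR (mean reciprocal rank) for WikiKG90M-LSC, MAE (mean absolute error) for PCQM4M-LSC and PCQM4Mv2-LSC, and accuracy for MAG240M-LSC. As a reading aid, here is a minimal Python sketch of the textbook definitions. It is not the official OGB-LSC evaluator: the function names and toy inputs are hypothetical, and details such as how candidates are ranked or how ties are broken are not covered. Some rows appear to report the same quantity scaled by 100 (for example, 85.48 next to 0.8494); the sketch returns unscaled values.

```python
from typing import Sequence


def mean_reciprocal_rank(ranks: Sequence[int]) -> float:
    """Average of 1/rank, where rank is the 1-based position of the correct answer."""
    if not ranks:
        raise ValueError("ranks must be non-empty")
    if any(r < 1 for r in ranks):
        raise ValueError("ranks are 1-based and must be >= 1")
    return sum(1.0 / r for r in ranks) / len(ranks)


def mean_absolute_error(y_true: Sequence[float], y_pred: Sequence[float]) -> float:
    """Average absolute difference between targets and predictions."""
    if not y_true or len(y_true) != len(y_pred):
        raise ValueError("inputs must be non-empty and of equal length")
    return sum(abs(t - p) for t, p in zip(y_true, y_pred)) / len(y_true)


def accuracy(y_true: Sequence[int], y_pred: Sequence[int]) -> float:
    """Fraction of predictions that exactly match the target label."""
    if not y_true or len(y_true) != len(y_pred):
        raise ValueError("inputs must be non-empty and of equal length")
    return sum(t == p for t, p in zip(y_true, y_pred)) / len(y_true)


# Toy inputs (hypothetical, not taken from any OGB-LSC dataset).
mrr = mean_reciprocal_rank([1, 2, 4])                      # (1 + 1/2 + 1/4) / 3
mae = mean_absolute_error([1.0, 2.0, 3.0], [1.5, 2.0, 2.0])  # (0.5 + 0 + 1) / 3
acc = accuracy([0, 1, 2, 2], [0, 1, 1, 2])                 # 3 of 4 correct
```

Higher is better for MRR and accuracy, while lower is better for MAE, so the rows of the table are not comparable across metrics.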

Related Papers

SMART: Relation-Aware Learning of Geometric Representations for Knowledge Graphs (2025-07-17)
SGCL: Unifying Self-Supervised and Supervised Learning for Graph Recommendation (2025-07-17)
A Graph-in-Graph Learning Framework for Drug-Target Interaction Prediction (2025-07-15)
Graph World Model (2025-07-14)
Federated Learning with Graph-Based Aggregation for Traffic Forecasting (2025-07-13)
Topic Modeling and Link-Prediction for Material Property Discovery (2025-07-08)
Graph Learning (2025-07-08)
Graph Collaborative Attention Network for Link Prediction in Knowledge Graphs (2025-07-05)