Molecule3D: A Benchmark for Predicting 3D Geometries from Molecular Graphs

Zhao Xu, Youzhi Luo, Xuan Zhang, Xinyi Xu, Yaochen Xie, Meng Liu, Kaleb Dickerson, Cheng Deng, Maho Nakata, Shuiwang Ji

Published: 2021-09-30
Tasks: Molecular Property Prediction, 3D Geometry Prediction

Abstract

Graph neural networks are emerging as promising methods for modeling molecular graphs, in which nodes and edges correspond to atoms and chemical bonds, respectively. Recent studies show that when 3D molecular geometries, such as bond lengths and angles, are available, molecular property prediction tasks can be made more accurate. However, computing 3D molecular geometries requires quantum calculations that are computationally prohibitive. For example, accurately calculating the 3D geometry of a small molecule requires hours of computing time using density functional theory (DFT). Here, we propose to predict the ground-state 3D geometries from molecular graphs using machine learning methods. To make this feasible, we develop a benchmark, known as Molecule3D, that includes a dataset with precise ground-state geometries of approximately 4 million molecules derived from DFT. We also provide a set of software tools for data processing, splitting, training, and evaluation. Specifically, we propose to assess the error and validity of predicted geometries using four metrics. We implement two baseline methods that predict either the pairwise distances between atoms or the atom coordinates in 3D space. Experimental results show that, compared with generating 3D geometries with RDKit, our method can achieve comparable prediction accuracy at a much lower computational cost. Our Molecule3D is available as a module of the MoleculeX software library (https://github.com/divelab/MoleculeX).
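The abstract mentions error metrics for predicted geometries and baselines that output either pairwise distances or coordinates. As a rough illustration only, the sketch below computes MAE and RMSE over pairwise interatomic distances with NumPy. It assumes the error metrics compare distance matrices (the abstract does not spell this out), and the function names `pairwise_distances` and `distance_errors` are hypothetical, not part of MoleculeX.

```python
from typing import Tuple

import numpy as np


def pairwise_distances(coords: np.ndarray) -> np.ndarray:
    """Return the (n, n) Euclidean distance matrix for (n, 3) coordinates."""
    diff = coords[:, None, :] - coords[None, :, :]
    return np.linalg.norm(diff, axis=-1)


def distance_errors(pred: np.ndarray, true: np.ndarray) -> Tuple[float, float]:
    """MAE and RMSE over unique atom pairs (strict upper triangle).

    Assumption: errors are measured on pairwise distances, which makes the
    metric invariant to rotating or translating the predicted geometry.
    """
    rows, cols = np.triu_indices(true.shape[0], k=1)
    err = pred[rows, cols] - true[rows, cols]
    mae = float(np.abs(err).mean())
    rmse = float(np.sqrt((err ** 2).mean()))
    return mae, rmse


# Toy 3-atom example (made-up coordinates, not benchmark data).
true_coords = np.array([[0.0, 0.0, 0.0], [1.0, 0.0, 0.0], [0.0, 1.0, 0.0]])
pred_coords = np.array([[0.0, 0.0, 0.0], [1.1, 0.0, 0.0], [0.0, 0.9, 0.0]])

true_dist = pairwise_distances(true_coords)
# A coordinate-predicting model is scored by first converting to distances;
# a distance-predicting model would supply its matrix directly.
pred_dist = pairwise_distances(pred_coords)

mae, rmse = distance_errors(pred_dist, true_dist)
print(f"MAE={mae:.4f}  RMSE={rmse:.4f}")
```

Scoring on distances rather than raw coordinates means a prediction that is merely rotated or shifted relative to the reference geometry incurs no error, which is one plausible reason to evaluate both baselines in distance space.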

Results

Task                           Dataset          Model                          MAE    RMSE   Validity  Validity3D
Molecular Property Prediction  Molecule3D val   DeeperGCN-DAGNN + Distance     0.482  0.749  1.71      0.02
Molecular Property Prediction  Molecule3D val   DeeperGCN-DAGNN + Coordinates  0.509  0.849  100       100
Molecular Property Prediction  Molecule3D test  DeeperGCN-DAGNN + Distance     0.483  0.753  1.69      0.03
Molecular Property Prediction  Molecule3D test  DeeperGCN-DAGNN + Coordinates  0.571  0.961  100       100
Atomistic Description          Molecule3D val   DeeperGCN-DAGNN + Distance     0.482  0.749  1.71      0.02
Atomistic Description          Molecule3D val   DeeperGCN-DAGNN + Coordinates  0.509  0.849  100       100
Atomistic Description          Molecule3D test  DeeperGCN-DAGNN + Distance     0.483  0.753  1.69      0.03
Atomistic Description          Molecule3D test  DeeperGCN-DAGNN + Coordinates  0.571  0.961  100       100

Related Papers

2025-07-07  Acquiring and Adapting Priors for Novel Tasks via Neural Meta-Architectures
2025-07-05  Combining Graph Neural Networks and Mixed Integer Linear Programming for Molecular Inference under the Two-Layered Model
2025-06-26  TRIDENT: Tri-Modal Molecular Representation Learning with Taxonomic Annotations and Local Correspondence
2025-06-18  Descriptor-based Foundation Models for Molecular Property Prediction
2025-06-13  Robust Molecular Property Prediction via Densifying Scarce Labeled Data
2025-06-10  BioLangFusion: Multimodal Fusion of DNA, mRNA, and Protein Language Models
2025-06-09  The Catechol Benchmark: Time-series Solvent Selection Data for Few-shot Machine Learning
2025-06-07  Graph Neural Networks in Modern AI-aided Drug Discovery