Papers With Code 2

A community resource for machine learning research: papers, code, benchmarks, and state-of-the-art results.


Data sourced from the PWC Archive (CC-BY-SA 4.0). Built by the community, for the community.


ChemRL-GEM: Geometry Enhanced Molecular Representation Learning for Property Prediction

Xiaomin Fang, Lihang Liu, Jieqiong Lei, Donglong He, Shanzhuo Zhang, Jingbo Zhou, Fan Wang, Hua Wu, Haifeng Wang

2021-06-11 · Molecular Property Prediction · Representation Learning · regression · Self-Supervised Learning

Abstract

Effective molecular representation learning is essential for molecular property prediction, a fundamental task in the drug and materials industries. Graph neural networks (GNNs) have recently shown great promise for molecular representation learning, and several recent studies have successfully applied self-supervised learning to pre-train GNNs and thereby mitigate the shortage of labeled molecules. However, existing GNNs and pre-training strategies usually treat molecules as topological graph data without fully utilizing molecular geometry information, even though the three-dimensional (3D) spatial structure of a molecule, a.k.a. molecular geometry, is one of the most critical factors in determining its physical, chemical, and biological properties. To this end, we propose a novel Geometry Enhanced Molecular representation learning method (GEM) for Chemical Representation Learning (ChemRL). First, we design a geometry-based GNN architecture that simultaneously models atoms, bonds, and bond angles in a molecule. Specifically, we devise double graphs for a molecule: the first encodes the atom-bond relations; the second encodes the bond-angle relations. Moreover, on top of this GNN architecture, we propose several novel geometry-level self-supervised learning strategies that learn spatial knowledge by utilizing local and global molecular 3D structures. We compare ChemRL-GEM with various state-of-the-art (SOTA) baselines on different molecular benchmarks and show that ChemRL-GEM significantly outperforms all baselines in both regression and classification tasks. For example, the experimental results show an overall improvement of 8.8% on average over SOTA baselines on the regression tasks, demonstrating the superiority of the proposed method.
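
The double-graph idea in the abstract can be illustrated with a minimal sketch. This is not the authors' implementation: the function name `build_double_graph`, the plain-tuple graph format, and the water-like coordinates are all assumptions made for illustration. The first graph has atoms as nodes and bonds as edges (here carrying bond length); the second has bonds as nodes and an edge for every pair of bonds that share an atom (here carrying the bond angle).

```python
import math
from itertools import combinations


def build_double_graph(coords, bonds):
    """Hypothetical helper: coords is a list of (x, y, z) per atom,
    bonds is a list of (i, j) atom-index pairs."""
    # Graph 1 (atom-bond): nodes are atoms, edges are bonds.
    # Edge feature: bond length.
    atom_bond_edges = [(i, j, math.dist(coords[i], coords[j])) for i, j in bonds]

    # Graph 2 (bond-angle): nodes are bonds, edges join two bonds that
    # share exactly one atom. Edge feature: angle between them in degrees.
    bond_angle_edges = []
    for (a, (i, j)), (b, (k, l)) in combinations(enumerate(bonds), 2):
        shared = {i, j} & {k, l}
        if len(shared) != 1:
            continue
        c = shared.pop()            # the atom at the vertex of the angle
        p = j if i == c else i      # other end of bond a
        q = l if k == c else k      # other end of bond b
        u = [coords[p][d] - coords[c][d] for d in range(3)]
        v = [coords[q][d] - coords[c][d] for d in range(3)]
        cos = sum(x * y for x, y in zip(u, v)) / (math.hypot(*u) * math.hypot(*v))
        cos = max(-1.0, min(1.0, cos))  # guard against rounding error
        bond_angle_edges.append((a, b, math.degrees(math.acos(cos))))
    return atom_bond_edges, bond_angle_edges


# Illustrative water-like geometry (assumed values): O at the origin,
# two H atoms 0.96 units away, separated by 104.5 degrees.
theta = math.radians(104.5)
coords = [(0.0, 0.0, 0.0), (0.96, 0.0, 0.0),
          (0.96 * math.cos(theta), 0.96 * math.sin(theta), 0.0)]
atom_bond, bond_angle = build_double_graph(coords, [(0, 1), (0, 2)])
```

In a real pipeline the 3D coordinates would come from a conformer generator or quantum-chemistry data, and both graphs would feed a GNN; the sketch only shows how the two graph structures relate to each other.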

Results

| Task | Dataset | Metric | Value | Model |
|---|---|---|---|---|
| Molecular Property Prediction | FreeSolv | RMSE | 1.877 | ChemRL-GEM |
| Molecular Property Prediction | clintox | Molecules (M) | 20 | ChemRL-GEM |
| Molecular Property Prediction | clintox | ROC-AUC | 90.1 | ChemRL-GEM |
| Molecular Property Prediction | ToxCast | ROC-AUC | 69.2 | ChemRL-GEM |
| Molecular Property Prediction | Lipophilicity | RMSE | 0.66 | ChemRL-GEM |
| Molecular Property Prediction | QM7 | MAE | 58.9 | ChemRL-GEM |
| Molecular Property Prediction | BBBP | ROC-AUC | 72.4 | ChemRL-GEM |
| Molecular Property Prediction | QM9 | MAE | 0.00746 | ChemRL-GEM |
| Molecular Property Prediction | QM8 | MAE | 0.0171 | ChemRL-GEM |
| Molecular Property Prediction | SIDER | ROC-AUC | 67.2 | ChemRL-GEM |
| Molecular Property Prediction | Tox21 | ROC-AUC | 78.1 | ChemRL-GEM |
| Molecular Property Prediction | BACE | ROC-AUC | 85.6 | ChemRL-GEM |
| Molecular Property Prediction | ESOL | RMSE | 0.798 | ChemRL-GEM |
| Atomistic Description | FreeSolv | RMSE | 1.877 | ChemRL-GEM |
| Atomistic Description | clintox | Molecules (M) | 20 | ChemRL-GEM |
| Atomistic Description | clintox | ROC-AUC | 90.1 | ChemRL-GEM |
| Atomistic Description | ToxCast | ROC-AUC | 69.2 | ChemRL-GEM |
| Atomistic Description | Lipophilicity | RMSE | 0.66 | ChemRL-GEM |
| Atomistic Description | QM7 | MAE | 58.9 | ChemRL-GEM |
| Atomistic Description | BBBP | ROC-AUC | 72.4 | ChemRL-GEM |
| Atomistic Description | QM9 | MAE | 0.00746 | ChemRL-GEM |
| Atomistic Description | QM8 | MAE | 0.0171 | ChemRL-GEM |
| Atomistic Description | SIDER | ROC-AUC | 67.2 | ChemRL-GEM |
| Atomistic Description | Tox21 | ROC-AUC | 78.1 | ChemRL-GEM |
| Atomistic Description | BACE | ROC-AUC | 85.6 | ChemRL-GEM |
| Atomistic Description | ESOL | RMSE | 0.798 | ChemRL-GEM |
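
For readers unfamiliar with the metric names in the results, the sketch below shows the standard textbook definitions of RMSE, MAE, and ROC-AUC in plain Python. The function names and toy inputs are assumptions for illustration, not the evaluation code behind the reported numbers. The ROC-AUC values listed appear to be on a 0 to 100 scale, whereas this function returns a fraction between 0 and 1.

```python
import math


def rmse(y_true, y_pred):
    # Root mean squared error: lower is better (regression).
    return math.sqrt(sum((t - p) ** 2 for t, p in zip(y_true, y_pred)) / len(y_true))


def mae(y_true, y_pred):
    # Mean absolute error: lower is better (regression).
    return sum(abs(t - p) for t, p in zip(y_true, y_pred)) / len(y_true)


def roc_auc(labels, scores):
    # Probability that a random positive is scored above a random negative,
    # counting ties as half: higher is better (binary classification).
    pos = [s for l, s in zip(labels, scores) if l == 1]
    neg = [s for l, s in zip(labels, scores) if l == 0]
    wins = sum(1.0 if p > n else 0.5 if p == n else 0.0 for p in pos for n in neg)
    return wins / (len(pos) * len(neg))


# Toy inputs (made up for illustration).
toy_rmse = rmse([1.0, 2.0, 3.0], [1.0, 2.0, 5.0])
toy_mae = mae([1.0, 2.0, 3.0], [1.0, 2.0, 5.0])
toy_auc = roc_auc([0, 0, 1, 1], [0.1, 0.4, 0.35, 0.8])
```

The pairwise ROC-AUC above is quadratic in the number of samples, which is fine for a sketch; library implementations use a sort-based computation instead.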

Related Papers

Touch in the Wild: Learning Fine-Grained Manipulation with a Portable Visuo-Tactile Gripper (2025-07-20)
Language Integration in Fine-Tuning Multimodal Large Language Models for Image-Based Regression (2025-07-20)
Spectral Bellman Method: Unifying Representation and Exploration in RL (2025-07-17)
Boosting Team Modeling through Tempo-Relational Representation Learning (2025-07-17)
A Semi-Supervised Learning Method for the Identification of Bad Exposures in Large Imaging Surveys (2025-07-17)
Similarity-Guided Diffusion for Contrastive Sequential Recommendation (2025-07-16)
Are encoders able to learn landmarkers for warm-starting of Hyperparameter Optimization? (2025-07-16)
Language-Guided Contrastive Audio-Visual Masked Autoencoder with Automatically Generated Audio-Visual-Text Triplets from Videos (2025-07-16)