
Data sourced from the PWC Archive (CC-BY-SA 4.0). Built by the community, for the community.


GIT-Mol: A Multi-modal Large Language Model for Molecular Science with Graph, Image, and Text

PengFei Liu, Yiming Ren, Jun Tao, Zhixiang Ren

Date: 2023-08-14
Tasks: Drug Discovery, Image Captioning, Large Language Model, Text-based de novo Molecule Generation, Language Modelling, Molecule Captioning

Abstract

Large language models have made significant strides in natural language processing, enabling innovative applications in molecular science by processing textual representations of molecules. However, most existing language models cannot capture the rich information in complex molecular structures or images. In this paper, we introduce GIT-Mol, a multi-modal large language model that integrates Graph, Image, and Text information. To facilitate the integration of multi-modal molecular data, we propose GIT-Former, a novel architecture capable of aligning all modalities into a unified latent space. We achieve a 5%-10% accuracy increase in property prediction and a 20.2% boost in molecule generation validity compared to the baselines. With the any-to-language molecular translation strategy, our model has the potential to perform more downstream tasks, such as compound name recognition and chemical reaction prediction.

Results

Task                                    Dataset   Metric       Value   Model
Drug Discovery                          ClinTox   AUC          0.883   GIT-Mol(G+S)
Drug Discovery                          BACE      AUC          0.8108  GIT-Mol(G+S)
Drug Discovery                          Tox21     AUC          0.759   GIT-Mol(G+S)
Drug Discovery                          BBBP      AUC          0.739   GIT-Mol(G+S)
Drug Discovery                          ToxCast   AUC          0.668   GIT-Mol(G+S)
Drug Discovery                          SIDER     AUC          0.634   GIT-Mol(G+S)
Drug Discovery                          ChEBI-20  BLEU         75.6    GIT-Mol-caption
Drug Discovery                          ChEBI-20  Exact Match  5.1     GIT-Mol-caption
Drug Discovery                          ChEBI-20  Levenshtein  26.315  GIT-Mol-caption
Drug Discovery                          ChEBI-20  MACCS FTS    73.8    GIT-Mol-caption
Drug Discovery                          ChEBI-20  Morgan FTS   51.9    GIT-Mol-caption
Drug Discovery                          ChEBI-20  RDK FTS      58.2    GIT-Mol-caption
Drug Discovery                          ChEBI-20  Validity     92.8    GIT-Mol-caption
Image Captioning                        ChEBI-20  BLEU         0.924   GIT-Mol
Image Captioning                        ChEBI-20  Exact        0.461   GIT-Mol
Image Captioning                        ChEBI-20  Levenshtein  6.575   GIT-Mol
Image Captioning                        ChEBI-20  MACCS FTS    0.962   GIT-Mol
Image Captioning                        ChEBI-20  Morgan FTS   0.894   GIT-Mol
Image Captioning                        ChEBI-20  RDK FTS      0.906   GIT-Mol
Image Captioning                        ChEBI-20  Validity     0.899   GIT-Mol
Text-based de novo Molecule Generation  ChEBI-20  BLEU         75.6    GIT-Mol-caption
Text-based de novo Molecule Generation  ChEBI-20  Exact Match  5.1     GIT-Mol-caption
Text-based de novo Molecule Generation  ChEBI-20  Levenshtein  26.315  GIT-Mol-caption
Text-based de novo Molecule Generation  ChEBI-20  MACCS FTS    73.8    GIT-Mol-caption
Text-based de novo Molecule Generation  ChEBI-20  Morgan FTS   51.9    GIT-Mol-caption
Text-based de novo Molecule Generation  ChEBI-20  RDK FTS      58.2    GIT-Mol-caption
Text-based de novo Molecule Generation  ChEBI-20  Validity     92.8    GIT-Mol-caption
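
For readers unfamiliar with the string-level metrics in the table, the following is a minimal sketch of how an exact-match rate and a mean Levenshtein (edit) distance could be computed between predicted and reference SMILES strings. It is not the authors' evaluation code: the function names `levenshtein` and `string_metrics` and the sample strings are hypothetical, and the sketch compares raw strings, whereas a real evaluation may canonicalize SMILES first (an assumption not confirmed by this page). The fingerprint-based scores (MACCS, Morgan, RDK FTS) and validity need a cheminformatics toolkit and are left out.

```python
from typing import Dict, Sequence


def levenshtein(a: str, b: str) -> int:
    """Edit distance: minimum insertions, deletions, and substitutions to turn a into b."""
    if len(a) < len(b):
        a, b = b, a  # iterate over the longer string so each stored row is short
    previous = list(range(len(b) + 1))  # distances from the empty prefix of a
    for i, ch_a in enumerate(a, start=1):
        current = [i]
        for j, ch_b in enumerate(b, start=1):
            cost = 0 if ch_a == ch_b else 1
            current.append(min(
                previous[j] + 1,         # delete ch_a
                current[j - 1] + 1,      # insert ch_b
                previous[j - 1] + cost,  # substitute (or keep on a match)
            ))
        previous = current
    return previous[-1]


def string_metrics(predictions: Sequence[str], references: Sequence[str]) -> Dict[str, float]:
    """Exact-match rate (as a fraction) and mean edit distance over paired strings."""
    if not predictions or len(predictions) != len(references):
        raise ValueError("predictions and references must be non-empty and equal in length")
    pairs = list(zip(predictions, references))
    n = len(pairs)
    return {
        "exact_match": sum(p == r for p, r in pairs) / n,
        "levenshtein": sum(levenshtein(p, r) for p, r in pairs) / n,
    }


if __name__ == "__main__":
    # Hypothetical SMILES strings: one exact match, then edit distances of 1 and 1.
    preds = ["CCO", "c1ccccc1", "CC(=O)O"]
    refs = ["CCO", "c1ccccc1O", "CC(=O)N"]
    print(string_metrics(preds, refs))
```

The sketch returns exact match as a fraction between 0 and 1. The table reports some rows on a 0-1 scale and others on what appears to be a 0-100 scale, so values should be rescaled before comparing rows. Lower Levenshtein values indicate closer outputs; higher is better for the other metrics.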

Related Papers

Visual-Language Model Knowledge Distillation Method for Image Quality Assessment (2025-07-21)
DENSE: Longitudinal Progress Note Generation with Temporal Modeling of Heterogeneous Clinical Notes Across Hospital Visits (2025-07-18)
GeoReg: Weight-Constrained Few-Shot Regression for Socio-Economic Estimation using LLM (2025-07-17)
The Generative Energy Arena (GEA): Incorporating Energy Awareness in Large Language Model (LLM) Human Evaluations (2025-07-17)
Inverse Reinforcement Learning Meets Large Language Model Post-Training: Basics, Advances, and Opportunities (2025-07-17)
Rethinking the Embodied Gap in Vision-and-Language Navigation: A Holistic Study of Physical and Visual Disparities (2025-07-17)
Making Language Model a Hierarchical Classifier and Generator (2025-07-17)
VisionThink: Smart and Efficient Vision Language Model via Reinforcement Learning (2025-07-17)