Papers With Code 2

A community resource for machine learning research: papers, code, benchmarks, and state-of-the-art results.


Data sourced from the PWC Archive (CC-BY-SA 4.0). Built by the community, for the community.


MolCA: Molecular Graph-Language Modeling with Cross-Modal Projector and Uni-Modal Adapter

Zhiyuan Liu, Sihang Li, Yanchen Luo, Hao Fei, Yixin Cao, Kenji Kawaguchi, Xiang Wang, Tat-Seng Chua

2023-10-19 · Text Generation · IUPAC Name Prediction · Text Retrieval · Contrastive Learning · Retrieval · Language Modelling · Molecule Captioning

Paper · PDF · Code (official)

Abstract

Language Models (LMs) have demonstrated impressive molecule understanding ability on various 1D text-related tasks. However, they inherently lack 2D graph perception - a critical ability of human professionals in comprehending molecules' topological structures. To bridge this gap, we propose MolCA: Molecular Graph-Language Modeling with Cross-Modal Projector and Uni-Modal Adapter. MolCA enables an LM (e.g., Galactica) to understand both text- and graph-based molecular contents via the cross-modal projector. Specifically, the cross-modal projector is implemented as a Q-Former to connect a graph encoder's representation space and an LM's text space. Further, MolCA employs a uni-modal adapter (i.e., LoRA) for the LM's efficient adaptation to downstream tasks. Unlike previous studies that couple an LM with a graph encoder via cross-modal contrastive learning, MolCA retains the LM's ability of open-ended text generation and augments it with 2D graph information. To showcase its effectiveness, we extensively benchmark MolCA on tasks of molecule captioning, IUPAC name prediction, and molecule-text retrieval, on which MolCA significantly outperforms the baselines. Our codes and checkpoints can be found at https://github.com/acharkq/MolCA.
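The two components named in the abstract can be sketched minimally: a Q-Former-style cross-modal projector, in which a fixed set of learnable query tokens cross-attends to graph-node embeddings and maps the result into the LM's text embedding space, and a LoRA adapter that adds a trainable low-rank update to a frozen LM weight. The sketch below is a toy NumPy illustration of those two ideas only — single-head attention, all dimensions, variable names, and initializations are illustrative assumptions, not the paper's actual implementation (which uses a full Q-Former and Galactica).

```python
import numpy as np

rng = np.random.default_rng(0)

def softmax(x, axis=-1):
    e = np.exp(x - x.max(axis=axis, keepdims=True))
    return e / e.sum(axis=axis, keepdims=True)

# --- Cross-modal projector (Q-Former-style, single-head toy version) ---
# Learnable query tokens attend over graph-node embeddings and emit a
# fixed number of "soft tokens" in the LM's text embedding space.
n_queries, d_graph, d_lm = 8, 32, 64                 # illustrative sizes
queries = rng.normal(size=(n_queries, d_graph))      # trainable query tokens
W_q = rng.normal(size=(d_graph, d_graph)) / np.sqrt(d_graph)
W_k = rng.normal(size=(d_graph, d_graph)) / np.sqrt(d_graph)
W_v = rng.normal(size=(d_graph, d_graph)) / np.sqrt(d_graph)
W_out = rng.normal(size=(d_graph, d_lm)) / np.sqrt(d_graph)

def project_graph(node_embs):
    """node_embs: (n_nodes, d_graph) output of a 2D graph encoder."""
    q, k, v = queries @ W_q, node_embs @ W_k, node_embs @ W_v
    attn = softmax(q @ k.T / np.sqrt(d_graph))       # (n_queries, n_nodes)
    return (attn @ v) @ W_out                        # (n_queries, d_lm)

# --- Uni-modal adapter (LoRA) on a frozen LM weight ---
# Effective weight: W + (alpha / r) * B @ A, with only A and B trained.
r, alpha = 4, 8
W_frozen = rng.normal(size=(d_lm, d_lm)) / np.sqrt(d_lm)
A = rng.normal(size=(r, d_lm)) / np.sqrt(d_lm)       # trainable down-projection
B = np.zeros((d_lm, r))                              # trainable, zero-initialized

def lora_forward(x):
    return x @ W_frozen.T + (alpha / r) * (x @ A.T @ B.T)

# A molecule's node embeddings become a fixed-length prefix of graph tokens
# that can be prepended to the LM's text-token embeddings.
graph_tokens = project_graph(rng.normal(size=(20, d_graph)))
print(graph_tokens.shape)    # (8, 64): n_queries soft tokens in LM space
```

The key property illustrated: however many nodes the molecule has, the projector always yields `n_queries` tokens, so the LM sees the graph as an ordinary token prefix; and with `B` zero-initialized, the LoRA branch starts as a no-op and only departs from the frozen LM as training proceeds.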

Results

| Task                | Dataset  | Metric  | Value | Model            |
|---------------------|----------|---------|-------|------------------|
| Molecule Captioning | ChEBI-20 | BLEU-2  | 62    | MolCA, Galac1.3B |
| Molecule Captioning | ChEBI-20 | BLEU-4  | 53.1  | MolCA, Galac1.3B |
| Molecule Captioning | ChEBI-20 | METEOR  | 65.1  | MolCA, Galac1.3B |
| Molecule Captioning | ChEBI-20 | ROUGE-1 | 68.1  | MolCA, Galac1.3B |
| Molecule Captioning | ChEBI-20 | ROUGE-2 | 53.7  | MolCA, Galac1.3B |
| Molecule Captioning | ChEBI-20 | ROUGE-L | 61.8  | MolCA, Galac1.3B |
| Molecule Captioning | ChEBI-20 | BLEU-2  | 61.6  | MolCA, Galac125M |
| Molecule Captioning | ChEBI-20 | BLEU-4  | 52.9  | MolCA, Galac125M |
| Molecule Captioning | ChEBI-20 | METEOR  | 63.9  | MolCA, Galac125M |
| Molecule Captioning | ChEBI-20 | ROUGE-1 | 67.4  | MolCA, Galac125M |
| Molecule Captioning | ChEBI-20 | ROUGE-2 | 53.3  | MolCA, Galac125M |
| Molecule Captioning | ChEBI-20 | ROUGE-L | 61.5  | MolCA, Galac125M |

Related Papers

- Visual-Language Model Knowledge Distillation Method for Image Quality Assessment (2025-07-21)
- Making Language Model a Hierarchical Classifier and Generator (2025-07-17)
- SemCSE: Semantic Contrastive Sentence Embeddings Using LLM-Generated Summaries For Scientific Abstracts (2025-07-17)
- HapticCap: A Multimodal Dataset and Task for Understanding User Experience of Vibration Haptic Signals (2025-07-17)
- Overview of the TalentCLEF 2025: Skill and Job Title Intelligence for Human Capital Management (2025-07-17)
- SGCL: Unifying Self-Supervised and Supervised Learning for Graph Recommendation (2025-07-17)
- From Roots to Rewards: Dynamic Tree Reasoning with RL (2025-07-17)
- A Survey of Context Engineering for Large Language Models (2025-07-17)