Papers With Code 2

A community resource for machine learning research: papers, code, benchmarks, and state-of-the-art results.


Data sourced from the PWC Archive (CC-BY-SA 4.0). Built by the community, for the community.


BioMedGPT: Open Multimodal Generative Pre-trained Transformer for BioMedicine

2023-08-18 · Question Answering · Few-Shot Learning · Zero-Shot Learning · Language Modelling · Multiple Choice Question Answering (MCQA)
Paper · PDF · Code (official)

Abstract

Foundation models (FMs) have exhibited remarkable performance across a wide range of downstream tasks in many domains. Nevertheless, general-purpose FMs often face challenges when confronted with domain-specific problems, due to their limited access to the proprietary training data in a particular domain. In biomedicine, there are various biological modalities, such as molecules, proteins, and cells, which are encoded by the language of life and exhibit significant modality gaps with human natural language. In this paper, we introduce BioMedGPT, an open multimodal generative pre-trained transformer (GPT) for biomedicine, to bridge the gap between the language of life and human natural language. BioMedGPT allows users to easily "communicate" with diverse biological modalities through free text, and is the first of its kind. BioMedGPT aligns different biological modalities with natural language via a large generative language model, namely, BioMedGPT-LM. We publish BioMedGPT-10B, which unifies the feature spaces of molecules, proteins, and natural language via encoding and alignment. Through fine-tuning, BioMedGPT-10B outperforms or is on par with human performance and with significantly larger general-purpose foundation models on the biomedical QA task. It also demonstrates promising performance on the molecule QA and protein QA tasks, which could greatly accelerate the discovery of new drugs and therapeutic targets. In addition, BioMedGPT-LM-7B is the first large generative language model in the biomedical domain based on Llama2, and is therefore commercially friendly. Both BioMedGPT-10B and BioMedGPT-LM-7B are open-sourced to the research community. We also publish the datasets that were meticulously curated for multi-modality alignment, i.e., PubChemQA and UniProtQA. All models, code, and datasets are available at https://github.com/PharMolix/OpenBioMed.

Results

Task | Dataset | Metric | Value | Model
Few-Shot Learning | MedConceptsQA | Accuracy | 24.924 | PharMolix/BioMedGPT-LM-7B
Zero-Shot Learning | MedConceptsQA | Accuracy | 24.747 | PharMolix/BioMedGPT-LM-7B
Question Answering | PubMedQA | Accuracy | 76.1 | BioMedGPT-10B
Question Answering | MedQA | Accuracy | 50.4 | BioMedGPT-10B
Question Answering | UniProtQA | BLEU-2 | 0.571 | BioMedGPT-10B
Question Answering | UniProtQA | BLEU-4 | 0.535 | BioMedGPT-10B
Question Answering | UniProtQA | METEOR | 0.754 | BioMedGPT-10B
Question Answering | UniProtQA | ROUGE-1 | 0.743 | BioMedGPT-10B
Question Answering | UniProtQA | ROUGE-2 | 0.759 | BioMedGPT-10B
Question Answering | UniProtQA | ROUGE-L | 0.622 | BioMedGPT-10B
Question Answering | PubChemQA | BLEU-2 | 0.234 | BioMedGPT-10B
Question Answering | PubChemQA | BLEU-4 | 0.141 | BioMedGPT-10B
Question Answering | PubChemQA | METEOR | 0.308 | BioMedGPT-10B
Question Answering | PubChemQA | ROUGE-1 | 0.386 | BioMedGPT-10B
Question Answering | PubChemQA | ROUGE-2 | 0.206 | BioMedGPT-10B
Question Answering | PubChemQA | ROUGE-L | 0.332 | BioMedGPT-10B
Question Answering | MMLU (Professional Medicine) | Accuracy | 51.1 | BioMedGPT-LM-7B
Question Answering | MedMCQA | Test Set (Acc-%) | 0.514 | BioMedGPT-10B
Meta-Learning | MedConceptsQA | Accuracy | 24.924 | PharMolix/BioMedGPT-LM-7B
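The UniProtQA and PubChemQA rows report BLEU-n and ROUGE-L text-overlap scores between generated and reference answers. As a minimal, self-contained sketch of how these metrics work (a pure-Python reimplementation, not the paper's actual evaluation code; the example sentences below are invented):

```python
from collections import Counter
from math import exp, log

def ngram_precision(cand, ref, n):
    """Clipped n-gram precision: overlapping n-grams / candidate n-grams."""
    cand_ngrams = Counter(tuple(cand[i:i + n]) for i in range(len(cand) - n + 1))
    ref_ngrams = Counter(tuple(ref[i:i + n]) for i in range(len(ref) - n + 1))
    overlap = sum(min(c, ref_ngrams[g]) for g, c in cand_ngrams.items())
    return overlap / max(sum(cand_ngrams.values()), 1)

def bleu(cand, ref, max_n):
    """Sentence-level BLEU-n: geometric mean of 1..n-gram precisions
    times a brevity penalty for short candidates."""
    precisions = [ngram_precision(cand, ref, n) for n in range(1, max_n + 1)]
    if min(precisions) == 0:
        return 0.0
    bp = 1.0 if len(cand) > len(ref) else exp(1 - len(ref) / max(len(cand), 1))
    return bp * exp(sum(log(p) for p in precisions) / max_n)

def rouge_l(cand, ref):
    """ROUGE-L F1 based on the longest common subsequence (LCS)."""
    m, n = len(cand), len(ref)
    dp = [[0] * (n + 1) for _ in range(m + 1)]  # dp[i][j] = LCS of prefixes
    for i in range(m):
        for j in range(n):
            dp[i + 1][j + 1] = dp[i][j] + 1 if cand[i] == ref[j] \
                else max(dp[i][j + 1], dp[i + 1][j])
    lcs = dp[m][n]
    if lcs == 0:
        return 0.0
    p, r = lcs / m, lcs / n
    return 2 * p * r / (p + r)

ref = "the protein catalyzes atp hydrolysis".split()
cand = "the protein catalyzes hydrolysis of atp".split()
print(round(bleu(cand, ref, 2), 3))   # 0.577
print(round(rouge_l(cand, ref), 3))   # 0.727
```

Production evaluations typically use library implementations (e.g. NLTK or Hugging Face `evaluate`), which add smoothing and multi-reference support omitted here.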

Related Papers

Visual-Language Model Knowledge Distillation Method for Image Quality Assessment (2025-07-21)
From Roots to Rewards: Dynamic Tree Reasoning with RL (2025-07-17)
Enter the Mind Palace: Reasoning and Planning for Long-term Active Embodied Question Answering (2025-07-17)
Vision-and-Language Training Helps Deploy Taxonomic Knowledge but Does Not Fundamentally Alter It (2025-07-17)
City-VLM: Towards Multidomain Perception Scene Understanding via Multimodal Incomplete Learning (2025-07-17)
GLAD: Generalizable Tuning for Vision-Language Models (2025-07-17)
Making Language Model a Hierarchical Classifier and Generator (2025-07-17)
VisionThink: Smart and Efficient Vision Language Model via Reinforcement Learning (2025-07-17)