Papers With Code 2

A community resource for machine learning research: papers, code, benchmarks, and state-of-the-art results.


Data sourced from the PWC Archive (CC-BY-SA 4.0). Built by the community, for the community.


Domain-Specific Language Model Pretraining for Biomedical Natural Language Processing

Yu Gu, Robert Tinn, Hao Cheng, Michael Lucas, Naoto Usuyama, Xiaodong Liu, Tristan Naumann, Jianfeng Gao, Hoifung Poon

Published: 2020-07-31

Tasks: Text Classification, Participant Intervention Comparison Outcome Extraction, Question Answering, Relation Extraction, Sentence Similarity, Named Entity Recognition (NER), Continual Pretraining, Document Classification, Drug–Drug Interaction Extraction, Language Modelling

Abstract

Pretraining large neural language models, such as BERT, has led to impressive gains on many natural language processing (NLP) tasks. However, most pretraining efforts focus on general domain corpora, such as newswire and Web. A prevailing assumption is that even domain-specific pretraining can benefit by starting from general-domain language models. In this paper, we challenge this assumption by showing that for domains with abundant unlabeled text, such as biomedicine, pretraining language models from scratch results in substantial gains over continual pretraining of general-domain language models. To facilitate this investigation, we compile a comprehensive biomedical NLP benchmark from publicly-available datasets. Our experiments show that domain-specific pretraining serves as a solid foundation for a wide range of biomedical NLP tasks, leading to new state-of-the-art results across the board. Further, in conducting a thorough evaluation of modeling choices, both for pretraining and task-specific fine-tuning, we discover that some common practices are unnecessary with BERT models, such as using complex tagging schemes in named entity recognition (NER). To help accelerate research in biomedical NLP, we have released our state-of-the-art pretrained and task-specific models for the community, and created a leaderboard featuring our BLURB benchmark (short for Biomedical Language Understanding & Reasoning Benchmark) at https://aka.ms/BLURB.
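One of the abstract's modeling-choice findings is that complex NER tagging schemes (such as BIO) are unnecessary with BERT models, and a simpler IO scheme suffices. A minimal sketch of what that simplification means in practice (function and tag names are illustrative, not taken from the paper's released code):

```python
def bio_to_io(tags):
    """Collapse BIO tags (B-X / I-X / O) to the simpler IO scheme (I-X / O).

    BIO distinguishes the Beginning of an entity from Inside tokens; the IO
    scheme drops that distinction and only marks whether a token is inside
    an entity of a given type.
    """
    return ["I-" + t[2:] if t[0] in ("B", "I") else "O" for t in tags]


# Toy token-level tag sequence for a biomedical sentence.
bio = ["B-Disease", "I-Disease", "O", "B-Chemical"]
print(bio_to_io(bio))  # ['I-Disease', 'I-Disease', 'O', 'I-Chemical']
```

Under IO, adjacent entities of the same type can no longer be separated, but the paper reports that this rarely hurts in practice with BERT-style models.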

Results

Task | Dataset | Metric | Value | Model
---- | ------- | ------ | ----- | -----
Relation Extraction | GAD | Micro F1 | 82.34 | PubMedBERT uncased
Relation Extraction | DDI | Micro F1 | 82.36 | PubMedBERT uncased
Relation Extraction | ChemProt | Micro F1 | 77.24 | PubMedBERT uncased
Question Answering | BLURB | Accuracy | 71.7 | PubMedBERT (uncased; abstracts)
Question Answering | PubMedQA | Accuracy | 55.84 | PubMedBERT uncased
Question Answering | BioASQ | Accuracy | 87.56 | PubMedBERT uncased
Information Extraction | DDI extraction 2013 corpus | F1 | 0.8236 | PubMedBERT
Information Extraction | DDI extraction 2013 corpus | Micro F1 | 82.36 | PubMedBERT
Information Extraction | EBM-NLP | F1 | 73.38 | PubMedBERT uncased
Named Entity Recognition (NER) | NCBI-disease | F1 | 87.82 | PubMedBERT uncased
Named Entity Recognition (NER) | BC2GM | F1 | 84.52 | PubMedBERT uncased
Named Entity Recognition (NER) | JNLPBA | F1 | 79.1 | PubMedBERT uncased
Text Classification | BLURB | F1 | 82.32 | PubMedBERT (uncased; abstracts)
Text Classification | HOC | Micro F1 | 82.32 | PubMedBERT uncased
Participant Intervention Comparison Outcome Extraction | EBM-NLP | F1 | 73.38 | PubMedBERT uncased
Document Classification | HOC | Micro F1 | 82.32 | PubMedBERT uncased
Biomedical Information Retrieval | EBM PICO | Macro F1 (word level) | 73.38 | PubMedBERT uncased
Classification | BLURB | F1 | 82.32 | PubMedBERT (uncased; abstracts)
Classification | HOC | Micro F1 | 82.32 | PubMedBERT uncased
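Several of the scores above are micro-averaged F1, which pools true positives, false positives, and false negatives across all examples before computing precision and recall (rather than averaging per-class F1 scores). A minimal sketch of that aggregation over set-valued predictions, as an illustration only, not the paper's evaluation code:

```python
def micro_f1(gold_sets, pred_sets):
    """Micro-averaged F1 over per-example sets of predicted items
    (e.g. relation tuples or entity spans).

    Counts are pooled globally across all examples, so frequent
    classes dominate the score.
    """
    tp = fp = fn = 0
    for gold, pred in zip(gold_sets, pred_sets):
        tp += len(gold & pred)   # correctly predicted items
        fp += len(pred - gold)   # spurious predictions
        fn += len(gold - pred)   # missed gold items
    precision = tp / (tp + fp) if tp + fp else 0.0
    recall = tp / (tp + fn) if tp + fn else 0.0
    if precision + recall == 0:
        return 0.0
    return 2 * precision * recall / (precision + recall)


# Toy example: two documents with gold vs. predicted relation tuples.
gold = [{("drugA", "drugB")}, {("drugC", "drugD"), ("drugE", "drugF")}]
pred = [{("drugA", "drugB")}, {("drugC", "drugD")}]
print(round(micro_f1(gold, pred), 2))  # → 0.8
```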
