Papers With Code

A community resource for machine learning research: papers, code, benchmarks, and state-of-the-art results.


Data sourced from the PWC Archive (CC-BY-SA 4.0). Built by the community, for the community.


On the Effectiveness of Compact Biomedical Transformers

Omid Rohanian, Mohammadmahdi Nouriborji, Samaneh Kouchaki, David A. Clifton

2022-09-07 · Continual Learning · Knowledge Distillation · Named Entity Recognition (NER) · Language Modelling

Abstract

Language models pre-trained on biomedical corpora, such as BioBERT, have recently shown promising results on downstream biomedical tasks. Many existing pre-trained models, on the other hand, are resource-intensive and computationally heavy owing to factors such as embedding size, hidden dimension, and number of layers. The natural language processing (NLP) community has developed numerous strategies to compress these models utilising techniques such as pruning, quantisation, and knowledge distillation, resulting in models that are considerably faster, smaller, and subsequently easier to use in practice. By the same token, in this paper we introduce six lightweight models, namely, BioDistilBERT, BioTinyBERT, BioMobileBERT, DistilBioBERT, TinyBioBERT, and CompactBioBERT, which are obtained either by knowledge distillation from a biomedical teacher or continual learning on the PubMed dataset via the Masked Language Modelling (MLM) objective. We evaluate all of our models on three biomedical tasks and compare them with BioBERT-v1.1 to create efficient lightweight models that perform on par with their larger counterparts. All the models will be publicly available on our Hugging Face profile at https://huggingface.co/nlpie and the code used to run the experiments will be available at https://github.com/nlpie-research/Compact-Biomedical-Transformers.
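The abstract states that some of the models are obtained by knowledge distillation from a biomedical teacher. As a hedged illustration of that idea (not the paper's actual training code, and without its layer-wise or attention-based terms), the standard soft-target distillation loss is a temperature-scaled KL divergence between the teacher's and student's output distributions, scaled by T² so gradient magnitudes stay comparable across temperatures:

```python
import math

def softmax(logits, temperature=1.0):
    """Temperature-scaled softmax over a list of raw logits."""
    scaled = [x / temperature for x in logits]
    m = max(scaled)  # subtract max for numerical stability
    exps = [math.exp(x - m) for x in scaled]
    total = sum(exps)
    return [e / total for e in exps]

def distillation_loss(teacher_logits, student_logits, temperature=2.0):
    """Soft-target distillation loss: T^2 * KL(teacher || student)
    on temperature-softened distributions (Hinton et al. style)."""
    p = softmax(teacher_logits, temperature)  # teacher soft targets
    q = softmax(student_logits, temperature)  # student predictions
    kl = sum(pi * math.log(pi / qi) for pi, qi in zip(p, q) if pi > 0)
    return temperature ** 2 * kl
```

A student whose logits match the teacher's incurs zero loss; the more its distribution diverges, the larger the penalty. In practice this term is typically mixed with the hard-label (e.g. MLM) cross-entropy.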

Results

Task | Dataset | Metric | Value | Model
Named Entity Recognition (NER) | NCBI-disease | F1 | 88.67 | CompactBioBERT
Named Entity Recognition (NER) | NCBI-disease | F1 | 87.93 | DistilBioBERT
Named Entity Recognition (NER) | NCBI-disease | F1 | 87.61 | BioDistilBERT
Named Entity Recognition (NER) | NCBI-disease | F1 | 87.21 | BioMobileBERT
Named Entity Recognition (NER) | BC5CDR-chemical | F1 | 94.53 | DistilBioBERT
Named Entity Recognition (NER) | BC5CDR-chemical | F1 | 94.48 | BioDistilBERT
Named Entity Recognition (NER) | BC5CDR-chemical | F1 | 94.31 | CompactBioBERT
Named Entity Recognition (NER) | BC5CDR-chemical | F1 | 94.23 | BioMobileBERT
Named Entity Recognition (NER) | BC5CDR-disease | F1 | 85.61 | BioDistilBERT
Named Entity Recognition (NER) | BC5CDR-disease | F1 | 85.42 | DistilBioBERT
Named Entity Recognition (NER) | BC5CDR-disease | F1 | 85.38 | CompactBioBERT
Named Entity Recognition (NER) | BC5CDR-disease | F1 | 84.62 | BioMobileBERT
Named Entity Recognition (NER) | BC2GM | F1 | 86.97 | BioDistilBERT
Named Entity Recognition (NER) | BC2GM | F1 | 86.71 | CompactBioBERT
Named Entity Recognition (NER) | BC2GM | F1 | 86.60 | DistilBioBERT
Named Entity Recognition (NER) | BC2GM | F1 | 85.26 | BioMobileBERT
Named Entity Recognition (NER) | JNLPBA | F1 | 80.13 | BioMobileBERT
Named Entity Recognition (NER) | JNLPBA | F1 | 79.97 | DistilBioBERT
Named Entity Recognition (NER) | JNLPBA | F1 | 79.88 | CompactBioBERT
Named Entity Recognition (NER) | JNLPBA | F1 | 79.10 | BioDistilBERT

Related Papers

- Visual-Language Model Knowledge Distillation Method for Image Quality Assessment (2025-07-21)
- Uncertainty-Aware Cross-Modal Knowledge Distillation with Prototype Learning for Multimodal Brain-Computer Interfaces (2025-07-17)
- Making Language Model a Hierarchical Classifier and Generator (2025-07-17)
- VisionThink: Smart and Efficient Vision Language Model via Reinforcement Learning (2025-07-17)
- The Generative Energy Arena (GEA): Incorporating Energy Awareness in Large Language Model (LLM) Human Evaluations (2025-07-17)
- Inverse Reinforcement Learning Meets Large Language Model Post-Training: Basics, Advances, and Opportunities (2025-07-17)
- RegCL: Continual Adaptation of Segment Anything Model via Model Merging (2025-07-16)
- Information-Theoretic Generalization Bounds of Replay-based Continual Learning (2025-07-16)