Papers With Code

A community resource for machine learning research: papers, code, benchmarks, and state-of-the-art results.


Data sourced from the PWC Archive (CC-BY-SA 4.0). Built by the community, for the community.


K-Adapter: Infusing Knowledge into Pre-Trained Models with Adapters

Ruize Wang, Duyu Tang, Nan Duan, Zhongyu Wei, Xuanjing Huang, Jianshu Ji, Guihong Cao, Daxin Jiang, Ming Zhou

Published: 2020-02-05 · Findings (ACL) 2021
Tasks: Question Answering, Relation Extraction, Relation Classification, Dependency Parsing, Entity Typing
Links: Paper · PDF · Code

Abstract

We study the problem of injecting knowledge into large pre-trained models like BERT and RoBERTa. Existing methods typically update the original parameters of pre-trained models when injecting knowledge. However, when multiple kinds of knowledge are injected, the previously injected knowledge may be flushed away. To address this, we propose K-Adapter, a framework that keeps the original parameters of the pre-trained model fixed and supports the development of versatile knowledge-infused models. Taking RoBERTa as the backbone model, K-Adapter has a neural adapter for each kind of infused knowledge, like a plug-in connected to RoBERTa. There is no information flow between different adapters, so multiple adapters can be trained efficiently in a distributed way. As a case study, we inject two kinds of knowledge in this work: (1) factual knowledge obtained from automatically aligned text triplets on Wikipedia and Wikidata, and (2) linguistic knowledge obtained via dependency parsing. Results on three knowledge-driven tasks — relation classification, entity typing, and question answering — demonstrate that each adapter improves performance and that combining both adapters brings further improvements. Further analysis indicates that K-Adapter captures more versatile knowledge than RoBERTa.
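The architecture described above — a frozen backbone with independent, trainable adapters whose outputs are combined only at the downstream task — can be illustrated with a minimal NumPy sketch. This is a toy stand-in, not the paper's implementation: the random projections, the tiny hidden size, and the adapter names are placeholders for RoBERTa and the actual fac/lin adapter networks.

```python
import numpy as np

rng = np.random.default_rng(0)
HIDDEN = 8  # toy hidden size; the real K-Adapter uses RoBERTa-large's 1024

# Frozen "backbone": a fixed random projection standing in for RoBERTa.
# Its parameters are never updated, mirroring the fixed pre-trained model.
W_backbone = rng.normal(size=(HIDDEN, HIDDEN))

def backbone(x):
    return np.tanh(x @ W_backbone)

class Adapter:
    """A trainable plug-in that reads backbone hidden states only.

    Adapters share no parameters and exchange no information,
    so each one can be trained in isolation (even on separate machines).
    """
    def __init__(self, seed):
        r = np.random.default_rng(seed)
        self.W = r.normal(size=(HIDDEN, HIDDEN))  # trainable in a real setup

    def __call__(self, h):
        return np.tanh(h @ self.W)

fac_adapter = Adapter(seed=1)  # e.g. factual knowledge (Wikidata triplets)
lin_adapter = Adapter(seed=2)  # e.g. linguistic knowledge (dependency parses)

x = rng.normal(size=(4, HIDDEN))  # a batch of 4 token representations
h = backbone(x)

# For a downstream task, concatenate the backbone output with each
# adapter's output; combining adapters requires no joint retraining.
features = np.concatenate([h, fac_adapter(h), lin_adapter(h)], axis=-1)
print(features.shape)  # (4, 24)
```

Because the adapters never read each other's outputs, adding a third kind of knowledge later only means training one more `Adapter` and widening the concatenation, without disturbing anything already trained.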

Results

Task | Dataset | Metric | Value | Model
--- | --- | --- | --- | ---
Relation Extraction | TACRED | F1 | 72.04 | K-ADAPTER (F+L)
Relation Extraction | TACRED | F1 (1% Few-Shot) | 13.8 | K-ADAPTER (F+L)
Relation Extraction | TACRED | F1 (10% Few-Shot) | 56 | K-ADAPTER (F+L)
Relation Extraction | TACRED | F1 (5% Few-Shot) | 45.1 | K-ADAPTER (F+L)
Relation Extraction | TACRED | F1 | 71.3 | RoBERTa
Relation Extraction | TACRED | F1 | 72 | K-Adapter
Relation Classification | TACRED | F1 | 71.3 | RoBERTa
Relation Classification | TACRED | F1 | 72 | K-Adapter
Entity Typing | Open Entity | F1 | 77.6916 | K-Adapter (fac-adapter)
Entity Typing | Open Entity | Precision | 79.6712 | K-Adapter (fac-adapter)
Entity Typing | Open Entity | Recall | 75.8081 | K-Adapter (fac-adapter)
Entity Typing | Open Entity | F1 | 77.6127 | K-Adapter (fac-adapter + lin-adapter)
Entity Typing | Open Entity | Precision | 78.9956 | K-Adapter (fac-adapter + lin-adapter)
Entity Typing | Open Entity | Recall | 76.2774 | K-Adapter (fac-adapter + lin-adapter)
