Document-Level Relation Extraction with Adaptive Focal Loss and Knowledge Distillation

Qingyu Tan, Ruidan He, Lidong Bing, Hwee Tou Ng

2022-03-21 | Findings (ACL) 2022 5 | Relation Extraction, Document-level Relation Extraction, Knowledge Distillation

Abstract

Document-level Relation Extraction (DocRE) is a more challenging task compared to its sentence-level counterpart. It aims to extract relations from multiple sentences at once. In this paper, we propose a semi-supervised framework for DocRE with three novel components. Firstly, we use an axial attention module for learning the interdependency among entity-pairs, which improves the performance on two-hop relations. Secondly, we propose an adaptive focal loss to tackle the class imbalance problem of DocRE. Lastly, we use knowledge distillation to overcome the differences between human annotated data and distantly supervised data. We conducted experiments on two DocRE datasets. Our model consistently outperforms strong baselines and its performance exceeds the previous SOTA by 1.36 F1 and 1.46 Ign_F1 score on the DocRED leaderboard. Our code and data will be released at https://github.com/tonytan48/KD-DocRE.
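
The abstract names an adaptive focal loss as the remedy for class imbalance but does not give its formula. As background only, the sketch below shows the standard focal loss for independent binary (multi-label) decisions, which down-weights examples the model already classifies confidently. It is not the paper's adaptive variant; the function names, the default gamma value, and the clamping constant are assumptions made for illustration.

```python
import math


def binary_focal_loss(prob, label, gamma=2.0):
    """Standard focal loss for a single binary decision.

    prob  -- predicted probability of the positive class, in (0, 1)
    label -- 1 for positive, 0 for negative
    gamma -- focusing parameter (hypothetical default, not from the paper)
    """
    # Probability the model assigns to the true class.
    p_t = prob if label == 1 else 1.0 - prob
    # Clamp to avoid log(0); the constant is an illustrative choice.
    p_t = min(max(p_t, 1e-12), 1.0)
    # The (1 - p_t)^gamma factor shrinks the loss of easy examples.
    return -((1.0 - p_t) ** gamma) * math.log(p_t)


def multilabel_focal_loss(probs, labels, gamma=2.0):
    """Sum the binary focal loss over all relation classes of one entity pair."""
    return sum(binary_focal_loss(p, y, gamma) for p, y in zip(probs, labels))
```

With gamma set to 0 the modulating factor disappears and the function reduces to ordinary binary cross-entropy, which is a convenient sanity check.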

Results

Task	Dataset	Metric	Value	Model
Relation Extraction	DocRED	F1	67.28	KD-Rb-l
Relation Extraction	DocRED	Ign F1	65.24	KD-Rb-l
Relation Extraction	ReDocRED	F1	78.28	KD-DocRE
Relation Extraction	ReDocRED	Ign F1	77.6	KD-DocRE
