A Hierarchical Multi-task Approach for Learning Embeddings from Semantic Tasks

Victor Sanh, Thomas Wolf, Sebastian Ruder

2018-11-14

Relation Extraction, Named Entity Recognition (NER), Multi-Task Learning

Abstract

Much effort has been devoted to evaluating whether multi-task learning can be leveraged to learn rich representations that can be used in various Natural Language Processing (NLP) downstream applications. However, there is still a lack of understanding of the settings in which multi-task learning has a significant effect. In this work, we introduce a hierarchical model trained in a multi-task learning setup on a set of carefully selected semantic tasks. The model is trained in a hierarchical fashion to introduce an inductive bias by supervising a set of low-level tasks at the bottom layers of the model and more complex tasks at the top layers of the model. This model achieves state-of-the-art results on a number of tasks, namely Named Entity Recognition, Entity Mention Detection and Relation Extraction, without hand-engineered features or external NLP tools like syntactic parsers. The hierarchical training supervision induces a set of shared semantic representations at lower layers of the model. We show that as we move from the bottom to the top layers of the model, the hidden states of the layers tend to represent more complex semantic information.
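The idea of supervising simpler tasks at lower layers and more complex tasks at higher layers can be illustrated with a small sketch. This is not the authors' implementation: the class name `HierarchicalTagger`, the task order (NER, then Entity Mention Detection, then Relation Extraction), the layer sizes, the random dense layers and the shortcut from the input embedding to every level are all assumptions made for illustration.

```python
import math
import random

random.seed(0)  # fixed seed so the toy weights are reproducible


def make_layer(n_in, n_out):
    """Random dense layer: n_out rows of n_in weights (toy stand-in for an encoder)."""
    return [[random.uniform(-0.5, 0.5) for _ in range(n_in)] for _ in range(n_out)]


def dense(weights, x, activation=None):
    """Compute W x for one token vector, optionally followed by an activation."""
    out = [sum(w * v for w, v in zip(row, x)) for row in weights]
    return [activation(v) for v in out] if activation else out


class HierarchicalTagger:
    # Assumed ordering from low-level (bottom) to more complex (top) tasks.
    TASKS = ["ner", "emd", "re"]

    def __init__(self, n_in=4, n_hidden=6, n_labels=3):
        self.encoders, self.heads = {}, {}
        width = n_in
        for task in self.TASKS:
            self.encoders[task] = make_layer(width, n_hidden)
            self.heads[task] = make_layer(n_hidden, n_labels)
            # Assumption: each higher level sees the input embedding
            # concatenated with the hidden state of the level below.
            width = n_in + n_hidden

    def levels_for(self, task):
        """Number of encoder levels traversed before `task` is supervised."""
        return self.TASKS.index(task) + 1

    def forward(self, x, task):
        """Run the stack only up to the level that `task` supervises."""
        if task not in self.TASKS:
            raise ValueError(f"unknown task: {task}")
        level_input = x
        for name in self.TASKS:
            hidden = dense(self.encoders[name], level_input, math.tanh)
            if name == task:
                return dense(self.heads[name], hidden)  # unnormalised label scores
            level_input = x + hidden  # list concatenation: embedding + lower state


model = HierarchicalTagger()
token = [0.1, -0.2, 0.3, 0.4]  # hypothetical 4-dimensional token embedding
ner_scores = model.forward(token, "ner")
re_scores = model.forward(token, "re")
```

In this sketch a loss computed on `ner_scores` would only touch the bottom encoder, while a loss on `re_scores` would pass through all three levels, which is one way the lower layers end up shared across tasks.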

Results

Task	Dataset	Metric	Value	Model
Relation Extraction	ACE 2005	NER Micro F1	87.5	Hierarchical Multi-task
Relation Extraction	ACE 2005	RE Micro F1	62.7	Hierarchical Multi-task
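For readers unfamiliar with the metric in the table, micro F1 pools true positives, false positives and false negatives over all sentences before computing precision and recall. The sketch below assumes exact matching on hypothetical `(label, start, end)` tuples; the scoring protocol actually used for the ACE 2005 numbers is not stated on this page, and the function name `micro_f1` and the toy data are illustrative only.

```python
def micro_f1(gold, pred):
    """Micro-averaged F1 over per-sentence sets of predicted items.

    Counts are pooled across all sentences before computing
    precision and recall (assumption: exact-match scoring).
    """
    tp = fp = fn = 0
    for gold_items, pred_items in zip(gold, pred):
        gold_items, pred_items = set(gold_items), set(pred_items)
        tp += len(gold_items & pred_items)   # predicted and correct
        fp += len(pred_items - gold_items)   # predicted but wrong
        fn += len(gold_items - pred_items)   # missed gold items
    if tp == 0:
        return 0.0
    precision = tp / (tp + fp)
    recall = tp / (tp + fn)
    return 2 * precision * recall / (precision + recall)


# Toy data: two sentences with (label, start, end) entity spans.
gold = [{("PER", 0, 1), ("ORG", 3, 4)}, {("LOC", 2, 2)}]
pred = [{("PER", 0, 1)}, {("LOC", 2, 2), ("ORG", 0, 0)}]
score = micro_f1(gold, pred)
```

Here two predictions are correct, one is spurious and one gold span is missed, so precision and recall are both 2/3.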
