Yu Sun, Shuohuan Wang, Yukun Li, Shikun Feng, Hao Tian, Hua Wu, Haifeng Wang
Recently, pre-trained models have achieved state-of-the-art results on various language understanding tasks, which indicates that pre-training on large-scale corpora may play a crucial role in natural language processing. Current pre-training procedures usually focus on training the model with several simple tasks to grasp the co-occurrence of words or sentences. However, besides co-occurrence, training corpora contain other valuable lexical, syntactic and semantic information, such as named entities, semantic closeness and discourse relations. To extract this lexical, syntactic and semantic information to the fullest extent, we propose a continual pre-training framework named ERNIE 2.0, which incrementally builds and learns pre-training tasks through constant multi-task learning. Experimental results demonstrate that ERNIE 2.0 outperforms BERT and XLNet on 16 tasks, including the English tasks of the GLUE benchmark and several common Chinese tasks. The source code and pre-trained models have been released at https://github.com/PaddlePaddle/ERNIE.
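The continual multi-task learning described above can be illustrated with a minimal sketch: pre-training tasks are introduced in stages, and at each stage the model trains jointly on every task seen so far, so earlier tasks are not forgotten. The task names and the toy `train_step` are illustrative assumptions, not the paper's actual implementation.

```python
def train_step(model_state, task):
    # Placeholder for one optimization step on a batch from `task`;
    # here we simply record which task contributed an update.
    model_state.append(task)
    return model_state

def continual_multitask_pretrain(task_stages, steps_per_stage=2):
    """Introduce tasks stage by stage; each stage trains on the
    union of all tasks introduced so far (continual multi-task learning)."""
    model_state = []
    active_tasks = []
    for new_tasks in task_stages:
        active_tasks.extend(new_tasks)      # add newly built pre-training tasks
        for _ in range(steps_per_stage):
            for task in active_tasks:       # jointly train on all active tasks
                model_state = train_step(model_state, task)
    return model_state

# Hypothetical task groups, loosely mirroring the word-, structure- and
# semantic-aware task families the framework builds incrementally:
stages = [
    ["knowledge_masking"],
    ["sentence_reordering"],
    ["discourse_relation"],
]
updates = continual_multitask_pretrain(stages)
```

Note that a task introduced in an early stage keeps receiving updates in every later stage, which is what distinguishes this scheme from training each new task in isolation.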
| Task | Dataset | Metric | Value | Model |
|---|---|---|---|---|
| Question Answering | DuReader | EM | 64.2 | ERNIE 2.0 Large |
| Question Answering | DuReader | EM | 61.3 | ERNIE 2.0 Base |
| Natural Language Inference | WNLI | Accuracy | 67.8 | ERNIE 2.0 Large |
| Natural Language Inference | XNLI Chinese Dev | Accuracy | 82.6 | ERNIE 2.0 Large |
| Natural Language Inference | XNLI Chinese Dev | Accuracy | 81.2 | ERNIE 2.0 Base |
| Natural Language Inference | XNLI Chinese | Accuracy | 81.0 | ERNIE 2.0 Large |
| Natural Language Inference | XNLI Chinese | Accuracy | 79.7 | ERNIE 2.0 Base |
| Natural Language Inference | MultiNLI | Matched | 88.7 | ERNIE 2.0 Large |
| Natural Language Inference | MultiNLI | Mismatched | 88.8 | ERNIE 2.0 Large |
| Natural Language Inference | MultiNLI | Matched | 86.1 | ERNIE 2.0 Base |
| Natural Language Inference | MultiNLI | Mismatched | 85.5 | ERNIE 2.0 Base |
| Semantic Textual Similarity | STS Benchmark | Pearson Correlation | 0.912 | ERNIE 2.0 Large |
| Semantic Textual Similarity | STS Benchmark | Pearson Correlation | 0.876 | ERNIE 2.0 Base |
| Sentiment Analysis | SST-2 Binary Classification | Accuracy | 95.0 | ERNIE 2.0 Base |
| Named Entity Recognition (NER) | MSRA Dev | F1 | 96.3 | ERNIE 2.0 Large |
| Named Entity Recognition (NER) | MSRA Dev | F1 | 95.2 | ERNIE 2.0 Base |
| Named Entity Recognition (NER) | MSRA | F1 | 95.0 | ERNIE 2.0 Large |
| Named Entity Recognition (NER) | MSRA | F1 | 93.8 | ERNIE 2.0 Base |