Papers With Code 2

A community resource for machine learning research: papers, code, benchmarks, and state-of-the-art results.

Data sourced from the PWC Archive (CC-BY-SA 4.0). Built by the community, for the community.


SpanBERT: Improving Pre-training by Representing and Predicting Spans

Mandar Joshi, Danqi Chen, Yinhan Liu, Daniel S. Weld, Luke Zettlemoyer, Omer Levy

2019-07-24 · TACL 2020

Tasks: Question Answering, Relation Extraction, Paraphrase Identification, Sentiment Analysis, Coreference Resolution, Natural Language Inference, Semantic Textual Similarity, Linguistic Acceptability, Open-Domain Question Answering, Relation Classification

Abstract

We present SpanBERT, a pre-training method that is designed to better represent and predict spans of text. Our approach extends BERT by (1) masking contiguous random spans, rather than random tokens, and (2) training the span boundary representations to predict the entire content of the masked span, without relying on the individual token representations within it. SpanBERT consistently outperforms BERT and our better-tuned baselines, with substantial gains on span selection tasks such as question answering and coreference resolution. In particular, with the same training data and model size as BERT-large, our single model obtains 94.6% and 88.7% F1 on SQuAD 1.1 and 2.0, respectively. We also achieve a new state of the art on the OntoNotes coreference resolution task (79.6% F1), strong performance on the TACRED relation extraction benchmark, and even show gains on GLUE.
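The span-masking scheme described in point (1) can be sketched in a few lines. In the paper, span lengths are drawn from a geometric distribution (p = 0.2) clipped at 10 tokens, and spans are masked until roughly 15% of the sequence is covered; the sketch below assumes those hyperparameters and ignores tokenizer and word-boundary details, so it is illustrative rather than a faithful reimplementation.

```python
import random

def mask_contiguous_spans(tokens, mask_budget=0.15, p=0.2, max_span_len=10,
                          mask_token="[MASK]", seed=None):
    """Illustrative SpanBERT-style span masking: repeatedly sample a span
    length from a clipped geometric distribution and mask a contiguous run
    of tokens until ~mask_budget of the sequence is masked."""
    rng = random.Random(seed)
    n = len(tokens)
    target = max(1, round(n * mask_budget))
    masked = set()
    while len(masked) < target:
        # Clipped geometric: stop extending the span with probability p at
        # each step, so P(length = l) ∝ (1 - p)^(l - 1) * p, capped at max.
        length = 1
        while length < max_span_len and rng.random() > p:
            length += 1
        # Pick a random start position; spans may overlap, as the set dedupes.
        start = rng.randrange(0, max(1, n - length + 1))
        for i in range(start, min(n, start + length)):
            masked.add(i)
    out = [mask_token if i in masked else t for i, t in enumerate(tokens)]
    return out, sorted(masked)
```

The span boundary objective in point (2) would then train the model to reconstruct each masked token from the representations of the tokens just outside the span plus a position embedding, which is omitted here.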

Results

| Task | Dataset | Metric | Value | Model |
|---|---|---|---|---|
| Relation Extraction | TACRED | F1 | 70.8 | SpanBERT-large |
| Relation Extraction | Re-TACRED | F1 | 85.3 | SpanBERT |
| Relation Extraction | TACRED | F1 | 70.8 | SpanBERT |
| Relation Classification | TACRED | F1 | 70.8 | SpanBERT |
| Question Answering | NewsQA | F1 | 73.6 | SpanBERT |
| Question Answering | NaturalQA | F1 | 82.5 | SpanBERT |
| Question Answering | SQuAD 1.1 | EM | 88.8 | SpanBERT (single model) |
| Question Answering | SQuAD 1.1 | F1 | 94.6 | SpanBERT (single model) |
| Question Answering | TriviaQA | F1 | 83.6 | SpanBERT |
| Question Answering | SQuAD 2.0 dev | F1 | 86.8 | SpanBERT |
| Question Answering | SQuAD 2.0 | EM | 85.7 | SpanBERT |
| Question Answering | SQuAD 2.0 | F1 | 88.7 | SpanBERT |
| Question Answering | SearchQA | F1 | 84.8 | SpanBERT |
| Natural Language Inference | MultiNLI | Matched | 88.1 | SpanBERT |
| Semantic Textual Similarity | STS Benchmark | Pearson Correlation | 0.899 | SpanBERT |
| Semantic Textual Similarity | Quora Question Pairs | Accuracy | 89.5 | SpanBERT |
| Semantic Textual Similarity | Quora Question Pairs | F1 | 71.9 | SpanBERT |
| Sentiment Analysis | SST-2 Binary classification | Accuracy | 94.8 | SpanBERT |
| Coreference Resolution | OntoNotes | F1 | 79.6 | SpanBERT |
| Paraphrase Identification | Quora Question Pairs | Accuracy | 89.5 | SpanBERT |
| Paraphrase Identification | Quora Question Pairs | F1 | 71.9 | SpanBERT |
| Open-Domain Question Answering | SearchQA | F1 | 84.8 | SpanBERT |

Related Papers

- From Roots to Rewards: Dynamic Tree Reasoning with RL (2025-07-17)
- Enter the Mind Palace: Reasoning and Planning for Long-term Active Embodied Question Answering (2025-07-17)
- Vision-and-Language Training Helps Deploy Taxonomic Knowledge but Does Not Fundamentally Alter It (2025-07-17)
- City-VLM: Towards Multidomain Perception Scene Understanding via Multimodal Incomplete Learning (2025-07-17)
- AdaptiSent: Context-Aware Adaptive Attention for Multimodal Aspect-Based Sentiment Analysis (2025-07-17)
- SemCSE: Semantic Contrastive Sentence Embeddings Using LLM-Generated Summaries For Scientific Abstracts (2025-07-17)
- Describe Anything Model for Visual Question Answering on Text-rich Images (2025-07-16)
- Is This Just Fantasy? Language Model Representations Reflect Human Judgments of Event Plausibility (2025-07-16)