Papers With Code

A community resource for machine learning research: papers, code, benchmarks, and state-of-the-art results.


Data sourced from the PWC Archive (CC-BY-SA 4.0). Built by the community, for the community.


Semi-Supervised Sequence Modeling with Cross-View Training

Kevin Clark, Minh-Thang Luong, Christopher D. Manning, Quoc V. Le

2018-09-22 · EMNLP 2018

Tasks: Machine Translation, Representation Learning, Part-Of-Speech Tagging, CCG Supertagging, Translation, Multi-Task Learning, Named Entity Recognition (NER), Dependency Parsing

Abstract

Unsupervised representation learning algorithms such as word2vec and ELMo improve the accuracy of many supervised NLP models, mainly because they can take advantage of large amounts of unlabeled text. However, the supervised models only learn from task-specific labeled data during the main training phase. We therefore propose Cross-View Training (CVT), a semi-supervised learning algorithm that improves the representations of a Bi-LSTM sentence encoder using a mix of labeled and unlabeled data. On labeled examples, standard supervised learning is used. On unlabeled examples, CVT teaches auxiliary prediction modules that see restricted views of the input (e.g., only part of a sentence) to match the predictions of the full model seeing the whole input. Since the auxiliary modules and the full model share intermediate representations, this in turn improves the full model. Moreover, we show that CVT is particularly effective when combined with multi-task learning. We evaluate CVT on five sequence tagging tasks, machine translation, and dependency parsing, achieving state-of-the-art results.
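The core of CVT is a consistency objective on unlabeled data: auxiliary prediction modules that see only a restricted view of the input are trained to match the full model's output distribution. A minimal sketch of that objective, using a KL-divergence between the two prediction distributions (the function names and toy logits here are illustrative, not from the paper's code):

```python
import numpy as np

def softmax(x):
    # numerically stable softmax over the last axis
    e = np.exp(x - x.max(axis=-1, keepdims=True))
    return e / e.sum(axis=-1, keepdims=True)

def cvt_consistency_loss(full_logits, aux_logits):
    """KL(full || aux): the auxiliary module (restricted view) is pushed
    toward the full model's predictions. In training, the full model's
    distribution is treated as a fixed target, so no gradient flows
    through it."""
    p = softmax(full_logits)   # teacher: sees the whole input
    q = softmax(aux_logits)    # student: sees a restricted view
    return float(np.sum(p * (np.log(p) - np.log(q))))

# toy example: one token, 3 output classes
full = np.array([2.0, 0.5, -1.0])  # prediction from the full sentence
aux = np.array([1.0, 0.8, -0.5])   # prediction from a partial view
loss = cvt_consistency_loss(full, aux)
```

Minimizing this loss updates only the auxiliary modules, but since they share the Bi-LSTM encoder's intermediate representations with the full model, the encoder itself improves.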

Results

| Task | Dataset | Metric | Value | Model |
|---|---|---|---|---|
| Part-Of-Speech Tagging | Penn Treebank | Accuracy | 97.76 | CVT + Multi-task |
| Machine Translation | IWSLT2015 English-Vietnamese | BLEU | 29.6 | CVT |
| Dependency Parsing | Penn Treebank | LAS | 95.02 | CVT + Multi-Task |
| Dependency Parsing | Penn Treebank | UAS | 96.61 | CVT + Multi-Task |
| Named Entity Recognition (NER) | Ontonotes v5 (English) | F1 | 88.81 | CVT + Multi-Task + Large |
| Named Entity Recognition (NER) | CoNLL 2003 (English) | F1 | 92.61 | CVT + Multi-Task |
| Named Entity Recognition (NER) | CoNLL 2003 (English) | F1 | 92.61 | CVT + Multi-Task + Large |
| CCG Supertagging | CCGbank | Accuracy | 96.1 | CVT + Multi-task + Large |

Related Papers

Touch in the Wild: Learning Fine-Grained Manipulation with a Portable Visuo-Tactile Gripper (2025-07-20)
Spectral Bellman Method: Unifying Representation and Exploration in RL (2025-07-17)
Boosting Team Modeling through Tempo-Relational Representation Learning (2025-07-17)
A Translation of Probabilistic Event Calculus into Markov Decision Processes (2025-07-17)
SGCL: Unifying Self-Supervised and Supervised Learning for Graph Recommendation (2025-07-17)
Similarity-Guided Diffusion for Contrastive Sequential Recommendation (2025-07-16)
Are encoders able to learn landmarkers for warm-starting of Hyperparameter Optimization? (2025-07-16)
Language-Guided Contrastive Audio-Visual Masked Autoencoder with Automatically Generated Audio-Visual-Text Triplets from Videos (2025-07-16)