Papers With Code

A community resource for machine learning research: papers, code, benchmarks, and state-of-the-art results.


Data sourced from the PWC Archive (CC-BY-SA 4.0). Built by the community, for the community.


Semi-Supervised Sequence Modeling with Cross-View Training

Kevin Clark, Minh-Thang Luong, Christopher D. Manning, Quoc V. Le

2018-09-22 · EMNLP 2018

Tasks: Machine Translation, Representation Learning, Part-Of-Speech Tagging, CCG Supertagging, Translation, Multi-Task Learning, Named Entity Recognition (NER), Dependency Parsing

Abstract

Unsupervised representation learning algorithms such as word2vec and ELMo improve the accuracy of many supervised NLP models, mainly because they can take advantage of large amounts of unlabeled text. However, the supervised models only learn from task-specific labeled data during the main training phase. We therefore propose Cross-View Training (CVT), a semi-supervised learning algorithm that improves the representations of a Bi-LSTM sentence encoder using a mix of labeled and unlabeled data. On labeled examples, standard supervised learning is used. On unlabeled examples, CVT teaches auxiliary prediction modules that see restricted views of the input (e.g., only part of a sentence) to match the predictions of the full model seeing the whole input. Since the auxiliary modules and the full model share intermediate representations, this in turn improves the full model. Moreover, we show that CVT is particularly effective when combined with multi-task learning. We evaluate CVT on five sequence tagging tasks, machine translation, and dependency parsing, achieving state-of-the-art results.
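The core of CVT is a consistency objective on unlabeled data: auxiliary prediction modules that see only a restricted view of the input are trained to match the full model's output distribution. A minimal sketch of that objective, using a KL-divergence between the two prediction distributions (the function names and toy logits here are illustrative, not from the paper's code):

```python
import numpy as np

def softmax(x):
    # numerically stable softmax over the last axis
    e = np.exp(x - x.max(axis=-1, keepdims=True))
    return e / e.sum(axis=-1, keepdims=True)

def cvt_consistency_loss(full_logits, aux_logits):
    """KL(full || aux): the auxiliary module (restricted view) is pushed
    toward the full model's predictions. In training, the full model's
    distribution is treated as a fixed target, so no gradient flows
    through it."""
    p = softmax(full_logits)   # teacher: sees the whole input
    q = softmax(aux_logits)    # student: sees a restricted view
    return float(np.sum(p * (np.log(p) - np.log(q))))

# toy example: one token, 3 output classes
full = np.array([2.0, 0.5, -1.0])  # prediction from the full sentence
aux = np.array([1.0, 0.8, -0.5])   # prediction from a partial view
loss = cvt_consistency_loss(full, aux)
```

Minimizing this loss updates only the auxiliary modules, but since they share the Bi-LSTM encoder's intermediate representations with the full model, the encoder itself improves.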

Results

| Task | Dataset | Metric | Value | Model |
|---|---|---|---|---|
| Part-Of-Speech Tagging | Penn Treebank | Accuracy | 97.76 | CVT + Multi-task |
| Machine Translation | IWSLT2015 English-Vietnamese | BLEU | 29.6 | CVT |
| Dependency Parsing | Penn Treebank | LAS | 95.02 | CVT + Multi-Task |
| Dependency Parsing | Penn Treebank | UAS | 96.61 | CVT + Multi-Task |
| Named Entity Recognition (NER) | Ontonotes v5 (English) | F1 | 88.81 | CVT + Multi-Task + Large |
| Named Entity Recognition (NER) | CoNLL 2003 (English) | F1 | 92.61 | CVT + Multi-Task |
| Named Entity Recognition (NER) | CoNLL 2003 (English) | F1 | 92.61 | CVT + Multi-Task + Large |
| CCG Supertagging | CCGbank | Accuracy | 96.1 | CVT + Multi-task + Large |

Related Papers

Touch in the Wild: Learning Fine-Grained Manipulation with a Portable Visuo-Tactile Gripper (2025-07-20)
Spectral Bellman Method: Unifying Representation and Exploration in RL (2025-07-17)
Boosting Team Modeling through Tempo-Relational Representation Learning (2025-07-17)
A Translation of Probabilistic Event Calculus into Markov Decision Processes (2025-07-17)
SGCL: Unifying Self-Supervised and Supervised Learning for Graph Recommendation (2025-07-17)
Similarity-Guided Diffusion for Contrastive Sequential Recommendation (2025-07-16)
Are encoders able to learn landmarkers for warm-starting of Hyperparameter Optimization? (2025-07-16)
Language-Guided Contrastive Audio-Visual Masked Autoencoder with Automatically Generated Audio-Visual-Text Triplets from Videos (2025-07-16)