Variational Sequential Labelers for Semi-Supervised Learning

Mingda Chen, Qingming Tang, Karen Livescu, Kevin Gimpel

2019-06-23 | EMNLP 2018 10 | Word Embeddings

Abstract

We introduce a family of multitask variational methods for semi-supervised sequence labeling. Our model family consists of a latent-variable generative model and a discriminative labeler. The generative models use latent variables to define the conditional probability of a word given its context, drawing inspiration from word prediction objectives commonly used in learning word embeddings. The labeler helps inject discriminative information into the latent space. We explore several latent variable configurations, including ones with hierarchical structure, which enables the model to account for both label-specific and word-specific information. Our models consistently outperform standard sequential baselines on 8 sequence labeling datasets, and improve further with unlabeled data.
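The multitask setup described in the abstract can be sketched as a per-token loss that combines a variational word-prediction term with a classification term that is applied only when a gold label is available. The sketch below is an illustration under stated assumptions, not the authors' implementation: it assumes diagonal Gaussian latent variables, a softmax over the vocabulary for word reconstruction, and a softmax over tags for the labeler. The function names (`gaussian_kl`, `token_loss`) and the weights `alpha` and `beta` are hypothetical.

```python
import numpy as np

def gaussian_kl(mu_q, logvar_q, mu_p, logvar_p):
    """KL(q || p) between two diagonal Gaussians, summed over latent dimensions."""
    var_q, var_p = np.exp(logvar_q), np.exp(logvar_p)
    return 0.5 * np.sum(
        logvar_p - logvar_q + (var_q + (mu_q - mu_p) ** 2) / var_p - 1.0
    )

def log_softmax(logits):
    """Numerically stable log-softmax over a 1-D array of logits."""
    shifted = logits - np.max(logits)
    return shifted - np.log(np.sum(np.exp(shifted)))

def token_loss(word_logits, word_id, kl, tag_logits=None, tag_id=None,
               alpha=1.0, beta=1.0):
    """Per-token loss: negative ELBO plus an optional classification term.

    Assumption: unlabeled tokens contribute only the variational term;
    labeled tokens additionally contribute alpha * cross-entropy on the tag.
    """
    # Reconstruction: negative log-probability of the observed word.
    loss = -log_softmax(word_logits)[word_id] + beta * kl
    # Discriminative term, used only when a gold tag is available.
    if tag_logits is not None and tag_id is not None:
        loss -= alpha * log_softmax(tag_logits)[tag_id]
    return loss

# Usage: one unlabeled token and one labeled token with toy values.
kl = gaussian_kl(np.ones(2), np.zeros(2), np.zeros(2), np.zeros(2))
unlabeled = token_loss(np.zeros(4), word_id=1, kl=kl)
labeled = token_loss(np.zeros(4), word_id=1, kl=kl,
                     tag_logits=np.zeros(2), tag_id=0)
```

Because the classification term is simply skipped when no tag is given, the same function covers both the labeled and the unlabeled portions of a semi-supervised batch.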

Results

Task	Dataset	Metric	Value	Model
Named Entity Recognition (NER)	CoNLL 2003 (English)	F1	84.7	VSL-GG-Hier
