A La Carte Embedding: Cheap but Effective Induction of Semantic Feature Vectors

Mikhail Khodak, Nikunj Saunshi, Yingyu Liang, Tengyu Ma, Brandon Stewart, Sanjeev Arora

2018-05-14 | ACL 2018 7 | Text Classification, Sentiment Analysis, Transfer Learning, Document Classification, Domain Adaptation

Abstract

Motivations like domain adaptation, transfer learning, and feature learning have fueled interest in inducing embeddings for rare or unseen words, n-grams, synsets, and other textual features. This paper introduces a la carte embedding, a simple and general alternative to the usual word2vec-based approaches for building such representations that is based upon recent theoretical results for GloVe-like embeddings. Our method relies mainly on a linear transformation that is efficiently learnable using pretrained word vectors and linear regression. This transform is applicable on the fly in the future when a new text feature or rare word is encountered, even if only a single usage example is available. We introduce a new dataset showing how the a la carte method requires fewer examples of words in context to learn high-quality embeddings and we obtain state-of-the-art results on a nonce task and some unsupervised document classification tasks.
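The core idea in the abstract, a linear transform fit by linear regression on pretrained word vectors and then applied to a new feature's contexts, can be sketched roughly as follows. This is a minimal illustration and not the authors' implementation: it assumes that a feature's context representation is the plain average of the pretrained vectors of the words around it, it uses synthetic toy data, and the names `context_average`, `learn_transform`, and `induce_embedding` are hypothetical.

```python
import numpy as np

def context_average(contexts, vectors):
    """Average the pretrained vectors of all known words over a list of contexts."""
    known = [vectors[w] for ctx in contexts for w in ctx if w in vectors]
    if not known:
        raise ValueError("no context word has a pretrained vector")
    return np.mean(known, axis=0)

def learn_transform(vectors, corpus_contexts):
    """Fit a matrix A by least squares so that A @ u_w is close to v_w.

    u_w is the context average of word w and v_w its pretrained vector.
    """
    words = [w for w in vectors if w in corpus_contexts]
    U = np.stack([context_average(corpus_contexts[w], vectors) for w in words])
    V = np.stack([vectors[w] for w in words])
    # Solve min_X ||U X - V||_F; rows are words, so A is the transpose of X.
    X, *_ = np.linalg.lstsq(U, V, rcond=None)
    return X.T

def induce_embedding(A, contexts, vectors):
    """Embed a new feature from one or more usage examples."""
    return A @ context_average(contexts, vectors)

# Synthetic toy data (an assumption for illustration, not real embeddings).
rng = np.random.default_rng(0)
dim = 3
vectors = {f"c{i}": rng.normal(size=dim) for i in range(10)}
A_true = rng.normal(size=(dim, dim))

# Target words whose vectors are, by construction, a linear map of their
# context averages, so the regression can recover that map.
corpus_contexts = {}
for j in range(8):
    ctxs = [[f"c{i}" for i in rng.choice(10, size=4, replace=False)]
            for _ in range(3)]
    corpus_contexts[f"t{j}"] = ctxs
    vectors[f"t{j}"] = A_true @ context_average(ctxs, vectors)

A = learn_transform(vectors, corpus_contexts)

# A single usage example of an unseen word; tokens without vectors are skipped.
new_vec = induce_embedding(A, [["c1", "c4", "unknown_token"]], vectors)
```

Because the transform is learned once, embedding a new word or feature afterwards costs only one average and one matrix-vector product, which is what makes the on-the-fly use described in the abstract cheap.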

Results

Task	Dataset	Metric	Value	Model
Sentiment Analysis	CR	Accuracy	90.6	byte mLSTM7
Sentiment Analysis	MR	Accuracy	86.8	byte mLSTM7
Sentiment Analysis	SST-5 Fine-grained classification	Accuracy	54.6	byte mLSTM7
Sentiment Analysis	SST-2 Binary classification	Accuracy	91.7	byte mLSTM7
Sentiment Analysis	MPQA	Accuracy	88.8	byte mLSTM7
Subjectivity Analysis	SUBJ	Accuracy	94.7	byte mLSTM7
Text Classification	TREC-6	Error	9.6	byte mLSTM7
Classification	TREC-6	Error	9.6	byte mLSTM7
