Papers With Code 2

A community resource for machine learning research: papers, code, benchmarks, and state-of-the-art results.

Data sourced from the PWC Archive (CC-BY-SA 4.0). Built by the community, for the community.

A La Carte Embedding: Cheap but Effective Induction of Semantic Feature Vectors

Mikhail Khodak, Nikunj Saunshi, Yingyu Liang, Tengyu Ma, Brandon Stewart, Sanjeev Arora

2018-05-14 | ACL 2018 7 | Text Classification, Sentiment Analysis, Transfer Learning, Document Classification, Domain Adaptation

Abstract

Motivations like domain adaptation, transfer learning, and feature learning have fueled interest in inducing embeddings for rare or unseen words, n-grams, synsets, and other textual features. This paper introduces a la carte embedding, a simple and general alternative to the usual word2vec-based approaches for building such representations that is based upon recent theoretical results for GloVe-like embeddings. Our method relies mainly on a linear transformation that is efficiently learnable using pretrained word vectors and linear regression. This transform is applicable on the fly in the future when a new text feature or rare word is encountered, even if only a single usage example is available. We introduce a new dataset showing how the a la carte method requires fewer examples of words in context to learn high-quality embeddings and we obtain state-of-the-art results on a nonce task and some unsupervised document classification tasks.
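
The abstract describes a linear transformation learned by linear regression over pretrained word vectors. A minimal sketch of that idea follows. It assumes that each word's feature is the plain average of the pretrained vectors of its context words and that the transform is fit by ordinary least squares. The function names (`context_average`, `learn_transform`, `induce_embedding`) and the data layout are hypothetical, chosen for illustration, and are not taken from the authors' official code.

```python
import numpy as np


def context_average(contexts, vectors, dim):
    """Average the pretrained vectors of all known words in the given contexts.

    contexts: list of token lists, one per usage example (assumed layout).
    vectors:  dict mapping word -> pretrained vector of length dim.
    """
    total = np.zeros(dim)
    count = 0
    for context in contexts:
        for word in context:
            if word in vectors:  # skip words without a pretrained vector
                total += vectors[word]
                count += 1
    return total / count if count else total


def learn_transform(vectors, contexts_by_word):
    """Fit a matrix A by least squares so that A @ u_w approximates v_w,
    where u_w is the context average of word w and v_w its pretrained vector."""
    words = [w for w in vectors if w in contexts_by_word]
    dim = len(next(iter(vectors.values())))
    U = np.stack([context_average(contexts_by_word[w], vectors, dim) for w in words])
    V = np.stack([vectors[w] for w in words])
    # Solve min_X ||U X - V||_F; rows are words, so A is the transpose of X.
    X, *_ = np.linalg.lstsq(U, V, rcond=None)
    return X.T


def induce_embedding(A, contexts, vectors):
    """Embed a new or rare feature from its contexts, even a single one."""
    dim = A.shape[1]
    return A @ context_average(contexts, vectors, dim)
```

Once `A` has been fit on words that already have pretrained vectors, `induce_embedding(A, [tokens_of_one_sentence], vectors)` can be called for an unseen word with only one usage example, which mirrors the "on the fly" use the abstract mentions.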

Results

Task | Dataset | Metric | Value | Model
Sentiment Analysis | CR | Accuracy | 90.6 | byte mLSTM7
Sentiment Analysis | MR | Accuracy | 86.8 | byte mLSTM7
Sentiment Analysis | SST-5 Fine-grained classification | Accuracy | 54.6 | byte mLSTM7
Sentiment Analysis | SST-2 Binary classification | Accuracy | 91.7 | byte mLSTM7
Sentiment Analysis | MPQA | Accuracy | 88.8 | byte mLSTM7
Subjectivity Analysis | SUBJ | Accuracy | 94.7 | byte mLSTM7
Text Classification | TREC-6 | Error | 9.6 | byte mLSTM7
Classification | TREC-6 | Error | 9.6 | byte mLSTM7

Related Papers

RaMen: Multi-Strategy Multi-Modal Learning for Bundle Construction (2025-07-18)
Making Language Model a Hierarchical Classifier and Generator (2025-07-17)
AdaptiSent: Context-Aware Adaptive Attention for Multimodal Aspect-Based Sentiment Analysis (2025-07-17)
Disentangling coincident cell events using deep transfer learning and compressive sensing (2025-07-17)
A Privacy-Preserving Semantic-Segmentation Method Using Domain-Adaptation Technique (2025-07-17)
Best Practices for Large-Scale, Pixel-Wise Crop Mapping and Transfer Learning Workflows (2025-07-16)
AI Wizards at CheckThat! 2025: Enhancing Transformer-Based Embeddings with Sentiment for Subjectivity Detection in News Articles (2025-07-15)
DCR: Quantifying Data Contamination in LLMs Evaluation (2025-07-15)