
All-but-the-Top: Simple and Effective Postprocessing for Word Representations

Jiaqi Mu, Suma Bhat, Pramod Viswanath

2017-02-05 · ICLR 2018
Tasks: Text Classification, Subjectivity Analysis, Word Similarity, Sentiment Analysis, General Classification

Abstract

Real-valued word representations have transformed NLP applications; popular examples are word2vec and GloVe, recognized for their ability to capture linguistic regularities. In this paper, we demonstrate a very simple, and yet counter-intuitive, postprocessing technique (eliminate the common mean vector and a few top dominating directions from the word vectors) that renders off-the-shelf representations even stronger. The postprocessing is empirically validated on a variety of lexical-level intrinsic tasks (word similarity, concept categorization, word analogy) and sentence-level tasks (semantic textual similarity and text classification) on multiple datasets and with a variety of representation methods and hyperparameter choices in multiple languages; in each case, the processed representations are consistently better than the original ones.
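
The postprocessing described in the abstract can be sketched in a few lines of NumPy. This is a minimal illustration, not the authors' released code: the function name all_but_the_top, the parameter n_directions, and the random toy matrix are assumptions made for the example, and the "top dominating directions" are taken here to be the leading principal components of the mean-centered vectors.

```python
import numpy as np


def all_but_the_top(vectors: np.ndarray, n_directions: int = 2) -> np.ndarray:
    """Sketch of the postprocessing: remove the mean and the top directions.

    vectors      : array of shape (vocab_size, dim), one word vector per row.
    n_directions : hypothetical parameter, how many dominating directions to drop.
    """
    # Step 1: subtract the common mean vector from every word vector.
    centered = vectors - vectors.mean(axis=0, keepdims=True)

    # Step 2: find the dominating directions. The rows of vt are unit-norm
    # principal directions, ordered by decreasing singular value.
    _, _, vt = np.linalg.svd(centered, full_matrices=False)
    top = vt[:n_directions]  # shape (n_directions, dim)

    # Step 3: remove each vector's projection onto those directions.
    return centered - (centered @ top.T) @ top


# Toy data standing in for real embeddings (assumption: 1000 words, 50 dims).
rng = np.random.default_rng(0)
toy = rng.normal(size=(1000, 50)) + 3.0
processed = all_but_the_top(toy, n_directions=2)
```

With real embeddings, the rows of the input would be the word2vec or GloVe vectors for the vocabulary, and the number of removed directions is a hyperparameter to tune for the representation at hand.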

Results

Task                   Dataset                             Metric    Value  Model
Sentiment Analysis     MR                                  Accuracy  78.26  GRU-RNN-WORD2VEC
Sentiment Analysis     SST-5 Fine-grained classification   Accuracy  45.02  GRU-RNN-WORD2VEC
Subjectivity Analysis  SUBJ                                Accuracy  91.85  GRU-RNN-GLOVE
Text Classification    TREC-6                              Error     7      GRU-RNN-GLOVE
Classification         TREC-6                              Error     7      GRU-RNN-GLOVE

Related Papers

Making Language Model a Hierarchical Classifier and Generator (2025-07-17)
AdaptiSent: Context-Aware Adaptive Attention for Multimodal Aspect-Based Sentiment Analysis (2025-07-17)
AI Wizards at CheckThat! 2025: Enhancing Transformer-Based Embeddings with Sentiment for Subjectivity Detection in News Articles (2025-07-15)
DCR: Quantifying Data Contamination in LLMs Evaluation (2025-07-15)
Modeling Code: Is Text All You Need? (2025-07-15)
All Eyes, no IMU: Learning Flight Attitude from Vision Alone (2025-07-15)
SentiDrop: A Multi Modal Machine Learning model for Predicting Dropout in Distance Learning (2025-07-14)
GNN-CNN: An Efficient Hybrid Model of Convolutional and Graph Neural Networks for Text Representation (2025-07-10)