Baseline Needs More Love: On Simple Word-Embedding-Based Models and Associated Pooling Mechanisms

Dinghan Shen, Guoyin Wang, Wenlin Wang, Martin Renqiang Min, Qinliang Su, Yizhe Zhang, Chunyuan Li, Ricardo Henao, Lawrence Carin

2018-05-24 | ACL 2018 7
Tasks: Text Classification, Subjectivity Analysis, Sentiment Analysis, Word Embeddings, Document Classification, General Classification, Named Entity Recognition (NER)

Abstract

Many deep learning architectures have been proposed to model the compositionality in text sequences, requiring a substantial number of parameters and expensive computations. However, there has not been a rigorous evaluation regarding the added value of sophisticated compositional functions. In this paper, we conduct a point-by-point comparative study between Simple Word-Embedding-based Models (SWEMs), consisting of parameter-free pooling operations, relative to word-embedding-based RNN/CNN models. Surprisingly, SWEMs exhibit comparable or even superior performance in the majority of cases considered. Based upon this understanding, we propose two additional pooling strategies over learned word embeddings: (i) a max-pooling operation for improved interpretability; and (ii) a hierarchical pooling operation, which preserves spatial (n-gram) information within text sequences. We present experiments on 17 datasets encompassing three tasks: (i) (long) document classification; (ii) text sequence matching; and (iii) short text tasks, including classification and tagging. The source code and datasets can be obtained from https://github.com/dinghanshen/SWEM.
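The pooling operations named in the abstract can be illustrated with a minimal NumPy sketch. It is not the authors' implementation (see the linked repository for that). The function names, the `window` parameter, and the toy embedding matrix are assumptions made for illustration, and the hierarchical variant is assumed to average within each sliding window and then take a max over windows.

```python
import numpy as np

def swem_aver(emb: np.ndarray) -> np.ndarray:
    """Average pooling over the word axis: (seq_len, dim) -> (dim,)."""
    return emb.mean(axis=0)

def swem_max(emb: np.ndarray) -> np.ndarray:
    """Max pooling over the word axis: keeps the largest value per dimension."""
    return emb.max(axis=0)

def swem_concat(emb: np.ndarray) -> np.ndarray:
    """Concatenation of the average-pooled and max-pooled vectors: (2 * dim,)."""
    return np.concatenate([swem_aver(emb), swem_max(emb)])

def swem_hier(emb: np.ndarray, window: int = 3) -> np.ndarray:
    """Hierarchical pooling (assumed form): average each sliding window of
    `window` consecutive words, then max-pool over the window averages."""
    seq_len = emb.shape[0]
    if seq_len <= window:
        # Sequence shorter than one window: fall back to plain averaging.
        return emb.mean(axis=0)
    local = np.stack(
        [emb[i:i + window].mean(axis=0) for i in range(seq_len - window + 1)]
    )
    return local.max(axis=0)

# Toy input: 4 words with 2-dimensional embeddings (hypothetical values).
emb = np.array([[1.0, 4.0], [3.0, 0.0], [5.0, 2.0], [7.0, 6.0]])
print(swem_aver(emb))            # per-dimension mean
print(swem_max(emb))             # per-dimension max
print(swem_concat(emb))          # mean followed by max
print(swem_hier(emb, window=2))  # max over averaged 2-word windows
```

None of these functions has trainable parameters; in such a model only the word embeddings and the classifier on top of the pooled vector would be learned. The windowed averaging in `swem_hier` is what lets word order influence the result, since shuffling the words changes the window averages.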

Results

| Task | Dataset | Metric | Value | Model |
|---|---|---|---|---|
| Question Answering | WikiQA | MAP | 0.6788 | SWEM-concat |
| Question Answering | WikiQA | MRR | 0.6908 | SWEM-concat |
| Natural Language Inference | SNLI | % Test Accuracy | 83.8 | SWEM-max |
| Natural Language Inference | MultiNLI | Matched | 68.2 | SWEM-max |
| Natural Language Inference | MultiNLI | Mismatched | 67.7 | SWEM-max |
| Semantic Textual Similarity | MSRP | Accuracy | 71.5 | SWEM-concat |
| Semantic Textual Similarity | MSRP | F1 | 81.3 | SWEM-concat |
| Sentiment Analysis | MR | Accuracy | 78.2 | SWEM-concat |
| Sentiment Analysis | SST-5 Fine-grained classification | Accuracy | 46.1 | SWEM-concat |
| Sentiment Analysis | Yelp Fine-grained classification | Error | 36.21 | SWEM-hier |
| Sentiment Analysis | SST-2 Binary classification | Accuracy | 84.3 | SWEM-concat |
| Sentiment Analysis | Yelp Binary classification | Error | 4.19 | SWEM-hier |
| Named Entity Recognition (NER) | CoNLL 2000 | F1 | 90.34 | SWEM-CRF |
| Named Entity Recognition (NER) | CoNLL 2003 (English) | F1 | 86.28 | SWEM-CRF |
| Subjectivity Analysis | SUBJ | Accuracy | 93 | SWEM-concat |
| Paraphrase Identification | MSRP | Accuracy | 71.5 | SWEM-concat |
| Paraphrase Identification | MSRP | F1 | 81.3 | SWEM-concat |
| Text Classification | TREC-6 | Error | 7.8 | SWEM-aver |
| Text Classification | DBpedia | Error | 1.43 | SWEM-concat |
| Text Classification | AG News | Error | 7.34 | SWEM-concat |
| Text Classification | Yahoo! Answers | Accuracy | 73.53 | SWEM-concat |
| Classification | TREC-6 | Error | 7.8 | SWEM-aver |
| Classification | DBpedia | Error | 1.43 | SWEM-concat |
| Classification | AG News | Error | 7.34 | SWEM-concat |
| Classification | Yahoo! Answers | Accuracy | 73.53 | SWEM-concat |

Related Papers

Making Language Model a Hierarchical Classifier and Generator (2025-07-17)
AdaptiSent: Context-Aware Adaptive Attention for Multimodal Aspect-Based Sentiment Analysis (2025-07-17)
AI Wizards at CheckThat! 2025: Enhancing Transformer-Based Embeddings with Sentiment for Subjectivity Detection in News Articles (2025-07-15)
DCR: Quantifying Data Contamination in LLMs Evaluation (2025-07-15)
SentiDrop: A Multi Modal Machine Learning model for Predicting Dropout in Distance Learning (2025-07-14)
GNN-CNN: An Efficient Hybrid Model of Convolutional and Graph Neural Networks for Text Representation (2025-07-10)
Speak2Sign3D: A Multi-modal Pipeline for English Speech to American Sign Language Animation (2025-07-09)
Flippi: End To End GenAI Assistant for E-Commerce (2025-07-08)