Papers With Code 2

A community resource for machine learning research: papers, code, benchmarks, and state-of-the-art results.


Data sourced from the PWC Archive (CC-BY-SA 4.0). Built by the community, for the community.


Cell-aware Stacked LSTMs for Modeling Sentences

Jihun Choi, Taeuk Kim, Sang-goo Lee

2018-09-07 · Machine Translation · Paraphrase Identification · Sentiment Analysis · Natural Language Inference · Translation · Sentiment Classification

Paper · PDF

Abstract

We propose a method of stacking multiple long short-term memory (LSTM) layers for modeling sentences. In contrast to the conventional stacked LSTMs where only hidden states are fed as input to the next layer, the suggested architecture accepts both hidden and memory cell states of the preceding layer and fuses information from the left and the lower context using the soft gating mechanism of LSTMs. Thus the architecture modulates the amount of information to be delivered not only in horizontal recurrence but also in vertical connections, from which useful features extracted from lower layers are effectively conveyed to upper layers. We dub this architecture Cell-aware Stacked LSTM (CAS-LSTM) and show from experiments that our models bring significant performance gain over the standard LSTMs on benchmark datasets for natural language inference, paraphrase detection, sentiment classification, and machine translation. We also conduct extensive qualitative analysis to understand the internal behavior of the suggested approach.
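The core idea above — feeding both the hidden state and the memory cell state of the lower layer into the upper layer, with an extra soft gate controlling the vertical flow — can be sketched in a few lines. The following NumPy sketch is an illustrative reading of that description, not the paper's exact parameterization: the weight layout, the name of the lower-cell gate, and the assumption that the lower cell state has the same width as the upper hidden state are all choices made here for clarity.

```python
import numpy as np

def sigmoid(x):
    return 1.0 / (1.0 + np.exp(-x))

class CASLSTMCell:
    """Sketch of a Cell-aware Stacked LSTM (CAS-LSTM) upper-layer cell.

    Besides the usual input/forget/output gates, an extra gate modulates
    the memory cell state passed up from the layer below, so information
    is gated along vertical connections as well as horizontal recurrence.
    Weight names and the exact fusion rule are assumptions for this sketch.
    """

    def __init__(self, input_size, hidden_size, seed=0):
        rng = np.random.default_rng(seed)
        z = input_size + hidden_size
        # One weight matrix per gate: input i, forget f, lower-cell gate l,
        # output o, candidate g.
        self.W = {k: rng.standard_normal((z, hidden_size)) * 0.1
                  for k in "iflog"}
        self.b = {k: np.zeros(hidden_size) for k in "iflog"}
        self.hidden_size = hidden_size

    def step(self, x, h_prev, c_prev, c_lower):
        """One time step. x is the lower layer's hidden state h^{l-1}_t;
        c_lower is its memory cell c^{l-1}_t (assumed hidden-size wide)."""
        zcat = np.concatenate([x, h_prev])
        gate = lambda k, act: act(zcat @ self.W[k] + self.b[k])
        i = gate("i", sigmoid)   # input gate
        f = gate("f", sigmoid)   # forget gate (horizontal recurrence)
        l = gate("l", sigmoid)   # soft gate on the lower layer's cell state
        o = gate("o", sigmoid)   # output gate
        g = gate("g", np.tanh)   # candidate cell content
        # Fuse the left context (c_prev) and the lower context (c_lower).
        c = f * c_prev + l * c_lower + i * g
        h = o * np.tanh(c)
        return h, c

# Usage: one step of an upper layer consuming the lower layer's states.
cell = CASLSTMCell(input_size=8, hidden_size=8)
h = c = np.zeros(8)
h_lower, c_lower = np.full(8, 0.1), np.full(8, 0.2)
h, c = cell.step(h_lower, h, c, c_lower)
```

A conventional stacked LSTM corresponds to dropping the `l * c_lower` term; the extra gate is what lets the upper layer decide, per dimension, how much of the lower layer's memory to absorb.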

Results

Task | Dataset | Metric | Value | Model
Natural Language Inference | SNLI | % Test Accuracy | 87 | 300D 2-layer Bi-CAS-LSTM
Semantic Textual Similarity | Quora Question Pairs | Accuracy | 88.6 | Bi-CAS-LSTM
Sentiment Analysis | SST-5 Fine-grained classification | Accuracy | 53.6 | Bi-CAS-LSTM
Sentiment Analysis | SST-2 Binary classification | Accuracy | 91.3 | Bi-CAS-LSTM
Paraphrase Identification | Quora Question Pairs | Accuracy | 88.6 | Bi-CAS-LSTM

Related Papers

AdaptiSent: Context-Aware Adaptive Attention for Multimodal Aspect-Based Sentiment Analysis (2025-07-17)
A Translation of Probabilistic Event Calculus into Markov Decision Processes (2025-07-17)
AI Wizards at CheckThat! 2025: Enhancing Transformer-Based Embeddings with Sentiment for Subjectivity Detection in News Articles (2025-07-15)
DCR: Quantifying Data Contamination in LLMs Evaluation (2025-07-15)
LRCTI: A Large Language Model-Based Framework for Multi-Step Evidence Retrieval and Reasoning in Cyber Threat Intelligence Credibility Verification (2025-07-15)
Function-to-Style Guidance of LLMs for Code Translation (2025-07-15)
SentiDrop: A Multi Modal Machine Learning model for Predicting Dropout in Distance Learning (2025-07-14)
GNN-CNN: An Efficient Hybrid Model of Convolutional and Graph Neural Networks for Text Representation (2025-07-10)