Long Short-Term Memory-Networks for Machine Reading

Jianpeng Cheng, Li Dong, Mirella Lapata

2016-01-25 | EMNLP 2016 | Reading Comprehension, Sentiment Analysis, Natural Language Inference, Language Modelling

Abstract

In this paper we address the question of how to render sequence-level networks better at handling structured input. We propose a machine reading simulator which processes text incrementally from left to right and performs shallow reasoning with memory and attention. The reader extends the Long Short-Term Memory architecture with a memory network in place of a single memory cell. This enables adaptive memory usage during recurrence with neural attention, offering a way to weakly induce relations among tokens. The system is initially designed to process a single sequence but we also demonstrate how to integrate it with an encoder-decoder architecture. Experiments on language modeling, sentiment analysis, and natural language inference show that our model matches or outperforms the state of the art.
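The recurrence described in the abstract, an LSTM whose single memory cell is replaced by a growing store of past states read through attention, can be sketched roughly as follows. This is a minimal NumPy illustration, not the authors' implementation. The abstract does not specify the scoring function, the gate layout, or the initialization, so the additive attention form, the zero-valued initial slot, the random weights, and every name here (`init_params`, `lstmn_forward`, `W_h`, `W_x`, `W_s`, `v`, `W_g`) are assumptions made for illustration.

```python
import numpy as np

def softmax(z):
    e = np.exp(z - z.max())  # subtract max for numerical stability
    return e / e.sum()

def sigmoid(z):
    return 1.0 / (1.0 + np.exp(-z))

def init_params(d_in, d_hid, rng, scale=0.1):
    # Hypothetical parameter set; shapes chosen for this sketch only.
    return {
        "W_h": rng.normal(0, scale, (d_hid, d_hid)),  # projects each stored hidden state
        "W_x": rng.normal(0, scale, (d_hid, d_in)),   # projects the current token
        "W_s": rng.normal(0, scale, (d_hid, d_hid)),  # projects the previous attention summary
        "v": rng.normal(0, scale, d_hid),             # reduces each projection to a scalar score
        "W_g": rng.normal(0, scale, (4 * d_hid, d_hid + d_in)),  # input, forget, output, candidate
        "b_g": np.zeros(4 * d_hid),
    }

def lstmn_forward(xs, p):
    d_hid = p["v"].shape[0]
    # One slot per processed token, plus a zero initial slot (an assumption)
    # so that the first token has something to attend over.
    hidden_tape = [np.zeros(d_hid)]
    memory_tape = [np.zeros(d_hid)]
    h_summary = np.zeros(d_hid)
    attention = []
    for x in xs:  # left to right, one token at a time
        H = np.stack(hidden_tape)
        C = np.stack(memory_tape)
        # Additive attention score for every stored slot.
        scores = np.tanh(H @ p["W_h"].T + p["W_x"] @ x + p["W_s"] @ h_summary) @ p["v"]
        s = softmax(scores)
        h_summary = s @ H  # attention-weighted hidden state
        c_summary = s @ C  # attention-weighted memory
        gates = p["W_g"] @ np.concatenate([h_summary, x]) + p["b_g"]
        i, f, o, g = np.split(gates, 4)
        # Standard LSTM update, but reading from the attended memory
        # instead of the single previous cell.
        c = sigmoid(f) * c_summary + sigmoid(i) * np.tanh(g)
        h = sigmoid(o) * np.tanh(c)
        hidden_tape.append(h)
        memory_tape.append(c)
        attention.append(s)
    return np.stack(hidden_tape[1:]), attention

rng = np.random.default_rng(0)
params = init_params(d_in=8, d_hid=16, rng=rng)
tokens = rng.normal(size=(5, 8))  # five random stand-in token embeddings
hidden, attn = lstmn_forward(tokens, params)
```

Each entry of `attn` is a distribution over the slots available at that step, which is where the weakly induced relations among tokens would show up in a trained model. With random weights, as here, the distributions carry no linguistic meaning.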

Results

Task	Dataset	Metric	Value	Model
Natural Language Inference	SNLI	% Test Accuracy	86.3	450D LSTMN with deep attention fusion
Natural Language Inference	SNLI	% Train Accuracy	88.5	450D LSTMN with deep attention fusion
Natural Language Inference	SNLI	% Test Accuracy	85.7	300D LSTMN with deep attention fusion
Natural Language Inference	SNLI	% Train Accuracy	87.3	300D LSTMN with deep attention fusion
