Tsendsuren Munkhdalai, Hong Yu
We present Neural Semantic Encoders (NSE), a memory-augmented neural network for natural language understanding. NSE is equipped with a novel memory update rule and a variable-sized encoding memory that evolves over time, maintaining an understanding of the input sequence through read, compose, and write operations. NSE can also access multiple and shared memories. We demonstrate the effectiveness and flexibility of NSE on five natural language tasks: natural language inference, question answering, sentence classification, document sentiment analysis, and machine translation, where NSE achieved state-of-the-art performance on publicly available benchmarks. For example, our shared-memory model showed an encouraging result on neural machine translation, improving an attention-based baseline by approximately 1.0 BLEU.
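The read, compose, and write operations over a variable-sized memory can be illustrated with a minimal sketch. This is an assumption-laden simplification, not the paper's exact model: attention-based read, a single tanh layer in place of the compose module, and a soft overwrite as the write step; the function name `nse_step` and weight `W_c` are hypothetical.

```python
import numpy as np

def softmax(x):
    # Numerically stable softmax over a 1-D score vector
    e = np.exp(x - x.max())
    return e / e.sum()

def nse_step(x_t, memory, W_c):
    """One simplified NSE time step: read, compose, write.

    x_t:    (d,)  encoding of the current input token
    memory: (T, d) memory with one slot per input position
    W_c:    (2d, d) weights of a toy compose layer (the paper uses
            learned networks here; this linear+tanh stand-in is an
            assumption for illustration only)
    """
    # Read: attend over memory slots with the current input
    z = softmax(memory @ x_t)            # (T,) attention weights
    m_t = z @ memory                     # (d,) retrieved memory vector
    # Compose: combine the input with the retrieved memory
    c_t = np.tanh(np.concatenate([x_t, m_t]) @ W_c)  # (d,)
    # Write: overwrite each slot in proportion to its attention weight,
    # retaining the rest of its previous content
    memory = (1.0 - z)[:, None] * memory + z[:, None] * c_t
    return c_t, memory
```

The write step shows how the memory "evolves over time": heavily attended slots are mostly replaced by the composed vector, while unattended slots are left nearly unchanged.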
| Task | Dataset | Metric | Value | Model |
|---|---|---|---|---|
| Machine Translation | WMT2014 English-German | BLEU score | 17.9 | NSE-NSE |
| Question Answering | WikiQA | MAP | 0.6811 | MMA-NSE attention |
| Question Answering | WikiQA | MRR | 0.6993 | MMA-NSE attention |
| Natural Language Inference | SNLI | % Test Accuracy | 85.4 | 300D MMA-NSE encoders with attention |
| Natural Language Inference | SNLI | % Train Accuracy | 86.9 | 300D MMA-NSE encoders with attention |
| Natural Language Inference | SNLI | % Test Accuracy | 84.6 | 300D NSE encoders |
| Natural Language Inference | SNLI | % Train Accuracy | 86.2 | 300D NSE encoders |
| Sentiment Analysis | SST-2 Binary classification | Accuracy | 89.7 | Neural Semantic Encoder |