Neural Machine Translation by Jointly Learning to Align and Translate

Dzmitry Bahdanau, Kyunghyun Cho, Yoshua Bengio

2014-09-01

Machine Translation, Bangla Spelling Error Correction, Dialogue Generation, Translation

Abstract

Neural machine translation is a recently proposed approach to machine translation. Unlike traditional statistical machine translation, neural machine translation aims at building a single neural network that can be jointly tuned to maximize translation performance. The models proposed recently for neural machine translation often belong to a family of encoder-decoders and consist of an encoder that encodes a source sentence into a fixed-length vector from which a decoder generates a translation. In this paper, we conjecture that the use of a fixed-length vector is a bottleneck in improving the performance of this basic encoder-decoder architecture, and propose to extend this by allowing a model to automatically (soft-)search for parts of a source sentence that are relevant to predicting a target word, without having to form these parts as a hard segment explicitly. With this new approach, we achieve a translation performance comparable to the existing state-of-the-art phrase-based system on the task of English-to-French translation. Furthermore, qualitative analysis reveals that the (soft-)alignments found by the model agree well with our intuition.
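The (soft-)search idea in the abstract can be illustrated with a small sketch: instead of one fixed-length vector, the decoder weights every source position at each target step and uses the weighted average as its context. The abstract does not specify how relevance is scored, so the dot-product score below is an assumption made only to keep the example short. The function names and toy values are hypothetical, and this is not the authors' implementation.

```python
import math


def softmax(scores):
    """Turn raw scores into positive weights that sum to 1."""
    m = max(scores)  # subtract the max for numerical stability
    exps = [math.exp(s - m) for s in scores]
    total = sum(exps)
    return [e / total for e in exps]


def soft_search(decoder_state, source_annotations):
    """Return (weights, context) for a single target step.

    Assumption: relevance is scored with a plain dot product between the
    decoder state and each source annotation. This is an illustrative
    choice, not something stated in the abstract.
    """
    scores = [
        sum(d * h for d, h in zip(decoder_state, ann))
        for ann in source_annotations
    ]
    # Soft alignment: one weight per source position, no hard segmentation.
    weights = softmax(scores)
    dim = len(source_annotations[0])
    # Context vector: weighted average of the source annotations.
    context = [
        sum(w * ann[k] for w, ann in zip(weights, source_annotations))
        for k in range(dim)
    ]
    return weights, context


# Hypothetical toy values: three source positions, 2-dimensional vectors.
annotations = [[1.0, 0.0], [0.0, 1.0], [1.0, 1.0]]
state = [2.0, 0.0]
weights, context = soft_search(state, annotations)
```

Because the weights are recomputed for each target word, the context can focus on different source positions at different steps, which is the property the fixed-length vector lacks.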

Results

Task	Dataset	Metric	Value	Model
Dialogue	Persona-Chat	Avg F1	16.18	Seq2Seq + Attention
Machine Translation	IWSLT2015 German-English	BLEU score	28.53	Bi-GRU (MLE+SLE)
Machine Translation	WMT2014 English-French	BLEU score	36.2	RNN-search50*
Text Generation	Persona-Chat	Avg F1	16.18	Seq2Seq + Attention
Text Generation	DPCSpell-Bangla-SEC-Corpus	Exact Match Accuracy	75.56	GRUSeq2Seq
Chatbot	Persona-Chat	Avg F1	16.18	Seq2Seq + Attention
Handwriting Verification	DPCSpell-Bangla-SEC-Corpus	Exact Match Accuracy	75.56	GRUSeq2Seq
Dialogue Generation	Persona-Chat	Avg F1	16.18	Seq2Seq + Attention
Spelling Correction	DPCSpell-Bangla-SEC-Corpus	Exact Match Accuracy	75.56	GRUSeq2Seq
