Deunsol Yoon, Dongbok Lee, SangKeun Lee
In this paper, we propose Dynamic Self-Attention (DSA), a new self-attention mechanism for sentence embedding. We design DSA by adapting dynamic routing in capsule networks (Sabour et al., 2017) to natural language processing. DSA attends to informative words with a dynamic weight vector. We achieve new state-of-the-art results among sentence encoding methods on the Stanford Natural Language Inference (SNLI) dataset with the fewest parameters, while showing comparable results on the Stanford Sentiment Treebank (SST) dataset.
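To make the routing idea concrete, below is a minimal PyTorch sketch of dynamic-routing-style attention pooling over word vectors. This is an illustration under assumptions, not the authors' exact formulation: the function name `dynamic_self_attention`, the iteration count, and the reuse of the capsule-network squash non-linearity are choices made for the sketch.

```python
import torch
import torch.nn.functional as F

def dynamic_self_attention(h, num_iters=3):
    """Sketch of dynamic-routing-style self-attention pooling.

    h: (batch, seq_len, dim) word representations.
    Returns a (batch, dim) sentence embedding.
    """
    batch, seq_len, dim = h.shape
    # Routing logits, initialized to zero as in Sabour et al. (2017).
    b = torch.zeros(batch, seq_len, device=h.device)
    z = h.mean(dim=1)
    for _ in range(num_iters):
        # Attention weights over words from the current logits.
        a = F.softmax(b, dim=1)                     # (batch, seq_len)
        # Candidate sentence vector: attention-weighted sum of words.
        z = torch.einsum('bs,bsd->bd', a, h)        # (batch, dim)
        # Squash non-linearity from the capsule-network paper.
        norm = z.norm(dim=-1, keepdim=True)
        z = (norm ** 2 / (1 + norm ** 2)) * z / (norm + 1e-9)
        # Update logits by the agreement between each word and the
        # current sentence vector, which acts as the dynamic weight
        # vector that is refined across iterations.
        b = b + torch.einsum('bsd,bd->bs', h, z)
    return z
```

The key difference from standard (static) self-attention is that the attention weights are not computed in a single pass from learned parameters alone; they are refined over a few routing iterations based on agreement with the evolving sentence vector.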
| Task | Dataset | Metric | Value | Model |
|---|---|---|---|---|
| Natural Language Inference | SNLI | % Test Accuracy | 87.4 | 2400D Multiple-Dynamic Self-Attention Model |
| Natural Language Inference | SNLI | % Train Accuracy | 89.0 | 2400D Multiple-Dynamic Self-Attention Model |
| Natural Language Inference | SNLI | % Test Accuracy | 86.8 | 600D Dynamic Self-Attention Model |
| Natural Language Inference | SNLI | % Train Accuracy | 87.3 | 600D Dynamic Self-Attention Model |