Xinyi Wang, Jason Weston, Michael Auli, Yacine Jernite
Neural sequence-to-sequence models are well established for applications that can be cast as mapping a single input sequence into a single output sequence. In this work, we focus on cases where generation is conditioned on both a short query and a long context, such as abstractive question answering or document-level translation. We modify the standard sequence-to-sequence approach to make better use of both the query and the context by expanding the conditioning mechanism to intertwine query and context attention. We also introduce a simple and efficient data augmentation method for the proposed model. Experiments on three different tasks show that both changes lead to consistent improvements.
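The abstract describes intertwining attention over a short query and a long context within the decoder's conditioning mechanism. The following is a minimal NumPy sketch of one way such interleaved conditioning could work, assuming simple dot-product attention with a residual update; all function names and the specific interleaving order (query first, then context) are illustrative assumptions, not the paper's exact architecture.

```python
import numpy as np

def softmax(x, axis=-1):
    # Numerically stable softmax
    e = np.exp(x - x.max(axis=axis, keepdims=True))
    return e / e.sum(axis=axis, keepdims=True)

def attend(h, memory):
    # Dot-product attention: h is (d,), memory is (n, d);
    # returns a weighted summary vector of shape (d,)
    scores = memory @ h
    weights = softmax(scores)
    return weights @ memory

def interleaved_step(h, query_mem, context_mem):
    # Hypothetical interleaved conditioning: attend to the short
    # query first, then use the query-informed state to attend to
    # the long context, so context attention is query-aware.
    q_vec = attend(h, query_mem)
    h_q = h + q_vec                # residual update with query summary
    c_vec = attend(h_q, context_mem)
    return h_q + c_vec             # state conditioned on both sources
```

In a full model this step would run once per decoder layer (or per decoding position), so query and context attention alternate throughout generation rather than being concatenated into a single input.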
| Task | Dataset | Metric | Value | Model |
|---|---|---|---|---|
| Question Answering | ELI5 | Rouge-1 | 23.32 | Multi-Interleave |
| Question Answering | ELI5 | Rouge-2 | 4.79 | Multi-Interleave |
| Question Answering | ELI5 | Rouge-L | 14.63 | Multi-Interleave |