Dongling Xiao, Han Zhang, Yukun Li, Yu Sun, Hao Tian, Hua Wu, Haifeng Wang
Current pre-training works in natural language generation pay little attention to the problem of exposure bias on downstream tasks. To address this issue, we propose an enhanced multi-flow sequence-to-sequence pre-training and fine-tuning framework named ERNIE-GEN, which bridges the discrepancy between training and inference with an infilling generation mechanism and a noise-aware generation method. To make generation closer to human writing patterns, this framework introduces a span-by-span generation flow that trains the model to predict semantically complete spans consecutively rather than predicting word by word. Unlike existing pre-training methods, ERNIE-GEN incorporates multi-granularity target sampling to construct pre-training data, which enhances the correlation between encoder and decoder. Experimental results demonstrate that ERNIE-GEN achieves state-of-the-art results with a much smaller amount of pre-training data and parameters on a range of language generation tasks, including abstractive summarization (Gigaword and CNN/DailyMail), question generation (SQuAD), dialogue generation (Persona-Chat) and generative question answering (CoQA).
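As a rough illustration of two ideas named in the abstract, the sketch below shows (a) one plausible form of noise-aware training input, where some ground-truth target tokens are swapped for random vocabulary tokens so the model learns to cope with imperfect history, and (b) how a target sequence can be split into spans that are predicted one span at a time. The function names `add_noise` and `span_steps`, the `noise_rate` parameter, and the replacement scheme are assumptions made for this example; the abstract does not specify them, and this is not the authors' implementation.

```python
import random


def add_noise(target_tokens, vocab, noise_rate=0.05, rng=None):
    """Return a copy of target_tokens with some tokens replaced at random.

    Assumption: noise is injected by replacing a token with a uniformly
    sampled vocabulary token with probability noise_rate.
    """
    rng = rng or random.Random()
    noisy = []
    for tok in target_tokens:
        if rng.random() < noise_rate:
            noisy.append(rng.choice(vocab))  # corrupted history token
        else:
            noisy.append(tok)  # original ground-truth token
    return noisy


def span_steps(target_tokens, span_lengths):
    """Split a target into (history, span) pairs, one pair per prediction step.

    Assumption: span boundaries are given as a list of lengths that
    exactly cover the target sequence.
    """
    if sum(span_lengths) != len(target_tokens):
        raise ValueError("span_lengths must cover the whole target")
    steps = []
    start = 0
    for length in span_lengths:
        end = start + length
        # The model sees everything before the span and predicts the span.
        steps.append((target_tokens[:start], target_tokens[start:end]))
        start = end
    return steps


if __name__ == "__main__":
    target = ["the", "new", "york", "times", "reported"]
    vocab = ["a", "b", "c"]
    print(add_noise(target, vocab, noise_rate=0.3, rng=random.Random(0)))
    for history, span in span_steps(target, [1, 3, 1]):
        print(history, "->", span)
```

In this toy setup, word-by-word generation corresponds to `span_lengths` of all ones, while the span-by-span view lets a multi-word unit such as "new york times" be predicted as a single step.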
| Task | Dataset | Metric | Value | Model |
|---|---|---|---|---|
| Question Answering | CoQA | F1-Score | 84.5 | ERNIE-GEN |
| Text Summarization | GigaWord | ROUGE-1 | 39.46 | ERNIE-GEN LARGE (large-scale text corpora) |
| Text Summarization | GigaWord | ROUGE-2 | 20.34 | ERNIE-GEN LARGE (large-scale text corpora) |
| Text Summarization | GigaWord | ROUGE-L | 36.74 | ERNIE-GEN LARGE (large-scale text corpora) |
| Text Summarization | GigaWord | ROUGE-1 | 39.25 | ERNIE-GEN LARGE |
| Text Summarization | GigaWord | ROUGE-2 | 20.25 | ERNIE-GEN LARGE |
| Text Summarization | GigaWord | ROUGE-L | 36.53 | ERNIE-GEN LARGE |
| Text Summarization | GigaWord | ROUGE-1 | 38.83 | ERNIE-GEN BASE |
| Text Summarization | GigaWord | ROUGE-2 | 20.04 | ERNIE-GEN BASE |
| Text Summarization | GigaWord | ROUGE-L | 36.20 | ERNIE-GEN BASE |
| Text Summarization | GigaWord-10k | ROUGE-1 | 35.51 | ERNIE-GEN LARGE (large-scale text corpora) |
| Text Summarization | GigaWord-10k | ROUGE-2 | 16.79 | ERNIE-GEN LARGE (large-scale text corpora) |
| Text Summarization | GigaWord-10k | ROUGE-L | 33.23 | ERNIE-GEN LARGE (large-scale text corpora) |
| Text Summarization | GigaWord-10k | ROUGE-1 | 35.05 | ERNIE-GEN LARGE |
| Text Summarization | GigaWord-10k | ROUGE-2 | 16.10 | ERNIE-GEN LARGE |
| Text Summarization | GigaWord-10k | ROUGE-L | 32.50 | ERNIE-GEN LARGE |
| Text Summarization | GigaWord-10k | ROUGE-1 | 33.75 | ERNIE-GEN BASE |
| Text Summarization | GigaWord-10k | ROUGE-2 | 15.23 | ERNIE-GEN BASE |
| Text Summarization | GigaWord-10k | ROUGE-L | 31.35 | ERNIE-GEN BASE |
| Text Summarization | CNN / Daily Mail | ROUGE-1 | 44.31 | ERNIE-GEN LARGE (large-scale text corpora) |
| Text Summarization | CNN / Daily Mail | ROUGE-2 | 21.35 | ERNIE-GEN LARGE (large-scale text corpora) |
| Text Summarization | CNN / Daily Mail | ROUGE-L | 41.60 | ERNIE-GEN LARGE (large-scale text corpora) |
| Text Summarization | CNN / Daily Mail | ROUGE-1 | 44.02 | ERNIE-GEN LARGE |
| Text Summarization | CNN / Daily Mail | ROUGE-2 | 21.17 | ERNIE-GEN LARGE |
| Text Summarization | CNN / Daily Mail | ROUGE-L | 41.26 | ERNIE-GEN LARGE |
| Text Summarization | CNN / Daily Mail | ROUGE-1 | 42.30 | ERNIE-GEN BASE |
| Text Summarization | CNN / Daily Mail | ROUGE-2 | 19.92 | ERNIE-GEN BASE |
| Text Summarization | CNN / Daily Mail | ROUGE-L | 39.68 | ERNIE-GEN BASE |
| Abstractive Text Summarization | CNN / Daily Mail | ROUGE-1 | 44.31 | ERNIE-GEN LARGE (large-scale text corpora) |
| Abstractive Text Summarization | CNN / Daily Mail | ROUGE-2 | 21.35 | ERNIE-GEN LARGE (large-scale text corpora) |
| Abstractive Text Summarization | CNN / Daily Mail | ROUGE-L | 41.60 | ERNIE-GEN LARGE (large-scale text corpora) |
| Abstractive Text Summarization | CNN / Daily Mail | ROUGE-1 | 44.02 | ERNIE-GEN LARGE |
| Abstractive Text Summarization | CNN / Daily Mail | ROUGE-2 | 21.17 | ERNIE-GEN LARGE |
| Abstractive Text Summarization | CNN / Daily Mail | ROUGE-L | 41.26 | ERNIE-GEN LARGE |
| Abstractive Text Summarization | CNN / Daily Mail | ROUGE-1 | 42.30 | ERNIE-GEN BASE |
| Abstractive Text Summarization | CNN / Daily Mail | ROUGE-2 | 19.92 | ERNIE-GEN BASE |
| Abstractive Text Summarization | CNN / Daily Mail | ROUGE-L | 39.68 | ERNIE-GEN BASE |
| Question Generation | SQuAD1.1 | BLEU-4 | 25.41 | ERNIE-GEN LARGE (beam size=5) |