Mengzuo Huang, Feng Li, Wuhe Zou, Weidong Zhang
Open-domain dialogue systems have achieved great success thanks to easily obtained single-turn corpora and advances in deep learning, but the multi-turn scenario remains a challenge because of frequent coreference and information omission. In this paper, we investigate incomplete utterance restoration, which recent studies have shown to bring general improvements to multi-turn dialogue systems. Jointly inspired by autoregression for text generation and sequence labeling for text editing, we propose a novel semi-autoregressive generator (SARG) that offers high efficiency and flexibility. Experiments on two benchmarks show that the proposed model significantly outperforms state-of-the-art models in both quality and inference speed.
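
To make the task concrete, the sketch below shows how an incomplete utterance could be restored by labeling each token with an edit operation and applying those edits. It is only an illustration of the text-editing idea mentioned in the abstract, not the SARG model itself: the tag names (`KEEP`, `DELETE`, `CHANGE`), the `apply_edits` helper, and the example dialogue are all hypothetical, and the edits are written by hand rather than predicted.

```python
from typing import List, Optional, Tuple

# Hypothetical tag set for illustration; not claimed to be the paper's exact scheme.
KEEP, DELETE, CHANGE = "KEEP", "DELETE", "CHANGE"

# One edit per token: a tag plus an optional replacement phrase (used by CHANGE).
Edit = Tuple[str, Optional[List[str]]]


def apply_edits(tokens: List[str], edits: List[Edit]) -> List[str]:
    """Apply one edit per token and return the restored token sequence."""
    if len(tokens) != len(edits):
        raise ValueError("expected exactly one edit per token")
    restored: List[str] = []
    for token, (tag, phrase) in zip(tokens, edits):
        if tag == KEEP:
            restored.append(token)          # copy the token unchanged
        elif tag == CHANGE:
            restored.extend(phrase or [])   # substitute the replacement phrase
        elif tag == DELETE:
            continue                        # drop the token
        else:
            raise ValueError(f"unknown tag: {tag}")
    return restored


# Hand-written toy example: the pronoun "it" refers to "jazz" in the context.
context = ["do", "you", "like", "jazz", "?"]
utterance = ["why", "do", "you", "ask", "about", "it", "?"]
edits: List[Edit] = [(KEEP, None)] * 5 + [(CHANGE, ["jazz"])] + [(KEEP, None)]

restored = apply_edits(utterance, edits)
print(" ".join(restored))  # why do you ask about jazz ?
```

In this toy setting, assigning the tags is a parallel labeling step, and only the replacement phrase would need to be generated token by token. That division of labor is one plausible reading of how labeling and autoregressive generation can be combined.
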
| Task | Dataset | Metric | Value | Model |
|---|---|---|---|---|
| Dialogue Rewriting | Multi-Rewrite | Rewriting F2 | 52.5 | SARG (n_beam=5) |
| Dialogue Rewriting | Multi-Rewrite | Rewriting F3 | 46.4 | SARG (n_beam=5) |
| Dialogue Rewriting | Multi-Rewrite | BLEU-1 | 92.2 | SARG (greedy) |
| Dialogue Rewriting | Multi-Rewrite | BLEU-2 | 89.6 | SARG (greedy) |
| Dialogue Rewriting | Multi-Rewrite | ROUGE-1 | 92.1 | SARG (greedy) |
| Dialogue Rewriting | Multi-Rewrite | ROUGE-2 | 86 | SARG (greedy) |
| Dialogue Rewriting | Multi-Rewrite | Rewriting F1 | 62.4 | SARG (greedy) |
| Dialogue Rewriting | CANARD | BLEU | 54.8 | SARG |