Tom Hosking, Hao Tang, Mirella Lapata
We propose a generative model for paraphrase generation that encourages syntactic diversity by conditioning on an explicit syntactic sketch. We introduce Hierarchical Refinement Quantized Variational Autoencoders (HRQ-VAE), a method for learning decompositions of dense encodings as sequences of discrete latent variables that make iterative refinements of increasing granularity. This hierarchy of codes is learned through end-to-end training, and represents fine-to-coarse grained information about the input. We use HRQ-VAE to encode the syntactic form of an input sentence as a path through the hierarchy, allowing us to more easily predict syntactic sketches at test time. Extensive experiments, including a human evaluation, confirm that HRQ-VAE learns a hierarchical representation of the input space, and generates paraphrases of higher quality than previous systems.
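The hierarchical refinement idea, encoding a dense vector as a sequence of discrete codes where each level quantizes the residual left by the levels above it, can be illustrated with a minimal sketch. This is not the paper's implementation: the codebooks here are fixed and random rather than learned end-to-end, and all names and sizes are illustrative assumptions.

```python
import numpy as np

rng = np.random.default_rng(0)
DEPTH, CODEBOOK_SIZE, DIM = 3, 8, 4  # illustrative sizes, not from the paper

# One codebook per level, with progressively smaller codewords so deeper
# levels make finer refinements. A zero codeword is included at each level
# so that quantizing a residual can never increase its norm.
codebooks = [
    np.vstack([np.zeros((1, DIM)),
               rng.normal(scale=1.0 / (2 ** d), size=(CODEBOOK_SIZE, DIM))])
    for d in range(DEPTH)
]

def hrq_encode(x):
    """Encode x as a path of code indices; each level quantizes the
    residual left over by the levels above it."""
    recon = np.zeros_like(x)
    path = []
    for cb in codebooks:
        residual = x - recon
        # Pick the codeword nearest to the current residual.
        idx = int(np.argmin(np.linalg.norm(cb - residual, axis=1)))
        path.append(idx)
        recon = recon + cb[idx]
    return path, recon

x = rng.normal(size=DIM)
path, recon = hrq_encode(x)
print("path:", path)
print("reconstruction error:", float(np.linalg.norm(x - recon)))
```

The path of indices plays the role of the discrete latent sequence: early codes capture coarse structure of the encoding, and each later code is a smaller correction, so truncating the path yields a coarser approximation.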
| Task | Dataset | Metric | Value | Model |
|---|---|---|---|---|
| Paraphrase Generation | Quora Question Pairs | BLEU | 33.11 | HRQ-VAE |
| Paraphrase Generation | Quora Question Pairs | iBLEU | 18.42 | HRQ-VAE |
| Paraphrase Generation | Paralex | BLEU | 39.49 | HRQ-VAE |
| Paraphrase Generation | Paralex | iBLEU | 24.93 | HRQ-VAE |
| Paraphrase Generation | MSCOCO | BLEU | 27.9 | HRQ-VAE |
| Paraphrase Generation | MSCOCO | iBLEU | 19.04 | HRQ-VAE |