Tom Hosking, Hao Tang, Mirella Lapata
We propose a generative model for paraphrase generation that encourages syntactic diversity by conditioning on an explicit syntactic sketch. We introduce Hierarchical Refinement Quantized Variational Autoencoders (HRQ-VAE), a method for learning decompositions of dense encodings as sequences of discrete latent variables that make iterative refinements of increasing granularity. This hierarchy of codes is learned through end-to-end training, and represents fine-to-coarse grained information about the input. We use HRQ-VAE to encode the syntactic form of an input sentence as a path through the hierarchy, allowing us to more easily predict syntactic sketches at test time. Extensive experiments, including a human evaluation, confirm that HRQ-VAE learns a hierarchical representation of the input space, and generates paraphrases of higher quality than previous systems.
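The hierarchical refinement idea, encoding a dense vector as a sequence of discrete codes where each level quantizes the residual left by the levels above it, can be illustrated with a minimal sketch. This is not the paper's implementation: the codebooks here are fixed and random rather than learned end-to-end, and all names and sizes are illustrative assumptions.

```python
import numpy as np

rng = np.random.default_rng(0)
DEPTH, CODEBOOK_SIZE, DIM = 3, 8, 4  # illustrative sizes, not from the paper

# One codebook per level, with progressively smaller codewords so deeper
# levels make finer refinements. A zero codeword is included at each level
# so that quantizing a residual can never increase its norm.
codebooks = [
    np.vstack([np.zeros((1, DIM)),
               rng.normal(scale=1.0 / (2 ** d), size=(CODEBOOK_SIZE, DIM))])
    for d in range(DEPTH)
]

def hrq_encode(x):
    """Encode x as a path of code indices; each level quantizes the
    residual left over by the levels above it."""
    recon = np.zeros_like(x)
    path = []
    for cb in codebooks:
        residual = x - recon
        # Pick the codeword nearest to the current residual.
        idx = int(np.argmin(np.linalg.norm(cb - residual, axis=1)))
        path.append(idx)
        recon = recon + cb[idx]
    return path, recon

x = rng.normal(size=DIM)
path, recon = hrq_encode(x)
print("path:", path)
print("reconstruction error:", float(np.linalg.norm(x - recon)))
```

The path of indices plays the role of the discrete latent sequence: early codes capture coarse structure of the encoding, and each later code is a smaller correction, so truncating the path yields a coarser approximation.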
| Task | Dataset | Metric | Value | Model |
|---|---|---|---|---|
| Paraphrase Generation | Quora Question Pairs | BLEU | 33.11 | HRQ-VAE |
| Paraphrase Generation | Quora Question Pairs | iBLEU | 18.42 | HRQ-VAE |
| Paraphrase Generation | Paralex | BLEU | 39.49 | HRQ-VAE |
| Paraphrase Generation | Paralex | iBLEU | 24.93 | HRQ-VAE |
| Paraphrase Generation | MSCOCO | BLEU | 27.9 | HRQ-VAE |
| Paraphrase Generation | MSCOCO | iBLEU | 19.04 | HRQ-VAE |