Semi-Amortized Variational Autoencoders

Yoon Kim, Sam Wiseman, Andrew C. Miller, David Sontag, Alexander M. Rush

2018-02-07 | ICML 2018 7 | Text Generation | Variational Inference

Abstract

Amortized variational inference (AVI) replaces instance-specific local inference with a global inference network. While AVI has enabled efficient training of deep generative models such as variational autoencoders (VAEs), recent empirical work suggests that inference networks can produce suboptimal variational parameters. We propose a hybrid approach: use AVI to initialize the variational parameters, then run stochastic variational inference (SVI) to refine them. Crucially, the local SVI procedure is itself differentiable, so the inference network and generative model can be trained end-to-end with gradient-based optimization. This semi-amortized approach enables the use of rich generative models without experiencing the posterior-collapse phenomenon common in training VAEs for problems like text generation. Experiments show this approach outperforms strong autoregressive and variational baselines on standard text and image datasets.
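
The two-stage inference described in the abstract (an inference network proposes initial variational parameters, then local gradient steps on the ELBO refine them) can be illustrated with a minimal sketch. This is not the authors' implementation. It assumes a toy scalar linear-Gaussian model, p(z) = N(0, 1) and p(x | z) = N(theta * z, 1), chosen because its ELBO and exact posterior have closed forms. As a result the refinement here is deterministic gradient ascent rather than stochastic, and the one-parameter `encoder`, the step size, and the number of steps are all hypothetical. Backpropagating through the refinement steps to train the encoder and the generative model, which the abstract identifies as the crucial part, is not shown; it would need an automatic-differentiation framework.

```python
import math


def elbo(x, theta, mu, log_var):
    """Closed-form ELBO for the assumed toy model with q(z) = N(mu, exp(log_var))."""
    var = math.exp(log_var)
    # E_q[log p(x | z)] for a unit-variance Gaussian likelihood with mean theta * z
    expected_log_lik = -0.5 * math.log(2 * math.pi) - 0.5 * (
        (x - theta * mu) ** 2 + theta ** 2 * var
    )
    # KL(q(z) || N(0, 1))
    kl = 0.5 * (mu ** 2 + var - 1.0 - log_var)
    return expected_log_lik - kl


def elbo_grad(x, theta, mu, log_var):
    """Analytic gradient of the ELBO with respect to (mu, log_var)."""
    var = math.exp(log_var)
    d_mu = theta * (x - theta * mu) - mu
    d_log_var = 0.5 - 0.5 * (theta ** 2 + 1.0) * var
    return d_mu, d_log_var


def encoder(x, w):
    """Hypothetical inference network: mu = w * x, log-variance fixed at 0."""
    return w * x, 0.0


def refine(x, theta, mu, log_var, steps=20, lr=0.1):
    """Local refinement: a few gradient-ascent steps on the ELBO for one data point."""
    for _ in range(steps):
        d_mu, d_log_var = elbo_grad(x, theta, mu, log_var)
        mu += lr * d_mu
        log_var += lr * d_log_var
    return mu, log_var


x, theta, w = 2.0, 1.5, 0.1  # arbitrary illustrative values

# Stage 1: amortized initialization from the (deliberately poor) encoder.
mu_init, log_var_init = encoder(x, w)
# Stage 2: instance-specific refinement starting from that initialization.
mu_ref, log_var_ref = refine(x, theta, mu_init, log_var_init)

elbo_init = elbo(x, theta, mu_init, log_var_init)
elbo_ref = elbo(x, theta, mu_ref, log_var_ref)

# Exact posterior of the toy model, used only to check the refinement.
mu_star = theta * x / (theta ** 2 + 1.0)
log_var_star = -math.log(theta ** 2 + 1.0)

print(f"ELBO at encoder output: {elbo_init:.4f}")
print(f"ELBO after refinement:  {elbo_ref:.4f}")
```

In this toy setting the refined parameters move from the encoder's output toward the exact posterior and the ELBO increases, which is the gap between amortized and instance-specific inference that the semi-amortized approach targets.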

Results

Task	Dataset	Metric	Value	Model
Text Generation	Yahoo Questions	KL	7.19	SA-VAE
Text Generation	Yahoo Questions	NLL	327.5	SA-VAE
Text Generation	Yahoo Questions	Perplexity	60.4	SA-VAE
