
PALM: Pre-training an Autoencoding&Autoregressive Language Model for Context-conditioned Generation

Bin Bi, Chenliang Li, Chen Wu, Ming Yan, Wei Wang, Songfang Huang, Fei Huang, Luo Si

2020-04-14
Tasks: Denoising, Question Answering, Text Generation, Abstractive Text Summarization, Text Summarization, Generative Question Answering, Natural Language Understanding, Conversational Response Generation, Question Generation, Language Modelling, Response Generation

Abstract

Self-supervised pre-training, as in BERT, MASS, and BART, has emerged as a powerful technique for natural language understanding and generation. Existing pre-training techniques employ autoencoding and/or autoregressive objectives to train Transformer-based models by recovering the original word tokens from corrupted text in which some tokens have been masked. These training goals are often inconsistent with the goals of many language generation tasks, such as generative question answering and conversational response generation, which produce new text given context. This work presents PALM, a novel scheme that jointly pre-trains an autoencoding and autoregressive language model on a large unlabeled corpus, specifically designed for generating new text conditioned on context. The new scheme alleviates the mismatch between pre-training and fine-tuning introduced by existing denoising schemes, in which generation requires more than reconstructing the original text. An extensive set of experiments shows that PALM achieves new state-of-the-art results on a variety of language generation benchmarks covering generative question answering (Rank 1 on the official MARCO leaderboard), abstractive summarization on CNN/DailyMail and Gigaword, question generation on SQuAD, and conversational response generation on Cornell Movie Dialogues.
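The core idea described in the abstract is a split objective: the encoder reconstructs masked tokens in a context segment (autoencoding), while the decoder generates the text that follows that context (autoregressive). Below is a minimal, hypothetical Python sketch of how one such pre-training example might be assembled. The 80/20 context split and 15% masking rate are assumptions loosely following the paper's described setup, and the helper make_palm_example is an invented name for illustration, not the authors' code.

    import random

    MASK = "[MASK]"  # placeholder mask token; real subword tokenization omitted

    def make_palm_example(tokens, context_frac=0.8, mask_prob=0.15, rng=random):
        """Assemble one PALM-style pre-training example from a token list.

        The front of the passage becomes the encoder input, with random
        tokens masked (autoencoding/MLM target); the tail becomes the
        continuation the decoder must generate autoregressively.
        """
        split = max(1, int(len(tokens) * context_frac))
        context, continuation = tokens[:split], tokens[split:]

        encoder_input, mlm_labels = [], []
        for tok in context:
            if rng.random() < mask_prob:
                encoder_input.append(MASK)  # corrupt the context
                mlm_labels.append(tok)      # encoder must reconstruct this token
            else:
                encoder_input.append(tok)
                mlm_labels.append(None)     # no loss at unmasked positions

        return {
            "encoder_input": encoder_input,   # autoencoding side
            "mlm_labels": mlm_labels,
            "decoder_target": continuation,   # autoregressive side
        }

    text = "self supervised pre training for context conditioned text generation"
    print(make_palm_example(text.split()))

Because the decoder target is genuinely new text rather than a copy of the corrupted input, fine-tuning on conditional generation tasks sees the same input/output shape as pre-training, which is the mismatch the paper aims to remove.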

Results

Task                            Dataset           Metric   Value  Model
Text Generation                 CNN / Daily Mail  ROUGE-L  41.41  PALM
Text Summarization              GigaWord          ROUGE-1  39.45  PALM
Text Summarization              GigaWord          ROUGE-2  20.37  PALM
Text Summarization              GigaWord          ROUGE-L  36.75  PALM
Text Summarization              CNN / Daily Mail  ROUGE-1  44.30  PALM
Text Summarization              CNN / Daily Mail  ROUGE-2  21.12  PALM
Text Summarization              CNN / Daily Mail  ROUGE-L  41.41  PALM
Abstractive Text Summarization  CNN / Daily Mail  ROUGE-1  44.30  PALM
Abstractive Text Summarization  CNN / Daily Mail  ROUGE-2  21.12  PALM
Abstractive Text Summarization  CNN / Daily Mail  ROUGE-L  41.41  PALM
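The values above are ROUGE scores, typically reported as F1 on these benchmarks. For reference, here is a minimal sketch of computing ROUGE-1/2/L with Google's rouge-score package; this is an assumed tooling choice, and the paper's exact evaluation scripts may differ.

    # pip install rouge-score
    from rouge_score import rouge_scorer

    scorer = rouge_scorer.RougeScorer(["rouge1", "rouge2", "rougeL"], use_stemmer=True)

    reference = "the cat was found under the bed"
    prediction = "the cat was under the bed"

    # score(target, prediction) returns a dict of Score tuples
    # with precision, recall, and F-measure per ROUGE variant.
    for name, s in scorer.score(reference, prediction).items():
        print(f"{name}: P={s.precision:.4f} R={s.recall:.4f} F1={s.fmeasure:.4f}")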
