Papers With Code 2

A community resource for machine learning research: papers, code, benchmarks, and state-of-the-art results.


Data sourced from the PWC Archive (CC-BY-SA 4.0). Built by the community, for the community.


KGPT: Knowledge-Grounded Pre-Training for Data-to-Text Generation

Wenhu Chen, Yu Su, Xifeng Yan, William Yang Wang

2020-10-05 · EMNLP 2020
Tasks: KG-to-Text Generation · Data-to-Text Generation · Text Generation · Transfer Learning · General Knowledge
Links: Paper · PDF · Code (official)

Abstract

Data-to-text generation has recently attracted substantial interest due to its wide applications. Existing methods have shown impressive performance on an array of tasks. However, they rely on a significant amount of labeled data for each task, which is costly to acquire and thus limits their application to new tasks and domains. In this paper, we propose to leverage pre-training and transfer learning to address this issue. We propose knowledge-grounded pre-training (KGPT), which consists of two parts: 1) a general knowledge-grounded generation model for generating knowledge-enriched text, and 2) a pre-training paradigm on a massive knowledge-grounded text corpus crawled from the web. The pre-trained model can be fine-tuned on various data-to-text generation tasks to generate task-specific text. We adopt three settings, namely fully-supervised, zero-shot, and few-shot, to evaluate its effectiveness. Under the fully-supervised setting, our model achieves remarkable gains over the known baselines. Under the zero-shot setting, our model achieves over 30 ROUGE-L on WebNLG without seeing any examples, while all other baselines fail. Under the few-shot setting, our model needs only about one-fifteenth as many labeled examples to reach the same level of performance as baseline models. These experiments consistently demonstrate the strong generalization ability of our proposed framework. Code is available at https://github.com/wenhuchen/KGPT.
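
The framework described above pre-trains a knowledge-grounded generator and then fine-tunes it per task. As a rough illustration of the general data-to-text fine-tuning setup only, not KGPT's actual architecture or linearization scheme, the following minimal sketch flattens WebNLG-style triples into a sequence and fine-tunes a generic pretrained encoder-decoder. The [S]/[P]/[O] markers, the BART checkpoint, and the example triples are all illustrative assumptions; KGPT's own training code lives in the linked repository.

# Illustrative sketch only: KGPT uses its own knowledge-grounded
# encoder-decoder; a generic pretrained seq2seq (BART) stands in here.
from transformers import BartForConditionalGeneration, BartTokenizer

def linearize_triples(triples):
    """Flatten (subject, predicate, object) triples into one input string.
    The [S]/[P]/[O] markers are an illustrative convention, not KGPT's."""
    return " ".join(f"[S] {s} [P] {p} [O] {o}" for s, p, o in triples)

tokenizer = BartTokenizer.from_pretrained("facebook/bart-base")
model = BartForConditionalGeneration.from_pretrained("facebook/bart-base")

# One WebNLG-style example: a tiny knowledge graph and a reference sentence.
triples = [("John_Blaha", "birthPlace", "San_Antonio"),
           ("John_Blaha", "occupation", "Fighter_pilot")]
reference = "John Blaha, born in San Antonio, worked as a fighter pilot."

inputs = tokenizer(linearize_triples(triples), return_tensors="pt")
labels = tokenizer(reference, return_tensors="pt").input_ids

# Standard seq2seq fine-tuning loss; a real run loops over the full dataset.
loss = model(**inputs, labels=labels).loss
loss.backward()
print(f"fine-tuning loss on one example: {loss.item():.3f}")

In the zero-shot setting from the abstract, the fine-tuning step is skipped entirely and the pre-trained model generates directly; in the few-shot setting, the same loop runs on only a small fraction of the labeled examples.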

Results

Task                      Dataset                      Metric   Value   Model
Text Generation           WebNLG 2.0 (Unconstrained)   BLEU     64.11   KGPT
Text Generation           WebNLG 2.0 (Unconstrained)   METEOR   46.3    KGPT
Text Generation           WebNLG 2.0 (Unconstrained)   ROUGE    74.57   KGPT
Data-to-Text Generation   WebNLG 2.0 (Unconstrained)   BLEU     64.11   KGPT
Data-to-Text Generation   WebNLG 2.0 (Unconstrained)   METEOR   46.3    KGPT
Data-to-Text Generation   WebNLG 2.0 (Unconstrained)   ROUGE    74.57   KGPT
KG-to-Text Generation     WebNLG 2.0 (Unconstrained)   BLEU     64.11   KGPT
KG-to-Text Generation     WebNLG 2.0 (Unconstrained)   METEOR   46.3    KGPT
KG-to-Text Generation     WebNLG 2.0 (Unconstrained)   ROUGE    74.57   KGPT
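
For context on how BLEU and ROUGE numbers like those in the table are typically produced, here is a generic scoring sketch. It is not the paper's official WebNLG evaluation pipeline, which may differ in tokenization and reference handling, and it assumes the sacrebleu and rouge-score Python packages; the hypothesis and reference strings are made up.

# Generic scoring sketch, not the official WebNLG evaluation scripts.
import sacrebleu
from rouge_score import rouge_scorer

# Hypothetical model output and reference; real evaluation runs over the
# full test set, usually with multiple references per input.
hyps = ["John Blaha, born in San Antonio, worked as a fighter pilot."]
refs = [["John Blaha was born in San Antonio and worked as a fighter pilot."]]

# sacrebleu expects one list per reference stream: refs[k][i] is the
# k-th reference for hypothesis i.
bleu = sacrebleu.corpus_bleu(hyps, refs)
print(f"BLEU: {bleu.score:.2f}")

scorer = rouge_scorer.RougeScorer(["rougeL"], use_stemmer=True)
rouge_l = scorer.score(refs[0][0], hyps[0])["rougeL"].fmeasure
print(f"ROUGE-L F1: {rouge_l * 100:.2f}")

METEOR is omitted here; its canonical implementation is a separate Java tool rather than a standard Python package.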

Related Papers

RaMen: Multi-Strategy Multi-Modal Learning for Bundle Construction (2025-07-18)
Making Language Model a Hierarchical Classifier and Generator (2025-07-17)
Disentangling coincident cell events using deep transfer learning and compressive sensing (2025-07-17)
Mitigating Object Hallucinations via Sentence-Level Early Intervention (2025-07-16)
Best Practices for Large-Scale, Pixel-Wise Crop Mapping and Transfer Learning Workflows (2025-07-16)
PROL: Rehearsal Free Continual Learning in Streaming Data via Prompt Online Learning (2025-07-16)
The Devil behind the mask: An emergent safety vulnerability of Diffusion LLMs (2025-07-15)
Seq vs Seq: An Open Suite of Paired Encoders and Decoders (2025-07-15)