EventNarrative: A large-scale Event-centric Dataset for Knowledge Graph-to-Text Generation

Anthony Colas, Ali Sadeghian, Yue Wang, Daisy Zhe Wang

2021-10-30 | KG-to-Text Generation | Knowledge Graphs | Text Generation | World Knowledge

Abstract

We introduce EventNarrative, a knowledge graph-to-text dataset from publicly available open-world knowledge graphs. Given the recent advances in event-driven Information Extraction (IE), and given that prior research on graph-to-text has focused only on entity-driven KGs, this paper focuses on event-centric data. However, our data generation system can still be adapted to other types of KG data. Existing large-scale datasets in the graph-to-text area are non-parallel, meaning there is a large disconnect between the KGs and text. The datasets that do pair KGs with text are small-scale and either manually generated or generated without a rich ontology, making the corresponding graphs sparse. Furthermore, these datasets contain many unlinked entities between their KG and text pairs. EventNarrative consists of approximately 230,000 graphs and their corresponding natural language text, making it 6 times larger than the current largest parallel dataset. It makes use of a rich ontology, all of the KGs' entities are linked to the text, and our manual annotations confirm high data quality. Our aim is two-fold: to help break new ground in event-centric research where data is lacking, and to give researchers a well-defined, large-scale dataset with which to better evaluate existing and future knowledge graph-to-text models. We also evaluate two types of baseline on EventNarrative: a graph-to-text-specific model and two state-of-the-art language models, which previous work has shown to be adaptable to the knowledge graph-to-text domain.
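One common way to adapt a sequence-to-sequence language model to graph input is to flatten the graph's triples into a single marked-up string. The sketch below illustrates that idea only; it is not confirmed to be the preprocessing used for the EventNarrative baselines. The `<H>`, `<R>`, `<T>` markers, the `linearize` function, and the example triples are all assumptions made for illustration.

```python
from typing import List, Tuple

# A triple is (head entity, relation, tail entity).
Triple = Tuple[str, str, str]


def linearize(triples: List[Triple]) -> str:
    """Flatten triples into one string with role markers.

    The <H>/<R>/<T> markers are an assumed convention, not taken
    from the EventNarrative paper.
    """
    parts = []
    for head, relation, tail in triples:
        parts.append(f"<H> {head} <R> {relation} <T> {tail}")
    return " ".join(parts)


if __name__ == "__main__":
    # Hypothetical event-centric graph, not drawn from the dataset.
    example = [
        ("Example Battle", "instance of", "battle"),
        ("Example Battle", "location", "Example City"),
    ]
    print(linearize(example))
```

The resulting string could then be passed to a model's tokenizer as the source sequence, with the paired narrative text as the target.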

Results

All results are on the EventNarrative dataset and are listed identically under three tasks: Text Generation, Data-to-Text Generation, and KG-to-Text Generation.

Metric      Value   Model
CIDEr       3.31    BART
ChrF++      64.71   BART
BLEU        30.78   GraphWriter
BertScore   92.12   GraphWriter
CIDEr       4.59    GraphWriter
ChrF++      47.91   GraphWriter
METEOR      27.72   GraphWriter
ROUGE       71.92   GraphWriter
CIDEr       3       T5
ChrF++      56.76   T5
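
Only CIDEr and ChrF++ are reported for all three models, so those are the only metrics on which the listed scores can be compared directly. A minimal sketch of that comparison follows, using the values exactly as listed on this page; the `RESULTS` list and the `best_per_metric` helper are names introduced here for illustration, and the sketch assumes higher is better for both metrics.

```python
from collections import defaultdict

# (metric, value, model) rows copied from the results listed above.
RESULTS = [
    ("CIDEr", 3.31, "BART"),
    ("ChrF++", 64.71, "BART"),
    ("BLEU", 30.78, "GraphWriter"),
    ("BertScore", 92.12, "GraphWriter"),
    ("CIDEr", 4.59, "GraphWriter"),
    ("ChrF++", 47.91, "GraphWriter"),
    ("METEOR", 27.72, "GraphWriter"),
    ("ROUGE", 71.92, "GraphWriter"),
    ("CIDEr", 3.0, "T5"),
    ("ChrF++", 56.76, "T5"),
]


def best_per_metric(rows):
    """Return the top-scoring model for each metric reported by 2+ models."""
    by_metric = defaultdict(list)
    for metric, value, model in rows:
        by_metric[metric].append((value, model))
    # Skip metrics with a single entry: there is nothing to compare.
    return {
        metric: max(entries)[1]
        for metric, entries in by_metric.items()
        if len(entries) > 1
    }


if __name__ == "__main__":
    for metric, model in best_per_metric(RESULTS).items():
        print(f"{metric}: {model}")
```

On the listed values, GraphWriter has the highest CIDEr (4.59) and BART the highest ChrF++ (64.71).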
