Data sourced from the PWC Archive (CC-BY-SA 4.0). Built by the community, for the community.

Findings of the E2E NLG Challenge

Ondřej Dušek, Jekaterina Novikova, Verena Rieser

2018-10-02 | WS 2018 11 | Data-to-Text Generation, Text Generation, Spoken Dialogue Systems

Abstract

This paper summarises the experimental setup and results of the first shared task on end-to-end (E2E) natural language generation (NLG) in spoken dialogue systems. Recent end-to-end generation systems are promising since they reduce the need for data annotation. However, they are currently limited to small, delexicalised datasets. The E2E NLG shared task aims to assess whether these novel approaches can generate better-quality output by learning from a dataset containing higher lexical richness, syntactic complexity and diverse discourse phenomena. We compare 62 systems submitted by 17 institutions, covering a wide range of approaches, including machine learning architectures -- with the majority implementing sequence-to-sequence models (seq2seq) -- as well as systems based on grammatical rules and templates.

Results

Task                    | Dataset           | Metric  | Value  | Model
Text Generation         | E2E NLG Challenge | BLEU    | 65.93  | TGen
Text Generation         | E2E NLG Challenge | CIDEr   | 2.2338 | TGen
Text Generation         | E2E NLG Challenge | METEOR  | 44.83  | TGen
Text Generation         | E2E NLG Challenge | NIST    | 8.6094 | TGen
Text Generation         | E2E NLG Challenge | ROUGE-L | 68.5   | TGen
Data-to-Text Generation | E2E NLG Challenge | BLEU    | 65.93  | TGen
Data-to-Text Generation | E2E NLG Challenge | CIDEr   | 2.2338 | TGen
Data-to-Text Generation | E2E NLG Challenge | METEOR  | 44.83  | TGen
Data-to-Text Generation | E2E NLG Challenge | NIST    | 8.6094 | TGen
Data-to-Text Generation | E2E NLG Challenge | ROUGE-L | 68.5   | TGen

Related Papers

Making Language Model a Hierarchical Classifier and Generator (2025-07-17)
Mitigating Object Hallucinations via Sentence-Level Early Intervention (2025-07-16)
The Devil behind the mask: An emergent safety vulnerability of Diffusion LLMs (2025-07-15)
Seq vs Seq: An Open Suite of Paired Encoders and Decoders (2025-07-15)
Hashed Watermark as a Filter: Defeating Forging and Overwriting Attacks in Weight-based Neural Network Watermarking (2025-07-15)
Exploiting Leaderboards for Large-Scale Distribution of Malicious Models (2025-07-11)
CLI-RAG: A Retrieval-Augmented Framework for Clinically Structured and Context Aware Text Generation with LLMs (2025-07-09)
FIFA: Unified Faithfulness Evaluation Framework for Text-to-Video and Video-to-Text Generation (2025-07-09)