Papers With Code 2

A community resource for machine learning research: papers, code, benchmarks, and state-of-the-art results.


Data sourced from the PWC Archive (CC-BY-SA 4.0). Built by the community, for the community.


Semantic Noise Matters for Neural Natural Language Generation

Ondřej Dušek, David M. Howcroft, Verena Rieser

Published: 2019-11-10 · WS 2019
Tasks: Data-to-Text Generation · Text Generation · Hallucination
Links: Paper · PDF · Code (official)

Abstract

Neural natural language generation (NNLG) systems are known for their pathological outputs, i.e. generating text which is unrelated to the input specification. In this paper, we show the impact of semantic noise on state-of-the-art NNLG models which implement different semantic control mechanisms. We find that cleaned data can improve semantic correctness by up to 97%, while maintaining fluency. We also find that the most common error is omitting information, rather than hallucination.
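The abstract distinguishes two error types: omitting input information versus hallucinating content absent from the input. A minimal sketch of how such a slot-level check could work, assuming a dictionary-style meaning representation (MR) and a list of known slot values in the style of the E2E NLG Challenge — all names here are hypothetical, and this is not the authors' evaluation code:

```python
def check_output(mr: dict, text: str, known_values: list):
    """Classify semantic errors in a generated sentence.

    mr: input meaning representation, e.g. {"name": "The Eagle", "food": "French"}
    known_values: every value any slot can take in the dataset (assumed given)
    Returns (omitted_slots, hallucinated_values).
    """
    text_l = text.lower()
    # Omission: an input slot whose value never appears in the output.
    omitted = [slot for slot, value in mr.items()
               if value.lower() not in text_l]
    # Hallucination: a known slot value mentioned despite not being in the MR.
    hallucinated = [v for v in known_values
                    if v.lower() in text_l and v not in mr.values()]
    return omitted, hallucinated


mr = {"name": "The Eagle", "food": "French"}
out = check_output(mr, "The Eagle serves Italian food.",
                   ["The Eagle", "French", "Italian"])
# → (["food"], ["Italian"]): the food slot is omitted, "Italian" is hallucinated
```

Real evaluations use pattern lists and fuzzier matching (slot values are often realized with paraphrases), but the omission/hallucination distinction reduces to this kind of MR-versus-text comparison.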

Results

Task                    | Dataset                   | Metric          | Value | Model
------------------------|---------------------------|-----------------|-------|------
Text Generation         | Cleaned E2E NLG Challenge | BLEU (Test set) | 40.73 | TGen
Data-to-Text Generation | Cleaned E2E NLG Challenge | BLEU (Test set) | 40.73 | TGen
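The table reports BLEU, a modified n-gram precision score between a candidate sentence and one or more references. As a refresher, a minimal pure-Python sentence-level BLEU (uniform 4-gram weights, no smoothing) — illustrative only; the reported numbers come from the paper's own evaluation scripts, not this sketch:

```python
import math
from collections import Counter


def ngrams(tokens, n):
    """All contiguous n-grams of a token list."""
    return [tuple(tokens[i:i + n]) for i in range(len(tokens) - n + 1)]


def bleu(candidate: str, references: list, max_n: int = 4) -> float:
    """Sentence BLEU with uniform weights and brevity penalty."""
    cand = candidate.split()
    refs = [r.split() for r in references]
    log_precisions = []
    for n in range(1, max_n + 1):
        cand_counts = Counter(ngrams(cand, n))
        if not cand_counts:
            return 0.0
        # Clip each candidate n-gram count by its max count in any reference.
        max_ref = Counter()
        for ref in refs:
            for gram, c in Counter(ngrams(ref, n)).items():
                max_ref[gram] = max(max_ref[gram], c)
        clipped = sum(min(c, max_ref[g]) for g, c in cand_counts.items())
        if clipped == 0:
            return 0.0  # a zero precision zeroes the geometric mean
        log_precisions.append(math.log(clipped / sum(cand_counts.values())))
    # Brevity penalty against the reference length closest to the candidate.
    ref_len = min((len(r) for r in refs),
                  key=lambda rl: (abs(rl - len(cand)), rl))
    bp = 1.0 if len(cand) > ref_len else math.exp(1 - ref_len / max(len(cand), 1))
    return bp * math.exp(sum(log_precisions) / max_n)


bleu("the cat sat on the mat", ["the cat sat on the mat"])  # → 1.0
```

A BLEU of 40.73 on the cleaned E2E test set means roughly 41% geometric-mean clipped n-gram precision after the brevity penalty; standard toolkits add smoothing and corpus-level aggregation omitted here.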

Related Papers

- Making Language Model a Hierarchical Classifier and Generator (2025-07-17)
- Mitigating Object Hallucinations via Sentence-Level Early Intervention (2025-07-16)
- The Devil behind the mask: An emergent safety vulnerability of Diffusion LLMs (2025-07-15)
- Seq vs Seq: An Open Suite of Paired Encoders and Decoders (2025-07-15)
- Hashed Watermark as a Filter: Defeating Forging and Overwriting Attacks in Weight-based Neural Network Watermarking (2025-07-15)
- Exploiting Leaderboards for Large-Scale Distribution of Malicious Models (2025-07-11)
- ByDeWay: Boost Your multimodal LLM with DEpth prompting in a Training-Free Way (2025-07-11)
- CLI-RAG: A Retrieval-Augmented Framework for Clinically Structured and Context Aware Text Generation with LLMs (2025-07-09)