Robust Summarization Evaluation Benchmark
Texts
Introduced 2022-12-15
Robust Summarization Evaluation Benchmark is a large human evaluation dataset of more than 22k summary-level annotations, covering the outputs of state-of-the-art summarization systems on three datasets.
Source: Revisiting the Gold Standard: Grounding Summarization Evaluation with Robust Human Evaluation