FIB

Factual Inconsistency Benchmark

Modality: Texts | License: CC BY 4.0 | Introduced: 2022-11-15

The Factual Inconsistency Benchmark (FIB) focuses on the task of summarization. Specifically, the benchmark compares the scores an LLM assigns to a factually consistent summary versus a factually inconsistent summary of the same input news article. The factually consistent summaries are human-written reference summaries that were manually verified to be factually consistent.
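The comparison above reduces to a simple paired accuracy: for each article, check whether the model scores the consistent summary above the inconsistent one. A minimal sketch of that metric, using hypothetical per-summary scores (e.g., mean token log-likelihoods; the function name and sample values are illustrative, not from the benchmark's code):

```python
def fib_accuracy(scored_pairs):
    """Fraction of articles where the model assigns a higher score
    to the factually consistent summary than to the inconsistent one.

    scored_pairs: list of (consistent_score, inconsistent_score) tuples,
    one per article, produced by any LLM scoring function.
    """
    correct = sum(
        1 for consistent, inconsistent in scored_pairs
        if consistent > inconsistent
    )
    return correct / len(scored_pairs)


# Hypothetical scores for three articles: the model prefers the
# consistent summary in the first two pairs but not the third.
pairs = [(-1.2, -2.5), (-0.8, -0.9), (-3.0, -1.5)]
print(fib_accuracy(pairs))  # 2 of 3 pairs scored correctly
```

In practice the per-summary score would come from the evaluated LLM (for instance, the average log-probability it assigns to the summary tokens given the article), and the benchmark reports this accuracy over all article pairs.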

Source: Evaluating the Factual Consistency of Large Language Models Through Summarization

Image Source: https://arxiv.org/pdf/2211.08412v1.pdf