MiniCheck: Efficient Fact-Checking of LLMs on Grounding Documents

Liyan Tang, Philippe Laban, Greg Durrett

2024-04-16Fact Checking

Abstract

Recognizing if LLM output can be grounded in evidence is central to many tasks in NLP: retrieval-augmented generation, summarization, document-grounded dialogue, and more. Current approaches to this kind of fact-checking are based on verifying each piece of a model generation against potential evidence using an LLM. However, this process can be very computationally expensive, requiring many calls to a model to check a single response. In this work, we show how to build small fact-checking models that have GPT-4-level performance but for 400x lower cost. We do this by constructing synthetic training data with GPT-4, which involves creating realistic yet challenging instances of factual errors via a structured generation procedure. Training on this data teaches models to check each fact in the claim and recognize synthesis of information across sentences. For evaluation, we unify datasets from recent work on fact-checking and grounding LLM generations into a new benchmark, LLM-AggreFact. Our best system MiniCheck-FT5 (770M parameters) outperforms all systems of comparable size and reaches GPT-4 accuracy. We release LLM-AggreFact, code for data synthesis, and models.

Results

Task	Dataset	Metric	Value	Model
Fact Checking	^(#$!@#$)(()))******	0..5sec	2	Abc

Related Papers

PiMRef: Detecting and Explaining Ever-evolving Spear Phishing Emails with Knowledge Base Invariants2025-07-21 DS@GT at CheckThat! 2025: Evaluating Context and Tokenization Strategies for Numerical Fact Verification2025-07-08 Recon, Answer, Verify: Agents in Search of Truth2025-07-04 The Next Phase of Scientific Fact-Checking: Advanced Evidence Retrieval from Complex Structured Academic Papers2025-06-25 Decide less, communicate more: On the construct validity of end-to-end fact-checking in medicine2025-06-25 Veracity: An Open-Source AI Fact-Checking System2025-06-18 Verifying the Verifiers: Unveiling Pitfalls and Potentials in Fact Verifiers2025-06-16 SPOT: Bridging Natural Language and Geospatial Search for Investigative Journalists2025-06-16