TasksSotADatasetsPapersMethodsSubmitAbout
Papers With Code 2

A community resource for machine learning research: papers, code, benchmarks, and state-of-the-art results.

Explore

Notable BenchmarksAll SotADatasetsPapersMethods

Community

Submit ResultsAbout

Data sourced from the PWC Archive (CC-BY-SA 4.0). Built by the community, for the community.

Papers/PASTA: Table-Operations Aware Fact Verification via Senten...

PASTA: Table-Operations Aware Fact Verification via Sentence-Table Cloze Pre-training

Zihui Gu, Ju Fan, Nan Tang, Preslav Nakov, Xiaoman Zhao, Xiaoyong Du

2022-11-05MarketingTable-based Fact VerificationMisinformationFact CheckingFact VerificationLanguage Modelling
PaperPDFCode(official)

Abstract

Fact verification has attracted a lot of research attention recently, e.g., in journalism, marketing, and policymaking, as misinformation and disinformation online can sway one's opinion and affect one's actions. While fact-checking is a hard task in general, in many cases, false statements can be easily debunked based on analytics over tables with reliable information. Hence, table-based fact verification has recently emerged as an important and growing research area. Yet, progress has been limited due to the lack of datasets that can be used to pre-train language models (LMs) to be aware of common table operations, such as aggregating a column or comparing tuples. To bridge this gap, in this paper we introduce PASTA, a novel state-of-the-art framework for table-based fact verification via pre-training with synthesized sentence-table cloze questions. In particular, we design six types of common sentence-table cloze tasks, including Filter, Aggregation, Superlative, Comparative, Ordinal, and Unique, based on which we synthesize a large corpus consisting of 1.2 million sentence-table pairs from WikiTables. PASTA uses a recent pre-trained LM, DeBERTaV3, and further pretrains it on our corpus. Our experimental results show that PASTA achieves new state-of-the-art performance on two table-based fact verification benchmarks: TabFact and SEM-TAB-FACTS. In particular, on the complex set of TabFact, which contains multiple operations, PASTA largely outperforms the previous state of the art by 4.7 points (85.6% vs. 80.9%), and the gap between PASTA and human performance on the small TabFact test set is narrowed to just 1.5 points (90.6% vs. 92.1%).

Results

TaskDatasetMetricValueModel
Table-based Fact VerificationTabFactTest89.3PASTA
Table-based Fact VerificationTabFactVal89.2PASTA

Related Papers

PiMRef: Detecting and Explaining Ever-evolving Spear Phishing Emails with Knowledge Base Invariants2025-07-21Visual-Language Model Knowledge Distillation Method for Image Quality Assessment2025-07-21SHIELD: A Secure and Highly Enhanced Integrated Learning for Robust Deepfake Detection against Adversarial Attacks2025-07-17Leveraging Pre-Trained Visual Models for AI-Generated Video Detection2025-07-17Making Language Model a Hierarchical Classifier and Generator2025-07-17VisionThink: Smart and Efficient Vision Language Model via Reinforcement Learning2025-07-17The Generative Energy Arena (GEA): Incorporating Energy Awareness in Large Language Model (LLM) Human Evaluations2025-07-17Inverse Reinforcement Learning Meets Large Language Model Post-Training: Basics, Advances, and Opportunities2025-07-17