Grounded Textual Entailment

Hoa Trong Vu, Claudio Greco, Aliia Erofeeva, Somayeh Jafaritazehjan, Guido Linders, Marc Tanti, Alberto Testoni, Raffaella Bernardi, Albert Gatt

2018-06-14 COLING 2018 8 Natural Language Inference


Abstract

Capturing semantic relations between sentences, such as entailment, is a long-standing challenge for computational semantics. Logic-based models analyse entailment in terms of possible worlds (interpretations, or situations) where a premise P entails a hypothesis H iff in all worlds where P is true, H is also true. Statistical models view this relationship probabilistically, addressing it in terms of whether a human would likely infer H from P. In this paper, we wish to bridge these two perspectives, by arguing for a visually-grounded version of the Textual Entailment task. Specifically, we ask whether models can perform better if, in addition to P and H, there is also an image (corresponding to the relevant "world" or "situation"). We use a multimodal version of the SNLI dataset (Bowman et al., 2015) and we compare "blind" and visually-augmented models of textual entailment. We show that visual information is beneficial, but we also conduct an in-depth error analysis that reveals that current multimodal models are not performing "grounding" in an optimal fashion.
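The logic-based definition in the abstract (P entails H iff H is true in every world where P is true) can be illustrated with a minimal sketch. The atomic facts, the toy set of worlds, and the function name `entails` are hypothetical and chosen only for illustration; they do not come from the paper or its code.

```python
from itertools import product


def entails(premise, hypothesis, worlds):
    """Return True iff the hypothesis holds in every world where the premise holds."""
    return all(hypothesis(w) for w in worlds if premise(w))


# Hypothetical toy setup: each world assigns a truth value to two atomic facts.
atoms = ("dog_running", "animal_moving")
all_worlds = [dict(zip(atoms, values))
              for values in product([True, False], repeat=len(atoms))]

# Assumed background knowledge: a running dog is a moving animal,
# so worlds that violate this are excluded.
worlds = [w for w in all_worlds
          if not w["dog_running"] or w["animal_moving"]]


def P(w):
    return w["dog_running"]    # premise: "A dog is running."


def H(w):
    return w["animal_moving"]  # hypothesis: "An animal is moving."


print(entails(P, H, worlds))  # True: H holds wherever P holds
print(entails(H, P, worlds))  # False: some animal may move without a dog running
```

In the grounded setting the abstract describes, an image would play the role of one particular world from such a set, rather than the check ranging over all of them.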

Results

Task	Dataset	Metric	Value	Model
Natural Language Inference	V-SNLI	Accuracy	86.99	V-BiMPM
Natural Language Inference	V-SNLI	Accuracy	86.41	BiMPM
