Data sourced from the PWC Archive (CC-BY-SA 4.0). Built by the community, for the community.

Multimodal Differential Network for Visual Question Generation

Badri N. Patro, Sandeep Kumar, Vinod K. Kurmi, Vinay P. Namboodiri

2018-08-12 | EMNLP 2018 10 | Natural Questions | Question Generation

Abstract

Generating natural questions from an image is a semantic task that requires using both visual and language modalities to learn multimodal representations. Images can have multiple visual and language contexts that are relevant for generating questions, namely places, captions, and tags. In this paper, we propose the use of exemplars for obtaining the relevant context. We obtain this by using a Multimodal Differential Network to produce natural and engaging questions. The generated questions show a remarkable similarity to the natural questions, as validated by a human study. Further, we observe that the proposed approach substantially improves over state-of-the-art benchmarks on the quantitative metrics (BLEU, METEOR, ROUGE, and CIDEr).

Results

Task | Dataset | Metric | Value | Model
Question Generation | Visual Question Generation | BLEU-1 | 36 | MDN
Question Generation | COCO Visual Question Answering (VQA) real images 1.0 open ended | BLEU-1 | 65.1 | MDN
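
For readers unfamiliar with the metric reported above, the following is a minimal sketch of sentence-level BLEU-1 (clipped unigram precision multiplied by a brevity penalty). It is not the authors' evaluation code: the function name `bleu1`, the whitespace tokenization, and the lowercasing are assumptions made for illustration, and published scores are typically computed at corpus level with a standard toolkit. The sketch returns a value in [0, 1]; the listed values appear to be on a 0 to 100 scale, which is also an assumption.

```python
from collections import Counter
import math


def bleu1(candidate, references):
    """Sentence-level BLEU-1 sketch (hypothetical helper, not the paper's code)."""
    # Assumption: simple lowercase + whitespace tokenization.
    cand = candidate.lower().split()
    refs = [r.lower().split() for r in references]
    if not cand or not refs:
        return 0.0

    cand_counts = Counter(cand)

    # Clip limit per word: its maximum count in any single reference.
    max_ref_counts = Counter()
    for ref in refs:
        for word, count in Counter(ref).items():
            max_ref_counts[word] = max(max_ref_counts[word], count)

    clipped = sum(min(count, max_ref_counts[word])
                  for word, count in cand_counts.items())
    precision = clipped / len(cand)

    # Brevity penalty uses the reference length closest to the candidate
    # length (ties broken toward the shorter reference).
    ref_len = min((len(r) for r in refs),
                  key=lambda n: (abs(n - len(cand)), n))
    bp = 1.0 if len(cand) >= ref_len else math.exp(1 - ref_len / len(cand))

    return bp * precision


if __name__ == "__main__":
    generated = "what is the dog doing"
    human = ["what is the dog doing in the park", "why is the dog running"]
    print(round(100 * bleu1(generated, human), 1))
```

Clipping keeps a candidate from scoring well by repeating a common word, and the brevity penalty discourages very short questions that would otherwise reach high precision.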
