VQA-E

Images · Texts · MIT · Introduced 2018-03-20

VQA-E is a dataset for Visual Question Answering with Explanation, in which models are required to generate an explanation along with the predicted answer. The VQA-E dataset is automatically derived from the VQA v2 dataset by synthesizing a textual explanation for each image-question-answer triple.
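To make the structure concrete, the following is a minimal Python sketch of how one VQA-E record could be represented as a VQA v2 style triple extended with an explanation. The class names, field names, the `attach_explanation` helper, and the sample values are all hypothetical and chosen for illustration; they are not the dataset's actual schema or file format, and the sketch does not reproduce how the explanations are synthesized.

```python
from dataclasses import dataclass, asdict


@dataclass(frozen=True)
class VQATriple:
    """Hypothetical image-question-answer triple (field names are assumptions)."""
    image_id: int
    question: str
    answer: str


@dataclass(frozen=True)
class VQAEExample:
    """Hypothetical VQA-E record: the same triple plus a textual explanation."""
    image_id: int
    question: str
    answer: str
    explanation: str


def attach_explanation(triple: VQATriple, explanation: str) -> VQAEExample:
    """Pair a triple with an already-synthesized explanation string."""
    explanation = explanation.strip()
    if not explanation:
        raise ValueError("explanation must be non-empty")
    return VQAEExample(**asdict(triple), explanation=explanation)


# Made-up sample values, not taken from the dataset.
triple = VQATriple(image_id=1, question="What is the man holding?", answer="umbrella")
example = attach_explanation(triple, " A man holds an umbrella in the rain. ")
```

A model trained on records of this shape would be expected to output both the `answer` and the `explanation` fields for a given image and question.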

Image Source: VQA-E: Explaining, Elaborating, and Enhancing Your Answers for Visual Questions