Papers With Code 2

A community resource for machine learning research: papers, code, benchmarks, and state-of-the-art results.

Data sourced from the PWC Archive (CC-BY-SA 4.0). Built by the community, for the community.

SME

Standard Multimodal Explanation

Images · Texts · Apache-2.0 · Introduced 2024-10-28

SME is a new dataset for multimodal explanation in Visual Question Answering. It comprises 1,028,230 samples, with 1,656 visual objects that require detection in the explanations. To the authors' knowledge, it is the first dataset whose explanations are written in standard English with additional visual grounding tokens.

Benchmarks

Explanatory Visual Question Answering: BLEU-4, METEOR, ROUGE-L, CIDEr, SPICE, Detection, ACC, #Learning Samples (N)
Visual Question Answering: BLEU-4, METEOR, ROUGE-L, CIDEr, SPICE, Detection, ACC, #Learning Samples (N)
Visual Question Answering (VQA): BLEU-4, METEOR, ROUGE-L, CIDEr, SPICE, Detection, ACC, #Learning Samples (N)

Statistics

Papers: 7
Benchmarks: 24

Links

Homepage

Tasks

Explanatory Visual Question Answering, FS-MEVQA, Visual Question Answering, Visual Question Answering (VQA)