Papers With Code 2

A community resource for machine learning research: papers, code, benchmarks, and state-of-the-art results.


Data sourced from the PWC Archive (CC-BY-SA 4.0). Built by the community, for the community.


GQA

Images, Texts | CC BY 4.0 | Introduced 2019-01-01

GQA is a large-scale visual question answering dataset with real images from the Visual Genome dataset and balanced question-answer pairs. Each training and validation image is also associated with scene graph annotations describing the classes and attributes of the objects in the scene and their pairwise relations. Along with the images and question-answer pairs, GQA provides two types of pre-extracted visual features for each image: convolutional grid features of size 7×7×2048 extracted from a ResNet-101 network trained on ImageNet, and object detection features of size Ndet×2048 from a Faster R-CNN detector, where Ndet is the number of detected objects in the image (at most 100 per image).

Source: Language-Conditioned Graph Networks for Relational Reasoning
Image Source: https://arxiv.org/pdf/1902.09506.pdf
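
The two feature layouts described above can be illustrated with a minimal sketch. It assumes the features are already loaded as NumPy arrays with the stated shapes; the function names `flatten_grid` and `pad_object_features` are hypothetical helpers for illustration, not part of any official GQA tooling, and the on-disk file format is not covered here.

```python
import numpy as np

GRID_SHAPE = (7, 7, 2048)  # grid feature shape stated in the description
MAX_DET = 100              # stated maximum number of detected objects per image
FEAT_DIM = 2048            # feature dimension shared by both feature types


def flatten_grid(grid_feats):
    """Reshape a 7x7x2048 grid into a (49, 2048) sequence of region vectors."""
    if grid_feats.shape != GRID_SHAPE:
        raise ValueError(f"expected {GRID_SHAPE}, got {grid_feats.shape}")
    return grid_feats.reshape(-1, FEAT_DIM)


def pad_object_features(obj_feats, max_det=MAX_DET):
    """Zero-pad an (Ndet, 2048) array to (max_det, 2048).

    Also returns a boolean mask marking which rows are real detections,
    so that images with different Ndet can be stacked into one batch.
    """
    n_det, dim = obj_feats.shape
    if n_det > max_det:
        raise ValueError(f"Ndet={n_det} exceeds max_det={max_det}")
    padded = np.zeros((max_det, dim), dtype=obj_feats.dtype)
    padded[:n_det] = obj_feats
    mask = np.zeros(max_det, dtype=bool)
    mask[:n_det] = True
    return padded, mask
```

Padding to a fixed length with a mask is one common way to batch a variable number of detections; it is shown here as a design choice, not as the dataset's prescribed usage.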

Benchmarks

16k / mAP
2D Classification / mAP
2D Object Detection / mAP
2D Semantic Segmentation / zR@100
2D Semantic Segmentation / zR@20
2D Semantic Segmentation / zR@50
3D / mAP
Graph Question Answering / Accuracy
Object Detection / mAP
Scene Graph Generation / zR@100
Scene Graph Generation / zR@20
Scene Graph Generation / zR@50
Scene Parsing / zR@100
Scene Parsing / zR@20
Scene Parsing / zR@50
Visual Question Answering / Accuracy
Visual Question Answering (VQA) / Accuracy

Related Benchmarks

GQA Test2019 / Visual Question Answering (VQA): Accuracy, Binary, Consistency, Distribution, Open, Plausibility, Validity
GQA test-dev / Visual Question Answering (VQA): Accuracy
GQA test-std / Visual Question Answering (VQA): Accuracy
GQA-REX / Explanatory Visual Question Answering: BLEU-4, CIDEr, GQA-test, GQA-val, Grounding, METEOR, ROUGE-L, SPICE
GQA-REX / Visual Question Answering: BLEU-4, CIDEr, GQA-test, GQA-val, Grounding, METEOR, ROUGE-L, SPICE
GQA-REX / Visual Question Answering (VQA): BLEU-4, CIDEr, GQA-test, GQA-val, Grounding, METEOR, ROUGE-L, SPICE

Statistics

Papers: 749
Benchmarks: 17


Tasks

16k
2D Classification
2D Object Detection
2D Semantic Segmentation
3D
Graph Question Answering
Object Detection
Scene Graph Generation
Scene Parsing
Visual Question Answering
Visual Question Answering (VQA)