Papers With Code 2

A community resource for machine learning research: papers, code, benchmarks, and state-of-the-art results.

Data sourced from the PWC Archive (CC-BY-SA 4.0). Built by the community, for the community.

Visual Genome

Modalities: Images, Texts | License: CC BY 4.0 | Introduced: 2017-01-01

Visual Genome contains Visual Question Answering data in a multiple-choice setting. It consists of 101,174 images from MSCOCO with 1.7 million QA pairs, an average of about 17 questions per image. Compared to the Visual Question Answering dataset, Visual Genome has a more balanced distribution over 6 question types: What, Where, When, Who, Why and How. The Visual Genome dataset also provides 108K images densely annotated with objects, attributes and relationships.

Source: RaAM: A Relation-aware Attention Model for Visual Question Answering
Image Source: Visual Genome: Connecting Language and Vision Using Crowdsourced Dense Image Annotations
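
The description above groups questions into six types by their leading word. As a minimal sketch of how such a distribution could be tallied, the Python below buckets question strings by their first word. The function names (`question_type`, `type_distribution`) and the sample questions are hypothetical and are not part of any Visual Genome tooling; real QA records would first need to be loaded from the dataset's own files.

```python
from collections import Counter

# The six question types named in the dataset description.
QUESTION_TYPES = ("what", "where", "when", "who", "why", "how")


def question_type(question: str) -> str:
    """Return the leading question word, or 'other' if it is not one of the six types."""
    words = question.strip().lower().split()
    if not words:
        return "other"
    first = words[0].strip("?,.!\"'")
    return first if first in QUESTION_TYPES else "other"


def type_distribution(questions):
    """Map each observed question type to its share of the given questions."""
    counts = Counter(question_type(q) for q in questions)
    total = sum(counts.values())
    return {qtype: counts[qtype] / total for qtype in counts}


# Hypothetical sample questions, for illustration only.
sample = [
    "What color is the bus?",
    "Where is the dog sitting?",
    "Who is holding the umbrella?",
    "How many people are there?",
]
distribution = type_distribution(sample)
```

A balanced dataset in this sense would show roughly comparable shares across the six keys, whereas a skewed one would be dominated by a single type such as "what".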

Benchmarks

16k: MAP
2D Classification: MAP, Average mAP
2D Object Detection: MAP
2D Semantic Segmentation: R@100, R@50, mR@100, mR@50, Recall@50, mean Recall @20, Recall@100, Recall@20, mean Recall @100, zR@100, zR@20, zR@50, ng-mR@20, mR@20, F@100
3D: MAP
Dense Captioning: mAP
Image Classification: Average mAP
Multi-Label Image Classification: Average mAP
Object Detection: MAP
Phrase Grounding: Pointing Game Accuracy
Scene Graph Generation: Recall@50, mean Recall @20, Recall@100, Recall@20, mean Recall @100, R@100, mR@100, mR@50, zR@100, zR@20, zR@50, ng-mR@20, mR@20, F@100
Scene Parsing: R@100, R@50, mR@100, mR@50, Recall@50, mean Recall @20, Recall@100, Recall@20, mean Recall @100, zR@100, zR@20, zR@50, ng-mR@20, mR@20, F@100
Scene Understanding: R@100, R@50, mR@100, mR@50
Visual Relationship Detection: R@100, R@50, mR@100, mR@50
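
Many of the benchmarks listed above report Recall@K (R@K) and mean Recall@K (mR@K). The sketch below illustrates the usual idea behind the two metrics for a single image, under simplifying assumptions that are not taken from this page: triplets are matched by exact label equality (benchmark implementations typically also require bounding-box overlap), the ground-truth list contains no duplicates, and the function names are hypothetical.

```python
from collections import defaultdict


def recall_at_k(ranked_predictions, ground_truth, k):
    """Fraction of ground-truth triplets that appear among the top-k predictions."""
    if not ground_truth:
        return 0.0
    top_k = set(ranked_predictions[:k])
    hits = sum(1 for triplet in ground_truth if triplet in top_k)
    return hits / len(ground_truth)


def mean_recall_at_k(ranked_predictions, ground_truth, k):
    """Average of per-predicate recalls, so that rare predicates weigh as much as common ones."""
    top_k = set(ranked_predictions[:k])
    hits = defaultdict(int)
    totals = defaultdict(int)
    for triplet in ground_truth:
        predicate = triplet[1]  # triplets are (subject, predicate, object)
        totals[predicate] += 1
        hits[predicate] += triplet in top_k
    if not totals:
        return 0.0
    return sum(hits[p] / totals[p] for p in totals) / len(totals)


# Hypothetical annotations and ranked predictions for one image.
ground_truth = [
    ("man", "riding", "horse"),
    ("man", "wearing", "hat"),
    ("horse", "on", "grass"),
    ("hat", "on", "man"),
]
predictions = [
    ("man", "riding", "horse"),
    ("horse", "on", "grass"),
    ("man", "holding", "hat"),
    ("hat", "on", "man"),
]
r_at_4 = recall_at_k(predictions, ground_truth, 4)
mr_at_4 = mean_recall_at_k(predictions, ground_truth, 4)
```

In this toy case three of the four ground-truth triplets are recovered, but the missed "wearing" predicate pulls the per-predicate average lower than the plain recall, which is the behavior mean recall is meant to expose on long-tailed predicate distributions.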

Related Benchmarks

Visual Genome (pairs)/Visual Question Answering (VQA)/Percentage correct
Visual Genome (subjects)/Visual Question Answering (VQA)/Percentage correct
Visual Genome 128x128/Image Generation/FID
Visual Genome 128x128/Image Generation/Inception Score
Visual Genome 128x128/Image Generation/SceneFID
Visual Genome 256x256/Image Generation/FID
Visual Genome 256x256/Image Generation/Inception Score
Visual Genome 256x256/Image Generation/LPIPS
Visual Genome 64x64/Image Generation/FID
Visual Genome 64x64/Image Generation/Inception Score

Statistics

Papers: 1,256
Benchmarks: 62

Tasks

16k
2D Classification
2D Object Detection
2D Semantic Segmentation
3D
Bidirectional Relationship Classification
Dense Captioning
Image Classification
Image Generation from Scene Graphs
Layout-to-Image Generation
Multi-Label Image Classification
Multi-label Image Recognition with Partial Labels
Object Detection
Phrase Grounding
Predicate Classification
Scene Graph Classification
Scene Graph Detection
Scene Graph Generation
Scene Parsing
Scene Understanding
Unbiased Scene Graph Generation
Unsupervised KG-to-Text Generation
Unsupervised semantic parsing
Visual Question Answering (VQA)
Visual Relationship Detection