Papers With Code 2

A community resource for machine learning research: papers, code, benchmarks, and state-of-the-art results.


Data sourced from the PWC Archive (CC-BY-SA 4.0). Built by the community, for the community.


CLEVR

Compositional Language and Elementary Visual Reasoning

Images · Texts · CC BY 4.0 · Introduced 2016-12-20

CLEVR (Compositional Language and Elementary Visual Reasoning) is a synthetic Visual Question Answering dataset containing images of 3D-rendered objects. Each image comes with a number of highly compositional questions that fall into five task categories: Exist, Count, Compare Integer, Query Attribute, and Compare Attribute. The dataset consists of a training set of 70k images and 700k questions, a validation set of 15k images and 150k questions, and a test set of 15k images and 150k questions; answers, scene graphs, and functional programs are provided for all training and validation images and questions. Aside from its position, each object in a scene is characterized by four attributes: 2 sizes (large, small), 3 shapes (cube, cylinder, sphere), 2 materials (rubber, metal), and 8 colors (gray, blue, brown, yellow, red, green, purple, cyan), yielding 96 unique combinations.
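As a quick sanity check on the attribute vocabulary, the 96 unique object combinations follow directly from the Cartesian product of the four attribute sets (shapes are cube, cylinder, and sphere in the original Johnson et al. paper). A minimal sketch:

```python
from itertools import product

# Attribute vocabulary as described for CLEVR.
SIZES = ["large", "small"]
SHAPES = ["cube", "cylinder", "sphere"]
MATERIALS = ["rubber", "metal"]
COLORS = ["gray", "blue", "brown", "yellow", "red", "green", "purple", "cyan"]

# Every object type is one (size, shape, material, color) tuple.
combinations = list(product(SIZES, SHAPES, MATERIALS, COLORS))
print(len(combinations))  # 2 * 3 * 2 * 8 = 96
```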

Source: On transfer learning using a MAC model variant
Image Source: Johnson et al.

Benchmarks

Image Generation / FID-5k-training-steps
Visual Question Answering / Accuracy
Visual Question Answering (VQA) / Accuracy

Related Benchmarks

CLEVR Counts / Image Clustering / Accuracy
CLEVR-Humans / Visual Question Answering (VQA) / Accuracy
CLEVR-Ref+ / Instance Segmentation / IoU
CLEVR-Ref+ / Referring Expression Segmentation / IoU
CLEVR-SV / Text-To-Image / LPIPS
CLEVR-X / Explanation Generation / Acc
CLEVR-X / Explanation Generation / B4
CLEVR-X / Explanation Generation / C
CLEVR-X / Explanation Generation / M
CLEVR-X / Explanation Generation / RL
CLEVR/Count / Image Classification / Top 1 Accuracy
CLEVR/Dist / Image Classification / Top 1 Accuracy
CLEVRER / Visual Reasoning / Average-per ques.
CLEVRER / Visual Reasoning / Counterfactual-per opt.
CLEVRER / Visual Reasoning / Counterfactual-per ques.
CLEVRER / Visual Reasoning / Descriptive
CLEVRER / Visual Reasoning / Explanatory-per opt.
CLEVRER / Visual Reasoning / Explanatory-per ques.
CLEVRER / Visual Reasoning / Predictive-per opt.
CLEVRER / Visual Reasoning / Predictive-per ques.
ClevrTex / Instance Segmentation / MSE
ClevrTex / Instance Segmentation / mIoU
ClevrTex / Unsupervised Object Segmentation / MSE
ClevrTex / Unsupervised Object Segmentation / mIoU

Statistics

Papers: 657
Benchmarks: 3

Links

Homepage

Tasks

Image Generation
Visual Question Answering
Visual Question Answering (VQA)
Visual Question Answering (VQA) Split A
Visual Question Answering (VQA) Split B