Papers With Code 2

A community resource for machine learning research: papers, code, benchmarks, and state-of-the-art results.


Data sourced from the PWC Archive (CC-BY-SA 4.0). Built by the community, for the community.


GenEval

Images | Texts | MIT license | Introduced 2023-10-17

Recent breakthroughs in diffusion models, multimodal pretraining, and efficient finetuning have led to an explosion of text-to-image generative models. Given that human evaluation is expensive and difficult to scale, automated methods are critical for evaluating the increasingly large number of new models. However, most current automated evaluation metrics like FID or CLIPScore only offer a holistic measure of image quality or image-text alignment, and are unsuited for fine-grained or instance-level analysis. In this paper, we introduce GenEval, an object-focused framework to evaluate compositional image properties such as object co-occurrence, position, count, and color. We show that current object detection models can be leveraged to evaluate text-to-image models on a variety of generation tasks with strong human agreement, and that other discriminative vision models can be linked to this pipeline to further verify properties like object color. We then evaluate several open-source text-to-image models and analyze their relative generative capabilities on our benchmark. We find that recent models demonstrate significant improvement on these tasks, though they are still lacking in complex capabilities such as spatial relations and attribute binding. Finally, we demonstrate how GenEval might be used to help discover existing failure modes, in order to inform development of the next generation of text-to-image models. Our code to run the GenEval framework is publicly available.
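The object-focused idea in the description above can be illustrated with a minimal sketch. It assumes an object detector has already produced labeled bounding boxes for a generated image, and it checks co-occurrence, count, and a simple left/right relation against what the prompt asked for. The `Detection` class and the `check_*` helpers are hypothetical names introduced here for illustration; this is not the GenEval implementation, and color verification (which the description says relies on an additional discriminative model) is left out.

```python
from collections import Counter
from dataclasses import dataclass


@dataclass
class Detection:
    """One detected object (hypothetical structure): label plus pixel bounding box."""
    label: str
    x_min: float
    y_min: float
    x_max: float
    y_max: float

    @property
    def center_x(self) -> float:
        # Horizontal center of the box, used for the left/right comparison.
        return (self.x_min + self.x_max) / 2


def check_cooccurrence(detections, labels):
    """True if every requested label is detected at least once."""
    found = {d.label for d in detections}
    return all(label in found for label in labels)


def check_counts(detections, required):
    """True if each label appears exactly the requested number of times."""
    counts = Counter(d.label for d in detections)
    return all(counts[label] == n for label, n in required.items())


def check_left_of(detections, a, b):
    """True if some object `a` has its center to the left of some object `b`."""
    a_centers = [d.center_x for d in detections if d.label == a]
    b_centers = [d.center_x for d in detections if d.label == b]
    return any(xa < xb for xa in a_centers for xb in b_centers)


# Made-up detections for a prompt like "a dog to the left of two cats".
detections = [
    Detection("dog", 10, 40, 110, 140),
    Detection("cat", 200, 50, 300, 150),
    Detection("cat", 320, 60, 420, 160),
]
passed = (
    check_cooccurrence(detections, ["dog", "cat"])
    and check_counts(detections, {"dog": 1, "cat": 2})
    and check_left_of(detections, "dog", "cat")
)
```

Under this sketch, an image passes only if every property named in its prompt is verified, so each image yields a binary result that can be averaged over a set of prompts. How the actual framework thresholds detections or defines spatial relations is not specified in this page.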

Benchmarks

1 Image, 2*2 Stitchi: Overall, Single Obj., Two Obj., Color Attri., Colors, Counting, Position
10-shot image generation: Overall, Single Obj., Two Obj., Color Attri., Colors, Counting, Position
Image Generation: Overall, Single Obj., Two Obj., Color Attri., Colors, Counting, Position
Text-to-Image Generation: Overall, Single Obj., Two Obj., Color Attri., Colors, Counting, Position

Statistics

Papers: 100
Benchmarks: 28


Tasks

1 Image, 2*2 Stitchi
10-shot image generation
Image Generation
Text-to-Image Generation