COCO-Facet

Images · Texts · CC BY 4.0 · Introduced 2025-05-21

COCO-Facet is a benchmark for attribute-focused text-to-image retrieval. It comprises 9,112 queries, each paired with 100 candidate images. The images come from COCO, and the annotations are derived from existing annotations of COCO images (COCO, Visual7W, VisDial, COCO-Stuff).
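
As a rough illustration of how a retrieval benchmark with a fixed candidate pool per query might be scored, the sketch below computes Recall@k in plain Python. It is only a minimal sketch under stated assumptions: the description above does not say which metric COCO-Facet reports or how many positives each query has, so Recall@k, the function name `recall_at_k`, and the toy data layout (one score list and one set of positive indices per query) are all hypothetical choices, not the benchmark's official protocol.

```python
from typing import Sequence, Set


def recall_at_k(
    scores: Sequence[Sequence[float]],
    positives: Sequence[Set[int]],
    k: int = 1,
) -> float:
    """Fraction of queries with at least one positive among the k top-scored candidates.

    scores[q][i] is a (hypothetical) model similarity between query q and
    candidate image i; positives[q] holds the indices of the correct images.
    """
    hits = 0
    for query_scores, query_positives in zip(scores, positives):
        # Rank candidate indices from highest to lowest similarity.
        ranked = sorted(
            range(len(query_scores)),
            key=lambda i: query_scores[i],
            reverse=True,
        )
        if any(i in query_positives for i in ranked[:k]):
            hits += 1
    return hits / len(scores)


# Toy data: 2 queries with 4 candidates each (the benchmark uses 100 per query).
toy_scores = [
    [0.1, 0.9, 0.3, 0.2],
    [0.8, 0.1, 0.5, 0.4],
]
toy_positives = [{1}, {2}]

print(recall_at_k(toy_scores, toy_positives, k=1))
print(recall_at_k(toy_scores, toy_positives, k=2))
```

With the toy data, the first query ranks its positive first while the second ranks it second, so the score rises when k goes from 1 to 2. For the real benchmark, each score list would hold 100 entries, one per candidate image.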