SpatialVOC2K

A multilingual image dataset with spatial relation annotations and object features for image-to-text generation, built from 2,026 images in the PASCAL VOC2008 dataset.

Source: "SpatialVOC2K: A Multilingual Dataset of Images with Annotations and Features for Spatial Relations between Objects"