Papers With Code 2

A community resource for machine learning research: papers, code, benchmarks, and state-of-the-art results.


Data sourced from the PWC Archive (CC-BY-SA 4.0). Built by the community, for the community.

Datasets

3,275 machine learning datasets

Filter by Modality

  • Images (3,275)
  • Texts (3,148)
  • Videos (1,019)
  • Audio (486)
  • Medical (395)
  • 3D (383)
  • Time series (298)
  • Graphs (285)
  • Tabular (271)
  • Speech (199)
  • RGB-D (192)
  • Environment (148)
  • Point cloud (135)
  • Biomedical (123)
  • LiDAR (95)
  • RGB Video (87)
  • Tracking (78)
  • Biology (71)
  • Actions (68)
  • 3D meshes (65)
  • Tables (52)
  • Music (48)
  • EEG (45)
  • Hyperspectral images (45)
  • Stereo (44)
  • MRI (39)
  • Physics (32)
  • Interactive (29)
  • Dialog (25)
  • MIDI (22)
  • 6D (17)
  • Replay data (11)
  • Financial (10)
  • Ranking (10)
  • CAD (9)
  • fMRI (7)
  • Parallel (6)
  • Lyrics (2)
  • PSG (2)

3,275 dataset results

WiRLD (Wikidata Reference Logo Dataset)

The Wikidata Reference Logo Dataset (WiRLD) is a comprehensive collection of reference logos designed to address the challenges of large-scale logo identification. Recognizing the limitations of existing logo datasets, which often cover a restricted number of logo classes or lack public availability, the authors curated WiRLD to facilitate research on more realistic, large-scale logo identification tasks. WiRLD contains 100,000 reference logo images sourced from Wikidata, representing 100,000 distinct logo classes; each entity in the dataset has one corresponding logo image. Its focus on providing a vast, readily accessible collection of reference logos makes it particularly valuable for evaluating one-shot logo identification methods, especially in large-scale scenarios.

1 paper · 0 benchmarks · Images
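As context for readers new to the task, one-shot identification against a reference set like this is commonly framed as nearest-neighbour retrieval over image embeddings. A minimal sketch, assuming embeddings come from some feature extractor; `identify_logo` is a hypothetical helper, not part of the dataset release:

```python
import numpy as np

def identify_logo(query_emb: np.ndarray, reference_embs: np.ndarray) -> int:
    """Return the index of the reference logo whose embedding has the
    highest cosine similarity to the query embedding.

    query_emb: (d,) embedding of the query image.
    reference_embs: (n, d) matrix, one row per reference logo class.
    """
    refs = reference_embs / np.linalg.norm(reference_embs, axis=1, keepdims=True)
    q = query_emb / np.linalg.norm(query_emb)
    return int(np.argmax(refs @ q))
```

Because the dataset provides exactly one reference image per class, the top-1 neighbour directly yields a class prediction.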

DalleStreet

A dataset of images obtained from DALL-E 3 for 67 countries and 10 concept classes, similar to DollarStreet images.

1 paper · 0 benchmarks · Images, Texts

FP4S (Floor plan image segmentation via scribble-based semi-weakly-supervised learning)

We introduce a new style- and category-agnostic floor plan image parsing benchmark developed in collaboration with professional architectural designers. This benchmark includes 25 categories of space and adjacency labels (19 space elements and 6 adjacency elements), offering a more diverse and comprehensive representation of common design elements across various graphical styles and design categories. It sets a new standard for the level of diversity and complexity of floor plan image parsing tasks oriented towards real-world applications, far exceeding the scope of existing datasets. This benchmark is available at https://doi.org/10.7910/DVN/MDIRHE.

1 paper · 2 benchmarks · Images

WorldCuisines

Vision Language Models (VLMs) often struggle with culture-specific knowledge, particularly in languages other than English and in underrepresented cultural contexts. To evaluate their understanding of such knowledge, we introduce WorldCuisines, a massive-scale benchmark for multilingual and multicultural, visually grounded language understanding. The benchmark includes a visual question answering (VQA) dataset with text-image pairs across 30 languages and dialects, spanning 9 language families and featuring over 1 million data points, making it the largest multicultural VQA benchmark to date. It includes tasks for identifying dish names and their origins. We provide evaluation datasets in two sizes (12k and 60k instances) alongside a training dataset (1 million instances). Our findings show that while VLMs perform better with correct location context, they struggle with adversarial contexts and with predicting specific regional cuisines and languages. To support future research, we publicly release the benchmark.

1 paper · 0 benchmarks · Images, Texts

DeformPAM-Dataset (Dataset of DeformPAM)

Two versions of the dataset are offered: one is the full dataset used to train the models in DeformPAM, and the other is a mini dataset for easier examination. Both datasets include data for the supervised and finetuning stages of granular pile shaping, rope shaping, and T-shirt unfolding.

1 paper · 0 benchmarks · Actions, Images, Point cloud

Omni-Image

Omni-Image is built as a challenging but tractable dataset for continual learning and few-shot learning.

1 paper · 0 benchmarks · Images

Natural Environments Dataset

We provide synthetic reflectance, direct shading (shading due to surface geometry and illumination conditions), ambient light, and shadow-cast ground-truth images. The dataset contains garden- and park-like natural outdoor scenes including trees, plants, bushes, fences, etc. Scenes are rendered with different types of terrains, landscapes, and lighting conditions. Additionally, real HDR sky images with a parallel light source are used to provide realistic ambient light, and light source properties are designed to model daytime lighting conditions to enrich the photometric effects.

1 paper · 0 benchmarks · Images
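The ground-truth layers above can be recombined into an image under a simple intrinsic-image model. The multiplicative/additive form below is a common convention and an assumption on our part, not necessarily the exact equation used to render this dataset:

```python
import numpy as np

def compose_image(reflectance, direct_shading, shadow, ambient):
    """Recombine intrinsic layers into an RGB image, assuming
    I = R * (S_direct * shadow_mask + A_ambient).
    All inputs are float arrays in [0, 1] with broadcast-compatible shapes."""
    image = reflectance * (direct_shading * shadow + ambient)
    return np.clip(image, 0.0, 1.0)
```

With ground truth for every layer, such a composition also serves as a sanity check: a decomposition model's predicted layers should recompose close to the input image.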

ShapeNet Intrinsic Images v2.0 Extended

This release extends the synthetic ShapeNet intrinsic image decomposition dataset to 90,000 images; 50,000 of them were used for training the deep CNN models of CVIU'2021 (see Section 4 of that paper). It extends the first release of 20,000 images used for training the deep CNN models IntrinsicNet and RetiNet of CVPR'2018; see Section 4.1 of the CVPR paper for details of the data rendering.

1 paper · 0 benchmarks · Images

PolyMATH

PolyMATH is a challenging benchmark aimed at evaluating the general cognitive reasoning abilities of MLLMs. It comprises 5,000 manually collected, high-quality images of cognitive textual and visual challenges across 10 distinct categories, including pattern recognition, spatial reasoning, and relative reasoning.

1 paper · 0 benchmarks · Images, Texts

Dynamic Appearance Dataset

We study dynamic appearance models of both relightable (BRDF) and non-relightable (RGB) materials. For both we introduce new pilot datasets that allow, for the first time, the study of such phenomena: for RGB, we provide 22 dynamic textures acquired from free online sources; for BRDFs, we further acquire a dataset of 21 flash-lit videos of time-varying materials, enabled by a simple-to-construct setup.

1 paper · 0 benchmarks · Images, Videos

DRIFT (Domain-Adaptive Regression for Forest Monitoring)

The DRIFT dataset includes 25k image patches collected in five European countries, sourced from aerial and nanosatellite image archives. Each image patch is associated with three target variables to predict.

1 paper · 0 benchmarks · Hyperspectral images, Images

FurnitureBench DiffIK sim demos

Demonstration data for 4 FurnitureBench tasks collected with a SpaceMouse using a DiffIK Controller.

1 paper · 0 benchmarks · Images

MVX (Multimodal V2X)

MVX combines realistic physical-world simulation with a differentiable, accurate ray-tracing wireless simulation, providing multi-agent and multimodal datasets for AI-driven digital twin applications in vehicular communication systems.

1 paper · 1 benchmark · Images, LiDAR, Tabular, Videos

SPOTS-10 (Animal Pattern Benchmark Dataset for Machine Learning Algorithms)

The SPOTS-10 dataset is an extensive collection of grayscale images showcasing diverse patterns found in ten animal species. Specifically, SPOTS-10 contains 50,000 32×32 grayscale images divided into ten categories, with 5,000 images per category. The training set comprises 40,000 images, while the test set contains 10,000 images. SPOTS-10 is freely available by cloning the project GitHub repository: https://github.com/amotica/spots-10.

1 paper · 1 benchmark · Images

Uncertainty Quantification for Underwater Object Segmentation

This dataset extends the Semantic Segmentation of Underwater Imagery: Dataset and Benchmark, adding an uncertainty evaluation component. To facilitate uncertainty analysis, the test set incorporates a comprehensive range of perturbations, inspired by Benchmarking Neural Network Robustness to Common Corruptions and Perturbations, applied at four intensity levels. These perturbations, which preserve the original ground truth labels, encompass variations in brightness and contrast (simulating diverse lighting and object coloration), Gaussian and shot noise (reflecting low-light and discrete light properties), and impulse noise (resulting from bit errors). Additionally, defocus, motion, and zoom blurs are included, along with elastic transformations, pixelation from upscaling, and JPEG compression artifacts. This enhanced dataset enables an in-depth evaluation of model robustness, providing valuable insights into performance under a wide range of challenging, real-world underwater conditions.

1 paper · 0 benchmarks · Images
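A perturbation at graded intensity levels, as described above, can be sketched as follows; the four per-level standard deviations are illustrative values chosen here, not the benchmark's actual settings:

```python
import numpy as np

def gaussian_noise(image: np.ndarray, severity: int = 1, seed: int = 0) -> np.ndarray:
    """Apply Gaussian noise to a uint8 image at one of four intensity
    levels (severity 1-4). The perturbation changes pixel values only,
    so segmentation ground-truth labels remain valid."""
    sigmas = [0.04, 0.08, 0.12, 0.18]  # illustrative, one std dev per level
    rng = np.random.default_rng(seed)
    x = image.astype(np.float32) / 255.0
    x = x + rng.normal(0.0, sigmas[severity - 1], size=x.shape)
    return (np.clip(x, 0.0, 1.0) * 255.0).round().astype(np.uint8)
```

The same pattern generalizes to the other corruption families: each is a pure image-space transform parameterized by a severity index, applied to test images while the masks are left untouched.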

CausalChaos! (CausalChaos!QA)

CausalChaos! is a dataset for causal video question answering based on Tom and Jerry cartoons. It features longer causal chains embedded in dynamic visual scenes, as well as challenging incorrect options, especially the Causal Confusion set, which contains causally confounding incorrect options. All of these factors prove challenging for current VLMs and other traditional video question answering models.

1 paper · 0 benchmarks · Images, RGB Video, Videos

ICIAR 2018 Grand Challenge on Breast Cancer Histology Images

The dataset is composed of Hematoxylin and Eosin (H&E) stained breast histology microscopy images and whole-slide images. Challenge participants can evaluate the performance of their method on either or both sets of images.

1 paper · 2 benchmarks · Images

StainDoc

The datasets used in our WACV paper "High-Fidelity Document Stain Removal via a Large-Scale Real-World Dataset and a Memory-Augmented Transformer". StainDoc is a newly created document stain dataset; StainDoc_mark and StainDoc_seal are synthetic datasets.

1 paper · 0 benchmarks · Images

IndraEye (IndraEye: Infrared Electro-Optical Drone-based Aerial Object Detection Dataset)

Deep neural networks (DNNs) demonstrate superior performance when trained on well-illuminated environments where images are captured by an Electro-Optical (EO) camera, which offers rich texture content. In critical applications such as aerial surveillance, maintaining consistent reliability of DNNs throughout all times of day is paramount, including low-light conditions where EO cameras often struggle to capture relevant details. Furthermore, UAV-based aerial object detection encounters significant scale variability stemming from varying altitudes and slant angles, introducing an additional layer of complexity. Existing approaches consider only illumination changes/style variations as the domain shift, while in aerial surveillance correlation shifts also act as a hindrance to DNN performance. In this paper, we propose a multi-sensor (EO-IR) labelled object detection dataset consisting of 5,276 images with 142,991 instances covering multiple viewing angles.

1 paper · 0 benchmarks · Images

PASSION dataset (PASSION derm 2024 dataset)

PASSION derm is a pioneering initiative dedicated to closing the diversity gap in dermatology datasets. The project provides a unique dataset of skin condition images from Sub-Saharan Africa, with a focus on richly pigmented skin. The dataset is designed to emulate teledermatology settings and includes images of common pediatric skin conditions, such as eczema, fungal infections, scabies, and impetigo, in diverse quality and resolution. PASSION derm aims to improve access to dermatologic care in regions with limited healthcare resources.

1 paper · 0 benchmarks · Images, Medical
Page 142 of 164