3,275 machine learning datasets
The ULI-RI dataset is generated using Unreal Engine 4 to simulate various outdoor environments with 115 high-quality 3D human models. For each person identity, we controlled and quantitatively labeled the illumination intensity, viewpoint (model z-rotation angle), and background to create 512 images. In total, there are 115 × 512 = 58,880 images in the ULI-RI dataset.
50K synthetic renders of the human foot, with surface normals, masks and keypoints.
Overview The Surgical Instruments Recognition Dataset is a groundbreaking collection of high-resolution images (1280x960 pixels) specifically designed for the recognition and categorization of surgical instruments. This dataset captures the intricate details and complexity of surgical tools, particularly when arranged in scenarios reminiscent of an operating room.
Intrinsic component extension of MIT Multi-Illumination Dataset proposed in the paper "Intrinsic Image Decomposition via Ordinal Shading", Chris Careaga and Yağız Aksoy, ACM Transactions on Graphics, 2023
The CryoPPP dataset consists of ground truth data for 34 EMPIAR entries and metadata for 335 EMPIAR IDs. The ground truth data comprises 9,893 micrographs (~300 cryo-EM images per EMPIAR ID) with manually curated coordinates of picked protein particles. The metadata covers 1,698,802 high-resolution micrographs deposited in EMPIAR, along with their respective FTP and Globus download paths.
The ITCPR dataset is a comprehensive collection specifically designed for the Zero-Shot Composed Person Retrieval (ZS-CPR) task. It consists of a total of 2,225 annotated triplets, derived from three distinct datasets: Celeb-reID, PRCC, and LAST.
The CLCXray dataset contains 9,565 X-ray images: 4,543 (real data) obtained from a real subway scene and 5,022 (simulated data) scanned from manually designed baggage. The dataset covers 12 categories: 5 types of cutters (blade, dagger, knife, scissors, and Swiss Army knife) and 7 types of liquid containers (can, carton drink, glass bottle, plastic bottle, vacuum cup, spray can, and tin). Annotations are provided in COCO format.
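Since the annotations follow the COCO format, they can be read with a standard COCO-style loader. The sketch below parses a minimal, invented toy annotation file (the file contents and values are hypothetical; only the category names follow the description above) and groups labeled bounding boxes per image:

```python
import json

# A minimal COCO-format annotation file (hypothetical toy example; the
# category names follow the CLCXray description, but all values are invented).
coco = json.loads("""
{
  "images": [{"id": 1, "file_name": "xray_0001.png", "width": 1280, "height": 960}],
  "categories": [{"id": 1, "name": "blade"}, {"id": 2, "name": "cans"}],
  "annotations": [
    {"id": 10, "image_id": 1, "category_id": 1, "bbox": [100, 150, 40, 60]}
  ]
}
""")

# Index category names and group (label, bbox) pairs per image,
# the way a typical COCO loader does. COCO bboxes are [x, y, width, height].
cat_names = {c["id"]: c["name"] for c in coco["categories"]}
boxes_by_image = {}
for ann in coco["annotations"]:
    boxes_by_image.setdefault(ann["image_id"], []).append(
        (cat_names[ann["category_id"]], ann["bbox"])
    )

print(boxes_by_image[1])  # [('blade', [100, 150, 40, 60])]
```

The same pattern works with the real annotation file, or with `pycocotools` for larger-scale use.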
The Algonauts 2023 Challenge focuses on predicting responses in the human brain as participants perceive complex natural visual scenes. Through collaboration with the Natural Scenes Dataset (NSD) team, the Challenge runs on the largest suitable brain dataset available, opening new venues for data-hungry modeling.
A pioneering dataset for vignette removal. Vigset includes 983 pairs of vignetted and vignetting-free high-resolution (5340×3697) real-world images captured under various conditions.
The amount of data accessible through internet search engines can reach hundreds of millions, or even billions, of samples. Such large weakly labeled databases have gained importance in the training of face recognition algorithms. Starting from the publicly available YFCC100M, we propose a weakly labeled subset for multi-label face recognition with self-supervised methods. The subset contains 392K images at 128×128 resolution, obtained by querying YFCC100M for 40 facial attributes. We have made this dataset publicly available.
MapReader in GeoHumanities workshop (SIGSPATIAL 2022): Gold standards and outputs
Automating the creation of catalogues for radio galaxies in next-generation deep surveys necessitates the identification of components within extended sources and their respective infrared hosts. We present RadioGalaxyNET, a multimodal dataset, tailored for machine learning tasks to streamline the automated detection and localization of multi-component extended radio galaxies and their associated infrared hosts. The dataset encompasses 4,155 instances of galaxies across 2,800 images, incorporating both radio and infrared channels. Each instance furnishes details about the extended radio galaxy class, a bounding box covering all components, a pixel-level segmentation mask, and the keypoint position of the corresponding infrared host galaxy. RadioGalaxyNET is the first dataset to include images from the highly sensitive Australian Square Kilometre Array Pathfinder (ASKAP) radio telescope, corresponding infrared images, and instance-level annotations for galaxy detection.
A synthetic dataset comprising three different environments for multi-camera dynamic novel view synthesis for soccer. The dataset is compatible with Nerfstudio and includes data parsers with various settings to reproduce the experiments of our paper "Dynamic NeRFs for Soccer Scenes" and more.
topex-printer is a dataset of 102 machine parts from a label printing machine. Each part is provided in two domains: real photos and CAD-rendered models.
Neural fields (NeFs) have recently emerged as a versatile method for modeling signals of various modalities, including images, shapes, and scenes. Subsequently, many works have explored the use of NeFs as representations for downstream tasks, e.g. classifying an image based on the parameters of a NeF that has been fit to it. However, the impact of the NeF hyperparameters on their quality as downstream representation is scarcely understood and remains largely unexplored. This is partly caused by the large amount of time required to fit datasets of neural fields.
The Generic Object Decoding (GOD) Dataset is a specialized resource developed for fMRI-based decoding. It aggregates fMRI data gathered through the presentation of images from 200 representative object categories, originating from the 2011 fall release of ImageNet. The training session incorporated 1,200 images (8 per category from 150 distinct object categories). In contrast, the test session included 50 images (one from each of 50 object categories). It is noteworthy that the categories in the test session were distinct from those in the training session and were presented in a randomized sequence across runs. fMRI scanning was conducted on five subjects.
ODSI-DB is an image database of oral and dental reflectance spectral images of human test subjects. Image sets of the test subjects contain the front view and the occlusal surfaces of the lower and upper teeth, the oral mucosa, and the face surrounding the mouth. Other features of interest have been imaged on a case-by-case basis. The spectral images in the database have been annotated by dental experts.
The dataset includes polarimetric, RGB and depth automotive (on the road) data.
Recent advances in large language models have led to the development of multimodal LLMs (MLLMs), which take both image data and text as input. Virtually all of these models have been announced within the past year, leading to a significant need for benchmarks evaluating the abilities of these models to reason truthfully and accurately on a diverse set of tasks. When Google announced Gemini (Gemini Team et al., 2023), they showcased its ability to solve rebuses: wordplay puzzles which involve creatively adding and subtracting letters from words derived from text and images. The diversity of rebuses allows for a broad evaluation of multimodal reasoning capabilities, including image recognition, multi-step reasoning, and understanding the human creator's intent. We present REBUS: a collection of 333 hand-crafted rebuses spanning 13 diverse categories, including hand-drawn and digital images created by nine contributors. Samples are presented in Table 1. Notably, GPT-4V, the most powe