This dataset supports the research detailed in the pre-print "Virtual Imaging Trials Improved the Transparency and Reliability of AI Systems in COVID-19 Imaging." The study employs both clinical and simulated CT data to evaluate AI models for COVID-19 diagnosis. By leveraging the Virtual Imaging Trials (VIT) framework, the research addresses reproducibility and generalizability issues prevalent in medical imaging AI models.
A fully synthetic dataset of drones generated using structured domain randomization. It contains multiple datasets generated using different styles:
- Drones only
- Drones and Birds
- Generic Distractors
- Realistic Distractors
- Random Backgrounds
COMFORT is an evaluation protocol for systematically assessing the spatial reasoning capabilities of vision-language models (VLMs).
This repository contains data for the NeurIPS conference paper titled "Harnessing Machine Learning for Single-Shot Measurement of Free Electron Laser Pulse Power".
Due to the free-form nature of the open-vocabulary image classification task, special annotations are required for the image sets used in evaluation. Three such image datasets are presented here:
DAVIS-Edit is a curated testing benchmark for video editing. This dataset covers two evaluation settings, i.e., text- and image-based editing. In addition, it offers two types of annotations for both prompt modalities, covering editing scenarios with similar (DAVIS-Edit-S) and changing (DAVIS-Edit-C) shapes, so as to address the shape-inconsistency problem in video-to-video editing.
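A minimal sketch of how one might enumerate the benchmark's four subsets (two prompt modalities × two shape variants); the directory layout and names used here (`DAVIS-Edit-S/`, `DAVIS-Edit-C/`, `text/`, `image/`) are assumptions for illustration, not the released structure:

```python
import os

# Hypothetical root folder; the actual archive layout may differ.
DAVIS_EDIT_ROOT = "DAVIS-Edit"

def iter_edit_tasks(root=DAVIS_EDIT_ROOT):
    """Yield (setting, variant, video_name) for every editing task."""
    for setting in ("text", "image"):   # prompt modality
        for variant in ("S", "C"):      # similar vs. changing shapes
            subset = os.path.join(root, f"DAVIS-Edit-{variant}", setting)
            for video_name in sorted(os.listdir(subset)):
                yield setting, variant, video_name

for setting, variant, video in iter_edit_tasks():
    print(f"[{setting}-based / DAVIS-Edit-{variant}] {video}")
```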
This dataset contains synthetic images extracted from the CARLA simulator, along with rich information extracted from the deferred rendering pipeline of Unreal Engine 4. Its main purpose is training the state-of-the-art image-to-image translation model proposed by Intel Labs in "Enhancing Photorealism Enhancement" (EPE). Translation results from the model targeting the characteristics of Cityscapes, KITTI, and Mapillary Vistas are also provided. Computer vision models trained on these data are expected to perform better when deployed in the real world.
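A small sketch of how such a sample might be assembled, pairing a rendered frame with its auxiliary G-buffer maps; the folder layout and buffer names below are assumptions, not the archive's documented structure:

```python
import os
import numpy as np
from PIL import Image

# Illustrative G-buffer names; the released channels may differ.
GBUFFERS = ("albedo", "normals", "depth", "roughness")

def load_sample(root, frame_id):
    """Pair a rendered CARLA frame with its per-pixel G-buffer maps
    from UE4's deferred rendering pipeline (assumed layout)."""
    rgb = np.asarray(Image.open(os.path.join(root, "rgb", f"{frame_id}.png")))
    gbuffers = {
        name: np.asarray(Image.open(os.path.join(root, name, f"{frame_id}.png")))
        for name in GBUFFERS
    }
    return rgb, gbuffers
```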
We established a large-scale plant disease segmentation dataset named PlantSeg. PlantSeg comprises more than 11,400 images of 115 different plant diseases from various environments, each annotated with a corresponding segmentation label for the diseased parts. To the best of our knowledge, PlantSeg is the largest plant disease segmentation dataset containing in-the-wild images. Our dataset enables researchers to evaluate their models and provides a solid foundation for the development and benchmarking of plant disease segmentation algorithms.
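A minimal PyTorch-style loading sketch for the (image, segmentation mask) pairs, assuming an `images/` and `annotations/` folder layout with same-named files; the actual release may organize files differently:

```python
import os
from PIL import Image
from torch.utils.data import Dataset

class PlantSegDataset(Dataset):
    """Loads image/mask pairs; folder names here are assumed."""

    def __init__(self, root, transform=None):
        self.image_dir = os.path.join(root, "images")        # assumed
        self.mask_dir = os.path.join(root, "annotations")    # assumed
        self.names = sorted(os.listdir(self.image_dir))
        self.transform = transform

    def __len__(self):
        return len(self.names)

    def __getitem__(self, idx):
        name = self.names[idx]
        image = Image.open(os.path.join(self.image_dir, name)).convert("RGB")
        base, _ = os.path.splitext(name)
        mask = Image.open(os.path.join(self.mask_dir, base + ".png"))
        if self.transform is not None:
            image, mask = self.transform(image, mask)
        return image, mask
```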
WiFiCam dataset for through-wall imaging based on WiFi channel state information. The corresponding source code repository is located at: https://github.com/StrohmayerJ/wificam
The SuSy Dataset combines authentic photographs with AI-generated images and is designed for training and evaluating synthetic-image detection models. It contains over 25,000 images from six different sources, including real-world photographs from COCO and synthetic images created by state-of-the-art diffusion models such as DALL-E 3, Midjourney, and Stable Diffusion.
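For a binary real-vs-synthetic detector, one might collapse the six sources into two labels; the source identifiers below are illustrative guesses, not the dataset's official metadata keys:

```python
# Hypothetical source names; the released metadata may use other identifiers.
SOURCE_TO_LABEL = {
    "coco": 0,               # authentic photograph
    "dalle3": 1,             # AI-generated
    "midjourney": 1,
    "stable-diffusion": 1,
}

def binary_label(source: str) -> int:
    """Map a source identifier to a real (0) / synthetic (1) label."""
    return SOURCE_TO_LABEL[source.lower()]

print(binary_label("COCO"), binary_label("Midjourney"))  # 0 1
```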
Temporal Dataset for Indoor and In-Vehicle Thermal Comfort Estimation

Thermal comfort estimation is essential for enhancing user experience in static indoor environments and dynamic in-vehicle scenarios. While traditional datasets focus on buildings, their application to fast-changing conditions, such as in vehicles, remains unexplored. We address this gap by introducing two temporal datasets collected from (1) a self-built climatic chamber with 31 sensor signals and user-labeled ratings from 18 participants and (2) in-vehicle studies with 20 participants in a BMW 3 Series.
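A common way to use such temporal data is to cut the sensor streams into fixed-length windows, each paired with the comfort rating at the window's end; the window length, stride, and (T, 31) signal layout below are assumptions for illustration:

```python
import numpy as np

def sliding_windows(signals: np.ndarray, labels: np.ndarray,
                    window: int = 60, stride: int = 10):
    """Cut a (T, 31) sensor matrix into overlapping windows, each
    paired with the rating at the window's last time step."""
    xs, ys = [], []
    for start in range(0, signals.shape[0] - window + 1, stride):
        end = start + window
        xs.append(signals[start:end])
        ys.append(labels[end - 1])
    return np.stack(xs), np.array(ys)

# Example with random stand-in data: 1000 time steps, 31 sensor channels.
X, y = sliding_windows(np.random.randn(1000, 31),
                       np.random.randint(-3, 4, 1000))
print(X.shape, y.shape)  # (95, 60, 31) (95,)
```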
In order to evaluate the effectiveness of NToP in real-world scenarios, we collected a new dataset, OmniLab, with a top-view omnidirectional camera mounted on the ceiling of two different rooms (bedroom, living room) at 2.5 m height. Five actors (3 males, 2 females) perform 16 actions from the CMU MoCap database (brooming, cleaning windows, down and get up, drinking, fall-on-face, in chair and stand up, pull object, push object, rugpull, turn left, turn right, upbend from knees, upbend from waist, up from ground, walk, walk-old-man) in the two rooms with varying clothes. The recorded action length is 2.5 s, which yields 60 images per scene at a frame rate of 24 FPS. The camera position is fixed and the image resolution is 1200 by 1200 pixels. A total of 4800 frames are collected. Annotations of 17 keypoints conforming to COCO conventions are estimated with a keypoint detector and subsequently refined by four different annotators in two rounds to ensure high annotation quality.
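The stated totals are self-consistent; a quick sanity check, assuming one 2.5 s clip per actor-action scene (the scene accounting across the two rooms is inferred from the totals, not stated explicitly):

```python
# 5 actors x 16 actions, one 2.5 s clip per (actor, action) scene at 24 FPS.
actors, actions = 5, 16
fps, clip_seconds = 24, 2.5

frames_per_scene = int(fps * clip_seconds)       # 60 images per scene
total_frames = actors * actions * frames_per_scene
print(frames_per_scene, total_frames)            # 60 4800
```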
Human pose estimation (HPE) in the top view using fisheye cameras presents a promising and innovative application domain. However, the availability of datasets capturing this viewpoint is extremely limited, especially those with high-quality 2D and 3D keypoint annotations. Addressing this gap, we leverage the Neural Radiance Fields (NeRF) technique to establish a comprehensive pipeline for generating human pose datasets from existing 2D and 3D datasets, specifically tailored to the top-view fisheye perspective. Through this pipeline, we create a novel dataset, NToP (NeRF-powered Top-view human Pose dataset for fisheye cameras), with over 570 thousand images, and conduct an extensive evaluation of its efficacy in enhancing neural networks for 2D and 3D top-view human pose estimation. Extensive evaluations on existing top-view 2D and 3D HPE datasets, as well as our new real-world top-view 2D HPE dataset OmniLab, show that our dataset is effective and surpasses previous datasets.
These images show Bacillus subtilis bacteria in suspension, captured with a digital microscope. The fluorescent bacteria dataset can be generated as desired by specifying the number of bacteria per image and the total number of images. It provides images at 3280x2464 resolution together with the centroid location of each bacterium, useful for enumeration or density map estimation.
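Given the centroid annotations, a ground-truth density map for counting can be built by placing a unit impulse at each centroid and blurring it with a Gaussian kernel, so the map integrates to the bacteria count; the kernel width below is an assumption:

```python
import numpy as np
from scipy.ndimage import gaussian_filter

def density_map(centroids, shape=(2464, 3280), sigma=8.0):
    """Place a unit impulse at each (x, y) centroid and blur with a
    Gaussian, preserving the total count. sigma=8.0 is an assumption."""
    dmap = np.zeros(shape, dtype=np.float32)
    for x, y in centroids:
        dmap[int(round(y)), int(round(x))] += 1.0
    return gaussian_filter(dmap, sigma=sigma)

dmap = density_map([(100.5, 200.2), (1500.0, 900.7)])
print(dmap.sum())  # ~2.0, the number of annotated bacteria
```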
To study the problem of weakly supervised attended object detection in cultural sites, we collected and labeled a dataset of egocentric images acquired from subjects visiting a cultural site. The dataset has been designed to offer a snapshot of the subject’s visual experience while visiting a museum and contains labels for several artworks and details attended by the subjects.
Scene Graph Generation (SGG) converts visual scenes into structured graph representations, providing deeper scene understanding for complex vision tasks. However, existing SGG models often overlook essential spatial relationships and struggle to generalize in open-vocabulary contexts. To address these limitations, we propose LLaVA-SpaceSGG, a multimodal large language model (MLLM) designed for open-vocabulary SGG with enhanced spatial relation modeling. To train it, we construct an SGG instruction-tuning dataset named SpaceSGG, built by combining publicly available datasets and synthesizing data with open-source models within our data construction pipeline. It combines object locations, object relations, and depth information, resulting in three data formats: spatial SGG description, question answering, and conversation. To enhance the transfer of MLLMs' inherent capabilities to the SGG task, we introduce a two-stage training paradigm. Experiments show that
IllusionMNIST_test

IllusionMNIST_test is a generated dataset derived from MNIST. It introduces a novel element of pareidolia, a phenomenon where patterns, often faces, are perceived in random or abstract stimuli. The dataset contains 11 classes: the original 10 MNIST digits plus an additional "No Illusion" class. It includes 1,219 samples, all synthetically created rather than real-world images.
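A loading sketch using the Hugging Face `datasets` library; the hub path, `label` field, and position of the "No Illusion" class are hypothetical placeholders to check against the actual dataset card:

```python
from datasets import load_dataset

# Hypothetical hub path; consult the dataset card for the real identifier.
ds = load_dataset("IllusionMNIST", split="test")

# Assumed class ordering: digits 0-9 followed by the extra class.
CLASSES = [str(d) for d in range(10)] + ["No Illusion"]
print(len(ds), CLASSES[ds[0]["label"]])
```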