Recent times have witnessed an increasing number of applications of deep neural networks to tasks that require superior cognitive abilities, e.g., playing Go, generating art, and ChatGPT. Such dramatic progress raises the question: how generalizable are neural networks in solving problems that demand broad skills? To answer this question, we propose SMART: a Simple Multimodal Algorithmic Reasoning Task (and the associated SMART-101 dataset) for evaluating the abstraction, deduction, and generalization abilities of neural networks in solving visuo-linguistic puzzles designed specifically for children aged 6--8. Our dataset consists of 101 unique puzzles; each puzzle comprises a picture and a question, and solving it requires a mix of several elementary skills, including pattern recognition, algebra, and spatial reasoning, among others. To train deep neural networks, we programmatically augment each puzzle to 2,000 new instances, each varied in appearance.
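To make the augmentation idea concrete, here is a minimal sketch of how a parameterized puzzle template could be expanded into many instances by re-sampling its free parameters. This is an illustrative assumption, not the SMART-101 generation code; all names below are hypothetical.

```python
import random

def augment_puzzle(template, n_instances=2000, seed=0):
    """Hypothetical sketch: re-sample a template's free parameters so the
    question varies across instances while the underlying skill
    (here, simple algebra) stays fixed."""
    rng = random.Random(seed)
    instances = []
    for _ in range(n_instances):
        a, b = rng.randint(2, 9), rng.randint(2, 9)
        instances.append({
            "question": template.format(a=a, b=b),
            "answer": a * b,  # ground truth for this algebraic template
        })
    return instances

variants = augment_puzzle("A box holds {a} rows of {b} marbles. How many marbles?")
print(len(variants), variants[0])
```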
Large Scale Multi-Illuminant (LSMI) Dataset for Developing White Balance Algorithm under Mixed Illumination (ICCV 2021)
The data and audio included here were collected for the Soundscape Attributes Translation Project (SATP). First introduced in Aletta et al. (2020), the SATP is an attempt to provide validated translations of soundscape attributes in languages other than English. The recordings were used for headphone-based listening experiments.
PEIR Gross (Jing et al., 2018) was collected from the Gross sub-collection of the PEIR digital library, resulting in 7,442 image-caption pairs across 21 different sub-categories. Each caption contains only one sentence.
A dataset of 12-lead ECGs with annotations. The dataset contains 345,779 exams from 233,770 patients, obtained through stratified sampling from the CODE dataset (15% of the patients). The data was collected by the Telehealth Network of Minas Gerais between 2010 and 2016.
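For intuition, a minimal sketch of patient-level stratified sampling of the kind described above: patients are sampled within strata, then all of a sampled patient's exams are kept, so no patient straddles the sample boundary. The column names (`patient_id`, `age_group`) are assumptions for illustration, not the dataset's actual schema.

```python
import pandas as pd

def sample_patients(exams: pd.DataFrame, frac=0.15, strata="age_group", seed=42):
    """Sample ~`frac` of patients, stratified on a patient-level attribute,
    then return every exam belonging to a sampled patient."""
    patients = exams.drop_duplicates("patient_id")[["patient_id", strata]]
    sampled = (patients.groupby(strata, group_keys=False)
                       .apply(lambda g: g.sample(frac=frac, random_state=seed)))
    return exams[exams["patient_id"].isin(sampled["patient_id"])]
```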
Rad-ReStruct is a fine-grained structured reporting dataset for chest X-ray images. The structured reporting process is modeled as a hierarchical VQA task: recognizing different findings in different body regions and predicting their attributes.
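To illustrate the hierarchical-VQA framing, a hypothetical sketch that traverses a nested report (body region, then finding, then attributes) as a sequence of question-answer pairs. The field names are illustrative assumptions, not Rad-ReStruct's actual schema.

```python
# Hypothetical nested report: body region -> finding -> attributes.
report = {
    "lung": {"opacity": {"present": True, "localization": ["left lower lobe"]}},
    "heart": {"cardiomegaly": {"present": False}},
}

def to_questions(report):
    """Flatten the hierarchy into (question, answer) pairs, asking about
    attributes only when the parent finding is present."""
    qa = []
    for region, findings in report.items():
        for finding, attrs in findings.items():
            qa.append((f"Is there {finding} in the {region}?", attrs["present"]))
            if attrs["present"]:
                for name, value in attrs.items():
                    if name != "present":
                        qa.append((f"What is the {name} of the {finding}?", value))
    return qa

for q, a in to_questions(report):
    print(q, "->", a)
```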
DialogStudio is a meticulously curated collection of dialogue datasets. These datasets are unified under a consistent format while retaining their original information. We incorporate domain-aware prompts and identify dataset licenses, making DialogStudio an exceptionally rich and diverse resource for dialogue research and model training.
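If the collection is consumed through the Hugging Face `datasets` library, loading a single sub-dataset might look like the sketch below. The Hub path `Salesforce/dialogstudio` and the config name `MULTIWOZ2_2` are assumptions based on the public release; verify both against the project page before use.

```python
from datasets import load_dataset

# Assumed Hub path and config name; trust_remote_code runs the
# repository's dataset loading script.
dataset = load_dataset("Salesforce/dialogstudio", "MULTIWOZ2_2",
                       trust_remote_code=True)
print(dataset["train"][0])
```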
The official dataset contains a training set (137 images), a validation set (4 images), and a testing set (10 images).
SuperCLUE is a Chinese language model evaluation benchmark, named after the popular Chinese LLM benchmark CLUE. SuperCLUE encompasses three sub-tasks: actual users' queries and ratings derived from an LLM battle platform (CArena), open-ended questions with single- and multiple-turn dialogues (OPEN), and closed-ended questions with the same stems as the open-ended single-turn ones (CLOSE).
We present a further analysis of visual modality incompleteness, benchmarking the latest MMEA models on our proposed dataset, MMEA-UMVM.
We collected a new low-light raw denoising (LRD) dataset for training and benchmarking. In contrast to the SID dataset, which sets a fixed exposure time for capturing long- and short-exposure images, we captured long- and short-exposure images based on the exposure value (EV). Motivated by multi-exposure image fusion, the exposure value for long-exposure images was set to 0, and the exposure values for short exposures were set to the commonly used -1, -2, and -3. The dataset is designed for low-light raw image denoising and low-light raw image synthesis, and contains both indoor and outdoor scenes. For each scene instance, we first captured a long-exposure image at ISO 100 to obtain a noise-free reference image. We then captured multiple short-exposure images at different ISO levels and EVs, with a 1-2 second interval between subsequent captures to let the sensor cool down, avoiding unexpected noise introduced by sensor heating.
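For intuition about the EV-based protocol, a small sketch of the standard exposure-value relation: relative to the EV 0 reference, each -1 EV step halves the exposure. The 1-second reference time below is an assumed placeholder, not the dataset's actual setting.

```python
def short_exposure_time(long_exposure_time: float, ev: int) -> float:
    """Exposure time at a given EV offset from the EV 0 long exposure:
    each -1 EV step halves the collected light (2**ev scaling)."""
    return long_exposure_time * (2 ** ev)

t_long = 1.0  # assumed 1 s long exposure at EV 0, ISO 100 (placeholder)
for ev in (0, -1, -2, -3):
    print(f"EV {ev:+d}: {short_exposure_time(t_long, ev):.3f} s")
```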
An extension of DAIR-V2X with additional temporal information.
MedShapeNet contains over 100,000 medical shapes, including bones, organs, vessels, and muscles, as well as surgical instruments. You can search for shapes, display them in 3D, and download individual shapes using our shape search engine. Note that MedShapeNet is provided for research and educational purposes only.
LSMDC-E contains 20,151 training samples, 1,477 validation samples, and 2,005 test samples, and is derived from LSMDC 2021. We take the first four sentences of every five-sentence story as the story context and the last sentence as the story ending. Since every sentence in LSMDC relates to a movie frame set, we take the last frame set as the ending-related image set for IgSEG, as sketched below.
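A minimal sketch of that context/ending split, assuming parallel lists of sentences and frame sets per story; the structure is an illustrative stand-in for the actual LSMDC-E format.

```python
def split_story(sentences, frame_sets):
    """Split a five-sentence story: the first four sentences form the
    context, the last sentence is the ending, and the last frame set is
    the ending-related image set."""
    assert len(sentences) == 5 and len(frame_sets) == 5
    return {
        "context": sentences[:4],
        "ending": sentences[4],
        "ending_images": frame_sets[4],
    }
```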
The RAD-ChestCT dataset is a large medical imaging dataset developed by Rachel Draelos, MD/PhD, during her Computer Science PhD at Duke supervised by Lawrence Carin. The full dataset includes 35,747 chest CT scans from 19,661 adult patients. The public Zenodo repository contains an initial release of 3,630 chest CT scans, approximately 10% of the dataset. This dataset is of significant interest to the machine learning and medical imaging research communities.
OpenImage-O is an OOD benchmark built for the ID dataset ImageNet-1k. It is manually annotated, has a naturally diverse distribution, and is large in scale, overcoming several shortcomings of existing OOD benchmarks. OpenImage-O is filtered image-by-image from the test set of OpenImage-V3, which was collected from Flickr without a predefined list of class names or tags, leading to natural class statistics and avoiding an initial design bias.
Diverse datasets for offline multi-agent reinforcement learning (MARL) research, including datasets for popular MARL benchmark environments.
VideoCube is a high-quality, large-scale benchmark that creates a challenging real-world experimental environment for Global Instance Tracking (GIT). MGIT is a high-quality, multi-modal benchmark based on VideoCube-Tiny, designed to fully represent the complex spatio-temporal and causal relationships coupled in longer narrative content.
An enlarged dataset for studying how image backgrounds affect computer vision models. It covers the following topics: blurred backgrounds, segmented backgrounds, AI-generated backgrounds, annotation-tool bias, background color, dependent factors in the background, latent-space distance of the foreground, and random backgrounds in real environments.
RealCQA: Scientific Chart Question Answering as a Test-bed for First-Order Logic