19,997 machine learning datasets
19,997 dataset results
IU X-ray (Demner-Fushman et al., 2016) is a set of chest X-ray images paired with their corresponding diagnostic reports. The dataset contains 7,470 pairs of images and reports. A simplified version of IU X-Ray through CPIR-MR prompting was presented at AAAI 2025: AAAI Article Link. Dataset Link: SMR IU X-Ray.
There has been a rapidly growing interest in Automatic Symptom Detection (ASD) and Automatic Diagnosis (AD) systems in the machine learning research literature, aiming to assist doctors in telemedicine services. These systems are designed to interact with patients, collect evidence about their symptoms and relevant antecedents, and possibly make predictions about the underlying diseases. Doctors would review the interactions, including the evidence and the predictions, collect if necessary additional information from patients, before deciding on next steps. Despite recent progress in this area, an important piece of doctors' interactions with patients is missing in the design of these systems, namely the differential diagnosis. Its absence is largely due to the lack of datasets that include such information for models to train on. In this work, we present a large-scale synthetic dataset of roughly 1.3 million patients that includes a differential diagnosis, along with the ground truth
In this project, we formally present the task of Open-domain Visual Entity recognitioN (OVEN), where a model need to link an image onto a Wikipedia entity with respect to a text query. We construct OVEN-Wiki by re-purposing 14 existing datasets with all labels grounded onto one single label space: Wikipedia entities. OVEN challenges models to select among six million possible Wikipedia entities, making it a general visual recognition benchmark with the largest number of labels.
WanJuan is a large-scale training corpus that includes multiple modalities. The dataset incorporates text, image-text, and video modalities, with a total volume exceeding 2TB.
A benchmark dataset for out-of-distribution detection. ImageNet-1k is in-distribution, while SUN is out-of-distribution.
Multi-modal Large Language Models (MLLMs) are increasingly prominent in the field of artificial intelligence. Although many benchmarks attempt to holistically evaluate MLLMs, they typically concentrate on basic reasoning tasks, often yielding only simple yes/no or multi-choice responses. These methods naturally lead to confusion and difficulties in conclusively determining the reasoning capabilities of MLLMs. To mitigate this issue, we manually curate CORE-MM benchmark dataset, specifically designed for MLLMs with a focus on complex reasoning tasks. Our benchmark comprises three key reasoning categories: deductive, abductive, and analogical reasoning. The queries in our dataset are intentionally constructed to engage the reasoning capabilities of MLLMs in the process of generating answers. For a fair comparison across various MLLMs, we incorporate intermediate reasoning steps into our evaluation criteria. CORE-MM benchmark consists of 279 manually curated reasoning questions, associate
KinFaceW-I dataset contains 533 pairs of facial images of persons with a kin relation. Four different kin relations are considered in the dataset: father and daughter (F-D) with 134 pairs, father and son (F-S) with 156 pairs, mother and daughter (M-D) with 127 pairs, mother and son (M-S) with 116 pairs. Each sample is composed of one parent face image and one child face image.
CValues is a Chinese human values evaluation benchmark designed to assess the alignment of Chinese Large Language Models (LLMs) with human values. Let me provide you with more details:
dataset of 400 image pairs
Amazon Clothing (Amazon Clothing 5-core)
2010 i2b2/VA is a biomedical dataset for relation classification and entity typing.
Within the SemEval-2013 evaluation exercise, the TempEval-3 shared task aims to advance research on temporal information processing. It follows on from TempEval-1 and -2, with: a three-part structure covering temporal expression, event, and temporal relation extraction; a larger dataset; and new single measures to rank systems – in each task and in general.
CASIA-B is a large multiview gait database, which is created in January 2005. There are 124 subjects, and the gait data was captured from 11 views. Three variations, namely view angle, clothing and carrying condition changes, are separately considered. Besides the video files, we still provide human silhouettes extracted from video files. The detailed information about Dataset B and an evaluation framework can be found in this paper .
Yeast dataset consists of a protein-protein interaction network. Interaction detection methods have led to the discovery of thousands of interactions between proteins, and discerning relevance within large-scale data sets is important to present-day biology.
N/A
This dataset includes reviews (ratings, text, helpfulness votes), product metadata (descriptions, category information, price, brand, and image features), and links (also viewed/also bought graphs).
Kennedy Space Center is a dataset for the classification of wetland vegetation at the Kennedy Space Center, Florida using hyperspectral imagery. Hyperspectral data were acquired over KSC on March 23, 1996 using JPL's Airborne Visible/Infrared Imaging Spectrometer.
The dataset collected at the University of Florence during 2012, has been captured using a Kinect camera. It includes 9 activities: wave, drink from a bottle, answer phone,clap, tight lace, sit down, stand up, read watch, bow. During acquisition, 10 subjects were asked to perform the above actions for 2/3 times. This resulted in a total of 215 activity samples.
ICDAR2017 is a dataset for scene text detection.
The Replay-Mobile Database for face spoofing consists of 1190 video clips of photo and video attack attempts to 40 clients, under different lighting conditions. These videos were recorded with current devices from the market -- an iPad Mini2 (running iOS) and a LG-G4 smartphone (running Android). This Database was produced at the Idiap Research Institute (Switzerland) within the framework of collaboration with Galician Research and Development Center in Advanced Telecommunications - Gradiant (Spain).