395 machine learning datasets
The MediBeng dataset contains synthetic code-switched Bengali–English dialogues for training speech recognition (ASR), text-to-speech (TTS), and machine-translation models in clinical settings. The dataset is available under the CC-BY-4.0 license.
PreRAID is a structured dataset designed to evaluate the diagnostic capabilities of Large Language Models (LLMs) in Rheumatoid Arthritis (RA) diagnosis. This dataset provides real-world patient data, offering insights into RA prediction and reasoning accuracy.
The AneuX morphology database includes data from three different sources: AneuX, @neurIST, and Aneurisk. The AneuX data consists of two portions, AneuX1 and AneuX2, which were extracted by two different teams of data curators.
We processed 241 pairs of CXR and DES soft tissue images from the JSRT dataset by performing operations like inversion and contrast adjustment to convert these images into negative formats more frequently used in clinical settings.
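The inversion and contrast operations described above can be sketched roughly as follows. This is a minimal illustration for 8-bit grayscale images, not the actual JSRT preprocessing pipeline; the function name and contrast scheme are assumptions.

```python
import numpy as np

def to_negative(img: np.ndarray, contrast: float = 1.0) -> np.ndarray:
    """Invert an 8-bit grayscale image and optionally adjust contrast.

    A sketch of the kind of preprocessing described above; the exact
    operations applied to the JSRT images may differ.
    """
    inverted = 255.0 - img.astype(np.float32)        # photographic negative
    mean = inverted.mean()
    adjusted = (inverted - mean) * contrast + mean   # scale contrast about the mean
    return np.clip(adjusted, 0, 255).astype(np.uint8)

# Example on a small synthetic image
img = np.full((4, 4), 200, dtype=np.uint8)
neg = to_negative(img)
```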
U2-BENCH is the first large-scale benchmark for evaluating Large Vision-Language Models (LVLMs) on ultrasound imaging understanding. It provides a diverse, multi-task dataset curated from 40 licensed sources, covering 15 anatomical regions and 8 clinically inspired tasks across classification, detection, regression, and text generation.
NeuB1 is a microscopic neuronal image dataset for retinal vessel segmentation, which contains 112 images of size 512 x 152. The train/test split is 37/75. Image Source: https://web.bii.a-star.edu.sg/~zhaoh/Jaydeep_Tracing/
The DR HAGIS database has been created to aid the development of vessel extraction algorithms suitable for retinal screening programmes. Researchers are encouraged to test their segmentation algorithms using this database.
The VICAVR database is a set of retinal images used for the computation of the A/V Ratio. The database currently includes 58 images. The images have been acquired with a TopCon non-mydriatic camera NW-100 model and are optic disc centered with a resolution of 768x584. The database includes the caliber of the vessels measured at different radii from the optic disc as well as the vessel type (artery/vein) labelled by three experts.
The database consists of 89 colour fundus images, of which 84 contain at least mild non-proliferative signs (microaneurysms) of diabetic retinopathy, while 5 are considered normal and contain no signs of diabetic retinopathy according to all experts who participated in the evaluation. Images were captured with the same 50-degree field-of-view digital fundus camera under varying imaging settings. The data correspond to a good (not necessarily typical) practical situation, where the images are comparable and can be used to evaluate the general performance of diagnostic methods. This data set is referred to as "calibration level 1 fundus images".
The MICCAI 2020 EMIDEC dataset serves two purposes: first, classifying normal and pathological cases from clinical information, with or without DE-MRI; second, automatically detecting the relevant areas (the myocardial contours, the infarcted area, and the permanent microvascular obstruction (no-reflow) area) from a series of short-axis DE-MRI slices covering the left ventricle. The segmentation allows quantification of the myocardial infarction, either in absolute value (mm3) or as a percentage of the myocardium.
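The quantification step above reduces to counting labelled voxels and scaling by voxel volume. A minimal sketch, assuming a hypothetical label convention (1 = healthy myocardium, 2 = infarct, 3 = no-reflow) and a known voxel volume; the official EMIDEC label values may differ.

```python
import numpy as np

def quantify_infarct(mask: np.ndarray, voxel_mm3: float):
    """Return infarct volume (mm^3) and its percentage of the myocardium.

    Assumed labels: 1 = healthy myocardium, 2 = infarct, 3 = no-reflow.
    """
    myo_voxels = np.isin(mask, (1, 2, 3)).sum()   # whole myocardium
    mi_voxels = np.isin(mask, (2, 3)).sum()       # infarct incl. no-reflow
    mi_volume = mi_voxels * voxel_mm3
    mi_percent = 100.0 * mi_voxels / myo_voxels if myo_voxels else 0.0
    return mi_volume, mi_percent

# Toy 2D slice: 5 myocardial voxels, 2 of them infarcted
mask = np.array([[1, 1, 2],
                 [1, 3, 0]])
vol, pct = quantify_infarct(mask, voxel_mm3=2.0)  # e.g. 1 x 1 x 2 mm voxels
```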
The medaka (Oryzias latipes) and the zebrafish (Danio rerio) are used as model organisms for a variety of subjects in biomedical research. The presented work studies the potential of automated ventricular dimension estimation through heart segmentation in medaka; see our paper and the supplementary materials for details.
The RSNA Pulmonary Embolism CT (RSPECT) Dataset is composed of CT pulmonary angiogram images and annotations related to pulmonary embolism. It is part of the 2020 RSNA Pulmonary Embolism Detection Challenge, which invited researchers to develop machine-learning algorithms to detect and characterize instances of pulmonary embolism (PE) on chest CT studies. The competition, conducted in collaboration with the Society of Thoracic Radiology (STR), involved creating the largest publicly available annotated PE dataset, comprising more than 12,000 CT studies. Imaging data were contributed by five international research centers and labeled with detailed clinical annotations by a group of more than 80 expert thoracic radiologists. For the first time in an RSNA data challenge, the rules required competitors to submit and run their code in a standard shared environment, producing simpler, more readily usable models.
Toronto NeuroFace Dataset: A New Dataset for Facial Motion Analysis in Individuals with Neurological Disorders
FAscicle Lower Leg Muscle Ultrasound Dataset is a dataset composed of 812 ultrasound images of lower leg muscles to analyze muscle weaknesses and prevent injuries. It combines the datasets provided by two articles, “Estimating Full Regional Skeletal Muscle Fibre Orientation from B-Mode Ultrasound Images Using Convolutional, Residual, and Deconvolutional Neural Networks” published by Ryan Cunningham et al. and “Automated Analysis of Musculoskeletal Ultrasound Images Using Deep Learning” published by Neil Cronin, with complementary annotations. The dataset has been introduced in this paper: Michard, H., Luvison, B., Pham, Q. C., Morales-Artacho, A. J., & Guilhem, G. (2021, August). AW-Net: automatic muscle structure analysis on B-mode ultrasound images for injury prevention. In Proceedings of the 12th ACM Conference on Bioinformatics, Computational Biology, and Health Informatics (pp. 1-9).
28,943 Humphrey Visual Field (HVF) tests from 3,871 patients and 7,428 eyes.
The scans are performed using a custom-built, highly flexible X-ray CT scanner, the FleX-ray scanner, developed by XRE nv and located in the FleX-ray Lab at the Centrum Wiskunde & Informatica (CWI) in Amsterdam, Netherlands. The general purpose of the FleX-ray Lab is to conduct proof-of-concept experiments directly accessible to researchers in the fields of mathematics and computer science. The scanner consists of a cone-beam microfocus X-ray point source that projects polychromatic X-rays onto a 1536-by-1944 pixel, 14-bit flat panel detector (Dexella 1512NDT), with a rotation stage in between, upon which a sample is mounted. All three components are mounted on translation stages which allow them to move independently of one another.
Projection of the RibFrac CT dataset onto a 2D plane to imitate X-ray data, for a total of 880 images with multi-label segmentation masks. The dataset contains 92 fine-grained individual labels of anatomical structures which, including super-classes, lead to a total of 166 labels in both lateral and frontal views.
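A simple way to collapse a CT volume and its label volume to a pseudo-X-ray is an intensity projection along one axis. The sketch below assumes a (z, y, x) volume layout and uses a plain mean projection; the paper's actual projection method may be more elaborate.

```python
import numpy as np

def project_drr(volume: np.ndarray, axis: int = 1) -> np.ndarray:
    """Collapse a CT volume to 2D by averaging intensities along one axis.

    With an assumed (z, y, x) layout, axis=1 approximates a frontal view
    and axis=2 a lateral view.
    """
    return volume.mean(axis=axis)

def project_mask(mask: np.ndarray, axis: int = 1) -> np.ndarray:
    """A structure is present in the 2D mask if it occurs anywhere along the ray."""
    return (mask > 0).any(axis=axis)

vol = np.random.rand(8, 16, 16).astype(np.float32)
frontal = project_drr(vol)           # shape (8, 16)
lateral = project_drr(vol, axis=2)   # shape (8, 16)
```

For multi-label masks, one projection per label (or per super-class) yields the stacked 2D segmentation targets.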
FHRMA is an open-source project for Fetal Heart Rate Morphological Analysis containing Matlab source code and datasets. As a sub-project, it includes a deep learning method and dataset for automatic identification of the maternal heart rate (MHR) and, more generally, false signals (FSs) on fetal heart rate (FHR) recordings. The challenge particularly concerns the FHR signal recorded with Doppler sensors, on which MHR interference and other FSs are especially common, but the dataset also includes FHR recorded with scalp ECG. The training and validation dataset contains 1030 expert-annotated periods (mean duration: 36 min) from 635 recordings. Each time sample is labelled 1 (false signal), 0 (true signal), or -1 (unknown or irrelevant).
This is a machine-learning-ready glaucoma dataset using a balanced subset of standardized fundus images from the Rotterdam EyePACS AIROGS train set. The dataset is split into training, validation, and test folders containing 2,500, 270, and 500 fundus images per class, respectively. Each split has a folder for each class: referable glaucoma (RG) and non-referable glaucoma (NRG).