395 machine learning datasets
This dataset consists of more than 16,000 retinal OCT B-scans from 441 cases (Normal: 120, Drusen: 160, CNV: 161) and was acquired at Noor Eye Hospital, Tehran, Iran. Images are labeled by a retinal specialist.
The provided dataset consists of high-quality realistic head models and combined EEG/MEG data which can be used for state-of-the-art methods in brain research, such as modern finite element methods (FEM) to compute the EEG/MEG forward problems using the software toolbox DUNEuro (http://duneuro.org).
The BCSS dataset contains over 20,000 segmentation annotations of tissue regions from breast cancer images from The Cancer Genome Atlas (TCGA). This large-scale dataset was annotated through the collaborative effort of pathologists, pathology residents, and medical students using the Digital Slide Archive. It enables the generation of highly accurate machine-learning models for tissue segmentation.
This is the supplemental data for our paper on how to benchmark registrations of serial sections with ground truths. There are three main modalities, plus one further modality included as a reference.
Human fibrosarcoma HT1080WT (ATCC) cells at low cell densities embedded in 3D collagen type I matrices [1]. The time-lapse videos were recorded every 2 minutes for 16.7 hours and covered a field of view of 1002 pixels × 1004 pixels with a pixel size of 0.802 μm/pixel. The videos were pre-processed to correct frame-to-frame drift artifacts, resulting in a final size of 983 pixels × 985 pixels.
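As a rough check on the figures above, the following Python sketch converts the stated frame sizes to micrometres and estimates the number of sampling intervals per video. It assumes that the 0.802 μm/pixel size applies to both image axes and that one frame was recorded per 2-minute interval; the function names are illustrative and are not part of the dataset.

```python
PIXEL_SIZE_UM = 0.802  # micrometres per pixel, as stated in the description


def field_of_view_um(width_px, height_px, pixel_size_um=PIXEL_SIZE_UM):
    """Return the (width, height) of a frame in micrometres."""
    return width_px * pixel_size_um, height_px * pixel_size_um


def sampling_intervals(duration_h, interval_min):
    """Return the number of sampling intervals that fit in the recording."""
    return round(duration_h * 60 / interval_min)


# Frame sizes before and after drift correction, taken from the description.
raw_fov = field_of_view_um(1002, 1004)
corrected_fov = field_of_view_um(983, 985)
n_intervals = sampling_intervals(16.7, 2)

print(f"raw field of view:       {raw_fov[0]:.1f} x {raw_fov[1]:.1f} um")
print(f"corrected field of view: {corrected_fov[0]:.1f} x {corrected_fov[1]:.1f} um")
print(f"sampling intervals:      {n_intervals}")
```

Under these assumptions the raw frames span roughly 804 μm × 805 μm, the drift-corrected frames roughly 788 μm × 790 μm, and each video covers about 501 two-minute intervals.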
deepMTJ: Muscle-Tendon Junction Tracking in Ultrasound Images deepMTJ is a machine learning approach for automatic tracking of muscle-tendon junctions (MTJ) in ultrasound images. Our method is based on a convolutional neural network trained to infer MTJ positions across various ultrasound systems from different vendors, collected in independent laboratories from diverse observers, on distinct muscles and movements. We built deepMTJ to support clinical biomechanists and locomotion researchers with an open-source tool for gait analyses.
1. Competition name:
The MedVidCL dataset contains a collection of 6,617 videos annotated into ‘medical instructional’, ‘medical non-instructional’ and ‘non-medical’ classes. A two-step approach is used to construct the MedVidCL dataset. In the first step, the videos annotated by health informatics experts are used to train a machine learning model that assigns a given video to one of the three aforementioned classes. In the second step, only the high-confidence videos are used, and health informatics experts assess the model’s predicted video category and update the category wherever needed.
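The second step of this two-step construction can be sketched in Python as follows. This is a minimal illustration rather than the authors' pipeline: the `Prediction` type, the function names, and the 0.9 confidence threshold are all assumptions, since the description does not state how "high-confidence" was defined.

```python
from typing import NamedTuple

CLASSES = ("medical instructional", "medical non-instructional", "non-medical")


class Prediction(NamedTuple):
    video_id: str
    label: str         # one of CLASSES, as predicted by the step-one model
    confidence: float  # model confidence in [0, 1]


def select_high_confidence(predictions, threshold=0.9):
    """Keep predictions at or above the (hypothetical) confidence threshold."""
    return [p for p in predictions if p.confidence >= threshold]


def apply_expert_review(selected, corrections):
    """Return final labels; an expert correction overrides the model label."""
    return {p.video_id: corrections.get(p.video_id, p.label) for p in selected}


# Toy data for illustration only; these are not real MedVidCL entries.
predictions = [
    Prediction("vid_a", "medical instructional", 0.97),
    Prediction("vid_b", "non-medical", 0.55),
    Prediction("vid_c", "medical non-instructional", 0.93),
]
expert_corrections = {"vid_c": "medical instructional"}

selected = select_high_confidence(predictions)
final_labels = apply_expert_review(selected, expert_corrections)
print(final_labels)
```

In this toy run the low-confidence video is dropped before review, and the expert's correction replaces the model's label for `vid_c`.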
The dataset X of this work is an extension of the heartSeg dataset. Each sample x ∈ X is an RGB image capturing the heart region of Medaka (Oryzias latipes) hatchlings from a constant ventral view. Since the body of Medaka is see-through, noninvasive studies regarding the internal organs and the whole circulatory system are practicable. A Medaka’s heart contains three parts: the atrium, the ventricle, and the bulbus. The atrium receives deoxygenated blood from the circulatory system and delivers it to the ventricle, which forwards it into the bulbus. The bulbus is the heart’s exit chamber and provides the gill arches with a constant blood flow. The blood flow through these three chambers was captured in 63 short recordings (around 11 seconds with 24 frames per second each) in total, from which the single image samples x ∈ X are extracted. The dataset is split into training and test data following the heartSeg dataset with ntrain = 565 samples in the training set Xtrain and ntest = 165
At the Isfahan Fertility and Infertility Center, semen samples were collected from fifteen patients. The sperm samples were fixed and stained using the Diff-Quick method. Using an Olympus CX21 microscope with a ×100 objective lens and a ×10 eyepiece and a Sony color camera (Model No SSC-DC58AP), 725 images were taken. The resolution of each image was 576×720 pixels. From these images, the sperm heads were cropped and classified into five classes by three specialists. The classes are Normal, Pyriform, Tapered, Amorphous, and Others. After the classification, only the samples on whose class the specialists reached a consensus were kept in the dataset. The four classes Normal, Pyriform, Tapered, and Amorphous are included in this dataset. The resulting dataset of sperm heads, called the Human Sperm Head Morphology dataset (HuSHeM), consists of four folders, each corresponding to a specific set of sperm shapes. The folder names reflect the shape of the contained images. There are 54
Dataset of sperm head images with expert-classification labels. The dataset contains 1854 sperm head images obtained from six semen smears and classified by three Chilean referent domain experts according to World Health Organization (WHO) criteria into one of the following classes: normal, tapered, pyriform, small and amorphous. This gold standard is intended for evaluating and comparing not only known techniques, but also future improvements to present approaches for classification of human sperm heads for semen analysis.
The NVALT-8 study (m = 200 participants) examined whether nadroparin combined with chemotherapy could reduce cancer relapse after surgical removal of a non-small cell lung tumour.
The NVALT-11 study considered the effect of prophylactic brain radiation versus observation in patients (m = 174) with advanced non-small cell lung cancer.
VinDr-PCXR is an open, large-scale pediatric chest X-ray dataset for interpretation of common thoracic diseases in children. The dataset contains 9,125 CXR scans retrospectively collected from a major pediatric hospital in Vietnam between 2020 and 2021. Each scan was manually annotated by a pediatric radiologist who has more than ten years of experience. The dataset was labeled for the presence of 36 critical findings and 15 diseases. It aims to aid research in the detection of multiple findings and diseases.
Onchocerciasis is causing blindness in over half a million people in the world today. Drug development for the disease is crippled because there is no way of measuring the effectiveness of a drug without an invasive procedure. Measuring drug efficacy by assessing the viability of Onchocerca worms requires patients to undergo nodulectomy, which is an invasive, expensive, time-consuming, skill-dependent, and infrastructure-dependent process.
Introduction The China Physiological Signal Challenge 2019 (CPSC 2019) aims to encourage the development of algorithms for challenging QRS detection and heart rate (HR) estimation from short-term single-lead ECG recordings usually with low signal quality and/or abnormal rhythm waveforms.
Introduction Abnormality of the cardiac conduction system can induce arrhythmia. Abnormal heart rhythm can lead to other cardiac diseases and complications, and can be life-threatening [1]. There are various types of arrhythmias, and each type is associated with a pattern by which it can be identified. Arrhythmias can be classified into two major categories. The first category consists of arrhythmias formed by a single irregular heartbeat in the electrocardiogram (ECG), herein called morphological arrhythmias, while the other category consists of arrhythmias formed by a set of irregular heartbeats in the ECG, herein called rhythmic arrhythmias [2]. Dynamic electrocardiogram (DCG) monitoring, such as ECG Holter, provides an important way to monitor the incidence of arrhythmias in daily life, allowing doctors to check the total number and distribution of arrhythmias over a long period and thus to provide the required therapy to prevent further problems. The 3rd China Physiological Signal Challenge 2020
Introduction The 4th China Physiological Signal Challenge 2021 (CPSC 2021) aims to encourage the development of algorithms for searching for paroxysmal atrial fibrillation (PAF) events in dynamic ECG recordings.
MODA is a large open-source dataset of high quality, human-scored sleep spindles (5342 spindles, from 180 subjects) that was produced by the Massive Online Data Annotation project. Sleep spindles were detected as a consensus of a number of human-expert scorers. With a median number of 5 experts scoring every EEG segment, MODA offers sleep spindle annotations of a quality unseen in previous datasets.
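One generic way to form a consensus from several scorers is to average their per-sample marks and keep the samples on which enough scorers agree. The Python sketch below illustrates that idea only; the description does not specify MODA's actual consensus rule, so the 0.5 agreement threshold, the function names, and the toy annotations are all assumptions.

```python
def consensus_mask(scorer_masks, threshold=0.5):
    """Mark a sample as 'spindle' when the fraction of scorers who marked it
    is at least the (hypothetical) agreement threshold."""
    if not scorer_masks:
        raise ValueError("at least one scorer is required")
    n_samples = len(scorer_masks[0])
    if any(len(mask) != n_samples for mask in scorer_masks):
        raise ValueError("all scorer masks must have the same length")
    n_scorers = len(scorer_masks)
    return [
        sum(mask[i] for mask in scorer_masks) / n_scorers >= threshold
        for i in range(n_samples)
    ]


def mask_to_events(mask):
    """Convert a boolean mask into (start, end) index pairs, end exclusive."""
    events, start = [], None
    for i, flag in enumerate(mask):
        if flag and start is None:
            start = i
        elif not flag and start is not None:
            events.append((start, i))
            start = None
    if start is not None:
        events.append((start, len(mask)))
    return events


# Toy annotations from five scorers over eight samples (1 = spindle marked).
scorers = [
    [0, 1, 1, 1, 0, 0, 1, 0],
    [0, 1, 1, 1, 0, 0, 0, 0],
    [0, 0, 1, 1, 1, 0, 0, 0],
    [0, 0, 1, 1, 0, 0, 1, 0],
    [0, 1, 1, 0, 0, 0, 0, 0],
]
events = mask_to_events(consensus_mask(scorers))
print(events)
```

With five scorers and a 0.5 threshold, a sample needs at least three marks to count, so isolated marks from one or two scorers are discarded.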
Several datasets are fostering innovation in higher-level functions for everyone, everywhere. By providing this repository, we hope to encourage the research community to focus on hard problems. In this repository, we present the medical imaging DICOM files of patients from our User Tests and Analysis 4 (UTA4) study. Here, we provide a dataset of the medical images used during the UTA4 tasks. This repository and its dataset should be paired with the dataset-uta4-rates repository dataset. Work and results were published at a top Human-Computer Interaction (HCI) conference, AVI 2020. Results were analyzed and interpreted in our Statistical Analysis charts. The user tests were conducted in clinical institutions, where clinicians diagnosed several patients for a Single-Modality vs Multi-Modality comparison. For example, in these tests, we used both the prototype-single-modality and prototype-multi-modality repositories for the comparison. On the same hand, the hereby dataset repres