Datasets

395 machine learning datasets

Colorectal Adenoma

Colorectal Adenoma contains 177 whole slide images (156 contain adenoma) gathered and labelled by pathologists from the Department of Pathology, The Chinese PLA General Hospital.

2 papers · 0 benchmarks · Images, Medical

RSDD-Time

RSDD-Time is a dataset of 598 manually annotated self-reported depression diagnosis posts from Reddit that include temporal information about the diagnosis. Annotations include whether a mental health condition is present and how recently the diagnosis happened. Additionally, the dataset includes exact temporal spans that relate to the date of diagnosis.

2 papers · 0 benchmarks · Medical, Time series

SemClinBr (A multi‑institutional and multi‑specialty semantically annotated corpus for Portuguese clinical NLP tasks)

Background: The high volume of research focusing on extracting patient information from electronic health records (EHRs) has led to an increase in the demand for annotated corpora, which are a precious resource for both the development and evaluation of natural language processing (NLP) algorithms. The absence of a multipurpose clinical corpus outside the scope of the English language, especially in Brazilian Portuguese, is glaring and severely impacts scientific progress in the biomedical NLP field. Methods: In this study, a semantically annotated corpus was developed using clinical text from multiple medical specialties, document types, and institutions. In addition, we present (1) a survey listing common aspects, differences, and lessons learned from previous research, (2) a fine-grained annotation schema that can be replicated to guide other annotation initiatives, (3) a web-based annotation tool focusing on an annotation suggestion feature, and (4) both intrinsic and extrinsic ev

2 papers · 1 benchmark · Medical, Texts

Ward2ICU

Ward2ICU is a vital signs dataset of inpatients from the general ward. It contains vital signs with class labels indicating patient transitions from the ward to intensive care units.

2 papers · 0 benchmarks · Medical, Time series

MHSMA (The Modified Human Sperm Morphology Analysis)

The MHSMA dataset is a collection of human sperm images from 235 patients with male factor infertility. Each image is labeled by experts for normal or abnormal sperm acrosome, head, vacuole, and tail.

2 papers · 0 benchmarks · Images, Medical

Endotect Polyp Segmentation Challenge Dataset

A challenge that consists of three tasks, each targeting a different requirement for in-clinic use. The first task involves classifying images from the GI tract into 23 distinct classes. The second task focuses on efficient classification, measured by the amount of time spent processing each image. The last task relates to automatically segmenting polyps.

2 papers · 3 benchmarks · Biomedical, Images, Medical

DeepFluoroLabeling-IPCAI2020

This collection contains data and code associated with the IPCAI/IJCARS 2020 paper “Automatic Annotation of Hip Anatomy in Fluoroscopy for Robust and Efficient 2D/3D Registration.” The data hosted here consists of annotated datasets of actual hip fluoroscopy, CT and derived data from six lower torso cadaveric specimens. Documentation and examples for using the dataset and Python code for training and testing the proposed models are also included. Higher-level information, including clinical motivations, prior works, algorithmic details, applications to 2D/3D registration, and experimental details, may be found in the companion paper which is available at https://arxiv.org/abs/1911.07042 or https://doi.org/10.1007/s11548-020-02162-7. We hope that this code and data will be useful in the development of new computer-assisted capabilities that leverage fluoroscopy.

2 papers · 0 benchmarks · 3D, Biomedical, Images, Medical

PhysioNet Challenge 2016

The 2016 PhysioNet/CinC Challenge aims to encourage the development of algorithms to classify heart sound recordings collected from a variety of clinical or nonclinical (such as in-home visits) environments. The aim is to identify, from a single short recording (10-60 s) from a single precordial location, whether the subject of the recording should be referred for an expert diagnosis.

2 papers · 0 benchmarks · Audio, Medical

THYME-2016

2 papers · 1 benchmark · Medical, Texts

Kvasir-Capsule

The Kvasir-Capsule dataset is the largest publicly released video capsule endoscopy (VCE) dataset. In total, the dataset contains 47,238 labeled images and 117 videos, capturing anatomical landmarks as well as pathological and normal findings. The result is more than 4,741,621 images and video frames altogether.

2 papers · 0 benchmarks · Biomedical, Images, Medical

EPISURG (EPISURG: a dataset of postoperative MRI for quantitative analysis of resection neurosurgery for refractory epilepsy)

EPISURG is a clinical dataset of $T_1$-weighted magnetic resonance images (MRI) from 430 epileptic patients who underwent resective brain surgery at the National Hospital of Neurology and Neurosurgery (Queen Square, London, United Kingdom) between 1990 and 2018.

2 papers · 0 benchmarks · 3D, Images, MRI, Medical

Tc1 Mouse cerebellum atlas (Tc1 Mouse cerebellum atlas with Purkinje layer segmentation)

This mouse cerebellar atlas can be used for mouse cerebellar morphometry.

2 papers · 0 benchmarks · 3D, Biomedical, Images, MRI, Medical

DisKnE (Disease Knowledge Evaluation)

DisKnE is a benchmark for Disease Knowledge Evaluation built from MedNLI and MEDIQA-NLI. This benchmark is constructed to specifically test the medical reasoning capabilities of ML models, such as mapping symptoms to diseases.

2 papers · 0 benchmarks · Medical, Texts

EchoCP

EchoCP is an echocardiography dataset in contrast transthoracic echocardiography (cTTE) targeting patent foramen ovale (PFO) diagnosis. EchoCP consists of 30 patients with both rest and Valsalva maneuver videos, which cover various PFO grades.

2 papers · 0 benchmarks · Medical

CENTER-TBI (Collaborative European NeuroTrauma Effectiveness Research in TBI)

The CENTER-TBI database contains prospectively collected data from more than 4,500 patients with traumatic brain injury (TBI) in Europe. The Registry and Acute Care data were collected over a three-year period (2015-2017) in 65 centers in Europe. For all patients, outcome data were collected up to 2 years after injury.

2 papers · 0 benchmarks · Medical

Chest x-ray landmark dataset

A set of landmark annotations for the JSRT, Montgomery, and Shenzhen datasets and a subset of the PadChest dataset.

2 papers · 0 benchmarks · Medical

RETOUCH (RETOUCH - The Retinal OCT Fluid Detection and Segmentation Benchmark and Challenge)

The goal of the challenge is to compare automated algorithms that are able to detect and segment various types of fluids on a common dataset of optical coherence tomography (OCT) volumes representing different retinal diseases, acquired with devices from different manufacturers. We made available a dataset of OCT volumes containing a wide variety of retinal fluid lesions with accompanying reference annotations. We invite the medical imaging community to participate by developing and testing existing and novel automated retinal OCT segmentation methods.

2 papers · 0 benchmarks · 3D, Medical

BreastClassifications4 ([MIMBCD-UI] UTA4: Severity & Pathology Classifications Dataset)

Several datasets are fostering innovation in higher-level functions for everyone, everywhere. By providing this repository, we hope to encourage the research community to focus on hard problems. In this repository, we present the real severity (BIRADS) and pathology (post-report) classification results provided by the Radiologist Director from the Radiology Department of Hospital Fernando Fonseca while diagnosing several patients (see dataset-uta4-dicom) from our User Tests and Analysis 4 (UTA4) study. Here, we provide a dataset for the measurements of both severity (BIRADS) and pathology classifications concerning the patient diagnosis. The work and results were published at AVI 2020, a top Human-Computer Interaction (HCI) conference. Results were analyzed and interpreted from our Statistical Analysis charts. The user tests were conducted in clinical institutions, where clinicians diagnosed several patients for a Single-Modality vs Multi-Modality comparison. For example, in these t

2 papers · 0 benchmarks · Biomedical, Images, Medical, Tabular

OADAT (OADAT: Experimental and Synthetic Clinical Optoacoustic Data for Standardized Image Processing)

Experimental and synthetic (simulated) optoacoustic (OA) raw-signal and reconstructed image-domain datasets, rendered with different experimental parameters and tomographic acquisition geometries.

2 papers · 0 benchmarks · Images, Medical

Breast Lesion Detection in Ultrasound Videos (CVA-Net)

The breast lesion detection in ultrasound videos dataset, released with the clip-level and video-level feature aggregation network (CVA-Net), consists of 188 ultrasound videos, of which 113 are labeled malignant and 75 benign. Together the videos contain 25,272 ultrasound images, with the number of images per video ranging from 28 to 413. Of the 188 videos, 150 were used for training and 38 for testing. The primary intended use is computer-aided breast cancer diagnosis, supporting systems that assist radiologists.

2 papers · 0 benchmarks · Images, Medical, Videos
