Papers With Code 2

A community resource for machine learning research: papers, code, benchmarks, and state-of-the-art results.

Data sourced from the PWC Archive (CC-BY-SA 4.0). Built by the community, for the community.

Datasets

395 machine learning datasets

Labeled Retinal Optical Coherence Tomography Dataset for Classification of Normal, Drusen, and CNV Cases

This dataset consists of more than 16,000 retinal OCT B-scans from 441 cases (Normal: 120, Drusen: 160, CNV: 161) acquired at Noor Eye Hospital, Tehran, Iran. The images were labeled by a retinal specialist.

1 paper | 0 benchmarks | Images, Medical

WWU DUNEuro reference data set (The WWU DUNEuro reference data set for combined EEG/MEG source analysis)

The provided dataset consists of high-quality realistic head models and combined EEG/MEG data which can be used for state-of-the-art methods in brain research, such as modern finite element methods (FEM) to compute the EEG/MEG forward problems using the software toolbox DUNEuro (http://duneuro.org).

1 paper | 0 benchmarks | 3d meshes, EEG, Medical

BCSS (Breast Cancer Semantic Segmentation)

The BCSS dataset contains over 20,000 segmentation annotations of tissue regions from breast cancer images from The Cancer Genome Atlas (TCGA). This large-scale dataset was annotated through the collaborative effort of pathologists, pathology residents, and medical students using the Digital Slide Archive. It enables the generation of highly accurate machine-learning models for tissue segmentation.

1 paper | 0 benchmarks | Biomedical, Images, Medical

Evaluating registrations of serial sections with distortions of the ground truths. Supplemental data

This is the supplemental data for our paper on how to benchmark registrations of serial sections with ground truths. There are three main modalities, plus one further modality as a reference.

1 paper | 0 benchmarks | 3D, Biomedical, Medical

HT1080WT cells - 3D collagen type I matrices (HT1080WT cells embedded in 3D collagen type I matrices - manual annotations for cell instance segmentation and tracking)

Human fibrosarcoma HT1080WT (ATCC) cells at low cell densities embedded in 3D collagen type I matrices [1]. The time-lapse videos were recorded every 2 minutes for 16.7 hours and covered a field of view of 1002 pixels × 1004 pixels with a pixel size of 0.802 μm/pixel. The videos were pre-processed to correct frame-to-frame drift artifacts, resulting in a final size of 983 pixels × 985 pixels.

1 paper | 0 benchmarks | Medical, Videos

deepMTJ (Muscle-Tendon Junction Tracking in Ultrasound Images)

deepMTJ is a machine learning approach for automatic tracking of muscle-tendon junctions (MTJ) in ultrasound images. Our method is based on a convolutional neural network trained to infer MTJ positions across various ultrasound systems from different vendors, collected in independent laboratories from diverse observers, on distinct muscles and movements. We built deepMTJ to support clinical biomechanists and locomotion researchers with an open-source tool for gait analyses.

1 paper | 1 benchmark | Images, Medical

MRSpineSeg Challenge

1、 Competition name:

1 paper | 0 benchmarks | 3D, Images, Medical

MedVidCL (Medical Video Classification)

The MedVidCL dataset contains a collection of 6,617 videos annotated into ‘medical instructional’, ‘medical non-instructional’ and ‘non-medical’ classes. A two-step approach is used to construct the MedVidCL dataset. In the first step, the videos annotated by health informatics experts are used to train a machine learning model that assigns a given video to one of the three aforementioned classes. In the second step, only the high-confidence videos are used, and health informatics experts assess the model’s predicted video category and update the category wherever needed.

1 paper | 0 benchmarks | Medical, Texts, Videos

Extended heartSeg

The dataset X of this work is an extension of the heartSeg dataset. Each sample x ∈ X is an RGB image capturing the heart region of Medaka (Oryzias latipes) hatchlings from a constant ventral view. Since the body of Medaka is see-through, noninvasive studies regarding the internal organs and the whole circulatory system are practicable. A Medaka’s heart contains three parts: the atrium, the ventricle, and the bulbus. The atrium receives deoxygenated blood from the circulatory system and delivers it to the ventricle, which forwards it into the bulbus. The bulbus is the heart’s exit chamber and provides the gill arches with a constant blood flow. The blood flow through these three chambers was captured in 63 short recordings (around 11 seconds with 24 frames per second each) in total, from which the single image samples x ∈ X are extracted. The dataset is split into training and test data following the heartSeg dataset with n_train = 565 samples in the training set X_train and n_test = 165

1 paper | 1 benchmark | Biology, Biomedical, Medical, Videos

HuSHeM (Human Sperm Head Morphology Dataset)

At the Isfahan Fertility and Infertility Center, semen samples were collected from fifteen patients. The sperm samples were fixed and stained using the Diff-Quick method. Using an Olympus CX21 microscope with a ×100 objective lens and a ×10 eyepiece and a Sony color camera (Model No SSC-DC58AP), 725 images were taken. The resolution of each image was 576×720 pixels. From these images, the sperm heads were cropped and classified into five classes by three specialists. The classes are Normal, Pyriform, Tapered, Amorphous, and Others. After the classification, only the samples on whose class the specialists reached a collective consensus were kept in the dataset. Four classes, Normal, Pyriform, Tapered, and Amorphous, are included in this dataset. The resulting dataset of sperm heads, denoted as the Human Sperm Head Morphology dataset (HuSHeM), consists of four folders, each corresponding to a specific set of sperm shapes. The folder names reflect the shape of the contained images. There are 54

1 paper | 0 benchmarks | Images, Medical

SCIAN (SCIAN Gold-standard for Morphological Sperm Analysis)

Dataset of sperm head images with expert-classification labels. The dataset contains 1854 sperm head images obtained from six semen smears and classified by three Chilean referent domain experts according to World Health Organization (WHO) criteria into one of the following classes: normal, tapered, pyriform, small and amorphous. This gold standard is intended for evaluating and comparing not only known techniques, but also future improvements to present approaches for classification of human sperm heads for semen analysis.

1 paper | 0 benchmarks | Images, Medical

NVALT-8

The NVALT-8 study (m = 200 participants) examined whether nadroparin combined with chemotherapy could reduce cancer relapse after surgical removal of a non-small cell lung tumour.

1 paper | 0 benchmarks | Medical, Tabular

NVALT-11

The NVALT-11 study considered the effect of prophylactic brain radiation versus observation in patients (m = 174) with advanced non-small cell lung cancer.

1 paper | 0 benchmarks | Medical, Tabular

VinDr-PCXR

VinDr-PCXR is an open, large-scale pediatric chest X-ray dataset for interpretation of common thoracic diseases in children. The dataset contains 9,125 CXR scans retrospectively collected from a major pediatric hospital in Vietnam between 2020 and 2021. Each scan was manually annotated by a pediatric radiologist who has more than ten years of experience. The dataset was labeled for the presence of 36 critical findings and 15 diseases. It aims to aid research in the detection of multiple findings and diseases.

1 paper | 0 benchmarks | Images, Medical

Niramai Oncho Dataset (Niramai Onchocerciasis/RiverBlindness Dataset)

Onchocerciasis is causing blindness in over half a million people in the world today. Drug development for the disease is crippled as there is no way of measuring the effectiveness of a drug without an invasive procedure. Measuring drug efficacy by assessing the viability of onchocerca worms requires patients to undergo nodulectomy, which is an invasive, expensive, time-consuming, skill-dependent, infrastructure-dependent and lengthy process.

1 paper | 0 benchmarks | Images, Medical, Videos

CPSC2019 (The 2nd China Physiological Signal Challenge (CPSC 2019))

The China Physiological Signal Challenge 2019 (CPSC 2019) aims to encourage the development of algorithms for challenging QRS detection and heart rate (HR) estimation from short-term single-lead ECG recordings, usually with low signal quality and/or abnormal rhythm waveforms.

1 paper | 0 benchmarks | Medical

CPSC2020 (The 3rd China Physiological Signal Challenge 2020)

Abnormalities of the cardiac conduction system can induce arrhythmia. Abnormal heart rhythm can lead to other cardiac diseases and complications, and can be life-threatening [1]. There are various types of arrhythmias, and each type is associated with a pattern by which it can be identified. Arrhythmias can be classified into two major categories. The first category consists of arrhythmias formed by a single irregular heartbeat in the electrocardiogram (ECG), herein called morphological arrhythmias, while the other category consists of arrhythmias formed by a set of irregular heartbeats in the ECG, herein called rhythmic arrhythmias [2]. Dynamic electrocardiogram (DCG), like ECG Holter, provides an important way to monitor the incidence of arrhythmias in daily life, helping doctors check the total number and distribution of arrhythmias over a long period and thus provide the required therapy to prevent further problems. The 3rd China Physiological Signal Challenge 2020

1 paper | 0 benchmarks | Medical

CPSC2021 (The 4th China Physiological Signal Challenge 2021)

The 4th China Physiological Signal Challenge 2021 (CPSC 2021) aims to encourage the development of algorithms for searching for paroxysmal atrial fibrillation (PAF) events in dynamic ECG recordings.

1 paper | 0 benchmarks | Medical

MODA dataset (Massive Online Data Annotation Spindle Dataset)

MODA is a large open-source dataset of high-quality, human-scored sleep spindles (5342 spindles from 180 subjects) that was produced by the Massive Online Data Annotation project. Sleep spindles were detected as a consensus of a number of human-expert scorers. With a median of 5 experts scoring every EEG segment, MODA offers sleep spindle annotations of a quality unseen in previous datasets.

1 paper | 3 benchmarks | EEG, Medical

BreastDICOM4 ([MIMBCD-UI] UTA4: Medical Imaging DICOM Files Dataset)

Several datasets are fostering innovation in higher-level functions for everyone, everywhere. By providing this repository, we hope to encourage the research community to focus on hard problems. In this repository, we present our medical imaging DICOM files of patients from our User Tests and Analysis 4 (UTA4) study. Here, we provide a dataset of the medical images used during the UTA4 tasks. This repository and its dataset should be paired with the dataset-uta4-rates repository dataset. Work and results were published at a top Human-Computer Interaction (HCI) conference, AVI 2020. Results were analyzed and interpreted in our Statistical Analysis charts. The user tests were conducted in clinical institutions, where clinicians diagnosed several patients for a Single-Modality vs Multi-Modality comparison. For example, in these tests, we used both the prototype-single-modality and prototype-multi-modality repositories for the comparison. On the same hand, the hereby dataset repres

1 paper | 2 benchmarks | Biomedical, MRI, Medical