Papers With Code 2

A community resource for machine learning research: papers, code, benchmarks, and state-of-the-art results.

Data sourced from the PWC Archive (CC-BY-SA 4.0). Built by the community, for the community.

Datasets

395 machine learning datasets

Filter by Modality

  • Images: 3,275
  • Texts: 3,148
  • Videos: 1,019
  • Audio: 486
  • Medical: 395
  • 3D: 383
  • Time series: 298
  • Graphs: 285
  • Tabular: 271
  • Speech: 199
  • RGB-D: 192
  • Environment: 148
  • Point cloud: 135
  • Biomedical: 123
  • LiDAR: 95
  • RGB Video: 87
  • Tracking: 78
  • Biology: 71
  • Actions: 68
  • 3d meshes: 65
  • Tables: 52
  • Music: 48
  • EEG: 45
  • Hyperspectral images: 45
  • Stereo: 44
  • MRI: 39
  • Physics: 32
  • Interactive: 29
  • Dialog: 25
  • Midi: 22
  • 6D: 17
  • Replay data: 11
  • Financial: 10
  • Ranking: 10
  • Cad: 9
  • fMRI: 7
  • Parallel: 6
  • Lyrics: 2
  • PSG: 2

395 dataset results

VGG Cell

The VGG Cell dataset (made up entirely of synthetic images) is the main public benchmark used to compare cell counting techniques.

1 paper | 0 benchmarks | Images, Medical

NIH-LN (NIH-Lymph Node)

NIH-Lymph Node (NIH-LN) contains 388 mediastinal LNs in 90 CT scans and 595 abdominal LNs in 86 scans.

1 paper | 0 benchmarks | Images, Medical

CLOUD (CLOUD Dataset)

The CLOUD dataset is a set of Anterior Segment Optical Coherence Tomography (AS-OCT) images used for the automatic identification and representation of the cornea-contact lens relationship. The dataset includes 112 AS-OCT images captured from 16 different patients. In particular, the images were obtained with an OCT Cirrus 500 scanner (Carl Zeiss Meditec) with an anterior segment module, for users of scleral contact lenses (SCL).

1 paper | 0 benchmarks | Images, Medical

Cervix93 Cytology Dataset

The dataset has 93 image stacks and their corresponding Extended Depth of Field (EDF) images, acquired from cases graded Negative, LSIL, or HSIL under The Bethesda System (Negative: 16, LSIL: 46, HSIL: 31). The ground truth includes the grade labels for each frame and manually marked points inside cervical cells in each frame. In total there are 2,705 manually marked points across all frames (Negative: 238, LSIL: 1,536, HSIL: 931).

1 paper | 0 benchmarks | Images, Medical
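The per-grade counts quoted for Cervix93 can be sanity-checked with a few lines of Python. This is only an illustrative sketch: the dictionaries and the `summarize` helper are hypothetical names introduced here, and the numbers are copied from the description above rather than read from the dataset files.

```python
# Per-grade counts as quoted in the Cervix93 description (not read from the data).
stacks_per_grade = {"Negative": 16, "LSIL": 46, "HSIL": 31}
points_per_grade = {"Negative": 238, "LSIL": 1536, "HSIL": 931}


def summarize(counts):
    """Return the total and each grade's share of that total."""
    total = sum(counts.values())
    shares = {grade: n / total for grade, n in counts.items()}
    return total, shares


stack_total, stack_shares = summarize(stacks_per_grade)
point_total, point_shares = summarize(points_per_grade)

# Average number of marked points per image stack, by grade.
points_per_stack = {
    grade: points_per_grade[grade] / stacks_per_grade[grade]
    for grade in stacks_per_grade
}

print(stack_total, point_total)
for grade in stacks_per_grade:
    print(f"{grade}: {stack_shares[grade]:.1%} of stacks, "
          f"{points_per_stack[grade]:.1f} points per stack")
```

The totals agree with the 93 stacks and 2,705 points stated in the description, and the shares make the class imbalance (few Negative cases) easy to see.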

CPCXR (COVID-19 Posteroanterior Chest X-Ray fused)

The COVID-19 Posteroanterior Chest X-Ray fused (CPCXR) dataset is generated by the fusion of three publicly available datasets: COVID-19 cxr image, Radiological Society of North America (RSNA), and the U.S. National Library of Medicine (USNLM) collected Montgomery County - NLM(MC). The dataset consists of samples labeled as COVID-19, Tuberculosis, Other pneumonia (SARS, MERS, etc.), and Normal. It can be used to train and evaluate deep learning and machine learning models on binary and multi-class classification problems.

1 paper | 0 benchmarks | Images, Medical

DLBCL-Morph

DLBCL-Morph is a dataset containing 42 digitally scanned high-resolution tissue microarray (TMA) slides accompanied by clinical, cytogenetic, and geometric features from 209 DLBCL cases.

1 paper | 0 benchmarks | Medical

Medical Case Report Corpus

Medical Case Report Corpus is a new corpus comprising annotations of medical entities in case reports, originating from PubMed Central's open access library.

1 paper | 0 benchmarks | Medical, Texts

PRECOG (PREdiction of Clinical Outcomes from Genomic Profiles)

The PREdiction of Clinical Outcomes from Genomic profiles (or PRECOG) encompasses 166 cancer expression data sets, including overall survival data for ~18,000 patients diagnosed with 39 distinct malignancies.

1 paper | 0 benchmarks | Medical

SYSU-CEUS

The SYSU-CEUS dataset consists of three types of focal liver lesions (FLLs): 186 HCC instances, 109 HEM instances, and 58 FNH instances (i.e., 186 malignant instances and 167 benign instances). The dataset was collected at the First Affiliated Hospital, Sun Yat-sen University, using an Aplio SSA-770A (Toshiba Medical System). All instances have a resolution of 768×576 and were taken from different patients, with large variations in the appearance and enhancement patterns (e.g., sizes, contrasts, shapes, and locations) of the FLLs.

1 paper | 0 benchmarks | Medical
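As a rough illustration of the class balance in SYSU-CEUS, the sketch below recomputes the malignant/benign split from the quoted counts and derives inverse-frequency class weights. The weighting formula (total / (number of classes × class count)) is a common heuristic assumed here for illustration; it is not something the dataset description prescribes, and all variable names are hypothetical.

```python
# Lesion counts as quoted in the SYSU-CEUS description (not read from the data).
lesion_counts = {"HCC": 186, "HEM": 109, "FNH": 58}

# Per the description, HCC is the malignant class; HEM and FNH are benign.
malignant = lesion_counts["HCC"]
benign = lesion_counts["HEM"] + lesion_counts["FNH"]
total = sum(lesion_counts.values())

# Inverse-frequency ("balanced") weights: rarer classes get larger weights.
# Assumption: this heuristic is chosen for illustration only.
n_classes = len(lesion_counts)
class_weights = {
    name: total / (n_classes * count) for name, count in lesion_counts.items()
}

print(f"malignant={malignant}, benign={benign}, total={total}")
for name, weight in class_weights.items():
    print(f"{name}: weight {weight:.3f}")
```

The benign count matches the 167 stated in the description, and the weights show that FNH, the rarest class, would be up-weighted the most under this heuristic.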

PART-OF

The PART-OF dataset is a dataset of relations extracted from a medical ontology. The different entities in the ontology are parts of the human body. The dataset has 16,894 nodes with 19,436 edges between them.

1 paper | 0 benchmarks | Graphs, Medical, Texts

Cuff-Less Blood Pressure Estimation (Cuff-Less Blood Pressure Estimation. Pre-processed and cleaned vital signals for cuff-less BP estimation.)

Data Set Information: The main goal of this data set is to provide clean and valid signals for designing cuff-less blood pressure estimation algorithms. The raw electrocardiogram (ECG), photoplethysmograph (PPG), and arterial blood pressure (ABP) signals were originally collected from physionet.org and then preprocessed and validated. (For more information about the process, please refer to our paper.)

1 paper | 0 benchmarks | Medical

Synthetic COVID-19 CXR Dataset

A public open dataset of synthetic chest X-ray images of COVID-19.

1 paper | 0 benchmarks | Biomedical, Images, Medical

Multi-template MRI mouse brain atlas (Multi-template MRI mouse brain atlas for both in vivo and ex vivo analysis)

Mouse Brain MRI atlas (both in-vivo and ex-vivo) (repository relocated from the original webpage)

1 paper | 0 benchmarks | 3D, Biomedical, Images, MRI, Medical

CMeIE (Chinese Medical Information Extraction Dataset)

Chinese Medical Information Extraction (CMeIE), a dataset also released in CHIP2020, is used for the CMeIE task. The task aims to identify both entities and relations in a sentence following the schema constraints. There are 53 relations defined in the dataset, including 10 synonymous sub-relationships and 43 other sub-relationships.

1 paper | 1 benchmark | Medical, Texts

Synthetic COVID-19 Chest X-ray

The Synthetic COVID-19 Chest X-ray Dataset consists of 21,295 synthetic COVID-19 chest X-ray images to be used for computer-aided diagnosis. These images, generated via an unsupervised domain adaptation approach, are of high quality.

1 paper | 0 benchmarks | Medical

synthetic_dataset.h5

The synthetic dataset (10,000 pairs of images and regions, 2.95 GB) is shared with the code in HDF5 format.

1 paper | 0 benchmarks | Images, Medical

SinGAN-Seg-polyps

SinGAN-Seg-polyps is a synthetic dataset for polyp segmentation consisting of 10,000 synthetic polyps and masks.

1 paper | 0 benchmarks | Biomedical, Images, Medical

HYPE (PPG and Blood Pressure from a Hypertensive Population)

HYPE Dataset - Version 1.0.0

1 paper | 0 benchmarks | Medical

ImageTBAD

ImageTBAD is a 3D Computed Tomography (CT) image dataset for segmentation of Type-B Aortic Dissection. It contains 100 3D CT images, which is a decent size compared with existing medical imaging datasets.

1 paper | 0 benchmarks | Medical

Novel COVID-19 Chestxray Repository

Authors of the Dataset:

1 paper | 1 benchmark | Images, Medical