Papers With Code 2

A community resource for machine learning research: papers, code, benchmarks, and state-of-the-art results.

Data sourced from the PWC Archive (CC-BY-SA 4.0). Built by the community, for the community.

Datasets

395 machine learning datasets

Filter by Modality

  • Images: 3,275
  • Texts: 3,148
  • Videos: 1,019
  • Audio: 486
  • Medical: 395
  • 3D: 383
  • Time series: 298
  • Graphs: 285
  • Tabular: 271
  • Speech: 199
  • RGB-D: 192
  • Environment: 148
  • Point cloud: 135
  • Biomedical: 123
  • LiDAR: 95
  • RGB Video: 87
  • Tracking: 78
  • Biology: 71
  • Actions: 68
  • 3d meshes: 65
  • Tables: 52
  • Music: 48
  • EEG: 45
  • Hyperspectral images: 45
  • Stereo: 44
  • MRI: 39
  • Physics: 32
  • Interactive: 29
  • Dialog: 25
  • Midi: 22
  • 6D: 17
  • Replay data: 11
  • Financial: 10
  • Ranking: 10
  • Cad: 9
  • fMRI: 7
  • Parallel: 6
  • Lyrics: 2
  • PSG: 2

395 dataset results

MediBeng (Synthetic Code-Switched Bengali-English Speech Conversations for Healthcare Applications)

The MediBeng dataset contains synthetic code-switched dialogues in Bengali and English for training models in speech recognition (ASR), text-to-speech (TTS), and machine translation in clinical settings. The dataset is available under the CC-BY-4.0 license.

1 paper, 1 benchmark | Audio, Medical, Speech, Texts

PreRAID (Prescreening Rheumatoid Arthritis Information Database)

PreRAID is a structured dataset designed to evaluate the diagnostic capabilities of Large Language Models (LLMs) in Rheumatoid Arthritis (RA) diagnosis. This dataset provides real-world patient data, offering insights into RA prediction and reasoning accuracy.

1 paper, 0 benchmarks | Medical, Tabular, Texts

AneuX

The AneuX morphology database includes data from three different sources: AneuX, @neurIST and Aneurisk. The AneuX data consists of two portions, AneuX1 and AneuX2, which have been extracted by two different teams of data curators.

1 paper, 0 benchmarks | 3d meshes, Images, Medical

JSRT (negative formats)

We processed 241 pairs of CXR and DES soft tissue images from the JSRT dataset by performing operations like inversion and contrast adjustment to convert these images into negative formats more frequently used in clinical settings.

1 paper, 0 benchmarks | Medical

U2-Bench

U2-BENCH is the first large-scale benchmark for evaluating Large Vision-Language Models (LVLMs) on ultrasound imaging understanding. It provides a diverse, multi-task dataset curated from 40 licensed sources, covering 15 anatomical regions and 8 clinically inspired tasks across classification, detection, regression, and text generation.

1 paper, 0 benchmarks | Images, Medical

NeuB1

NeuB1 is a microscopic neuronal image dataset for retinal vessel segmentation, which contains 112 images of size 512 x 152. The train/test split is 37/75. Image Source: https://web.bii.a-star.edu.sg/~zhaoh/Jaydeep_Tracing/

0 papers, 0 benchmarks | Images, Medical

DR HAGIS (Diabetic Retinopathy, Hypertension, Age-related macular degeneration and Glaucoma ImageS)

The DR HAGIS database has been created to aid the development of vessel extraction algorithms suitable for retinal screening programmes. Researchers are encouraged to test their segmentation algorithms using this database.

0 papers, 0 benchmarks | Images, Medical

VICAVR (VICAVR Database)

The VICAVR database is a set of retinal images used for the computation of the A/V Ratio. The database currently includes 58 images. The images have been acquired with a TopCon non-mydriatic camera NW-100 model and are optic disc centered with a resolution of 768x584. The database includes the caliber of the vessels measured at different radii from the optic disc as well as the vessel type (artery/vein) labelled by three experts.

0 papers, 0 benchmarks | Images, Medical

DIARETDB1

The database consists of 89 colour fundus images, of which 84 contain at least mild non-proliferative signs (microaneurysms) of diabetic retinopathy and 5 are considered normal, showing no signs of diabetic retinopathy according to all experts who participated in the evaluation. Images were captured using the same 50 degree field-of-view digital fundus camera with varying imaging settings. The data correspond to a good (not necessarily typical) practical situation, where the images are comparable, and can be used to evaluate the general performance of diagnostic methods. This data set is referred to as "calibration level 1 fundus images".

0 papers, 0 benchmarks | Images, Medical

EMIDEC (MICCAI 2020 EMIDEC)

The MICCAI 2020 EMIDEC dataset supports two tasks: first, classifying normal and pathological cases from the clinical information with or without DE-MRI; second, automatically detecting the different relevant areas (the myocardial contours, the infarcted area and the permanent microvascular obstruction area (no-reflow area)) from a series of short-axis DE-MRI covering the left ventricle. The segmentation allows quantification of the MI, in absolute value (mm³) or as a percentage of the myocardium.

0 papers, 0 benchmarks | Images, Medical
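
The quantification mentioned in the EMIDEC entry (absolute volume in mm³ and percentage of the myocardium) can be illustrated with a minimal sketch. The label values, the voxel volume, and the function name `quantify_infarct` are hypothetical and chosen only for illustration; the actual label encoding and voxel spacing must be taken from the dataset's own documentation.

```python
from typing import Iterable, Set, Tuple


def quantify_infarct(
    labels: Iterable[int],
    voxel_volume_mm3: float,
    infarct_labels: Set[int],
    myocardium_labels: Set[int],
) -> Tuple[float, float]:
    """Return (infarct volume in mm^3, infarct as % of myocardium).

    `labels` is a flat sequence of per-voxel segmentation labels.
    Which label values count as infarct or myocardium is passed in
    by the caller, because the encoding here is an assumption.
    """
    infarct_voxels = 0
    myocardium_voxels = 0
    for value in labels:
        if value in infarct_labels:
            infarct_voxels += 1
        if value in myocardium_labels:
            myocardium_voxels += 1
    if myocardium_voxels == 0:
        raise ValueError("segmentation contains no myocardium voxels")
    volume = infarct_voxels * voxel_volume_mm3
    percent = 100.0 * infarct_voxels / myocardium_voxels
    return volume, percent


# Hypothetical encoding: 0 background, 1 healthy myocardium,
# 2 infarct, 3 no-reflow (counted here as part of the infarct).
toy_mask = [0, 1, 1, 2, 2, 3, 0, 1]
volume_mm3, percent = quantify_infarct(
    toy_mask,
    voxel_volume_mm3=2.0,          # assumed voxel volume
    infarct_labels={2, 3},
    myocardium_labels={1, 2, 3},   # myocardium includes the infarcted tissue
)
print(volume_mm3, percent)
```

Whether the no-reflow area is counted as part of the infarct is a modelling choice in this sketch, which is why both label sets are left as parameters.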

HeartSeg

The medaka (Oryzias latipes) and the zebrafish (Danio rerio) are used as model organisms for a variety of subjects in biomedical research. The presented work aims to study the potential of automated ventricular dimension estimation through heart segmentation in medaka. For details, see the paper and the supplementary materials.

0 papers, 0 benchmarks | Biology, Biomedical, Images, Medical, Time series, Videos

RSPECT (The RSNA Pulmonary Embolism CT)

The RSNA Pulmonary Embolism CT (RSPECT) Dataset is composed of CT pulmonary angiogram images and annotations related to pulmonary embolism. It's part of the 2020 RSNA Pulmonary Embolism Detection Challenge which invited researchers to develop machine-learning algorithms to detect and characterize instances of pulmonary embolism (PE) on chest CT studies. The competition, conducted in collaboration with the Society of Thoracic Radiology (STR), involved creating the largest publicly available annotated PE dataset, comprised of more than 12,000 CT studies. Imaging data was contributed by five international research centers and labeled with detailed clinical annotations by a group of more than 80 expert thoracic radiologists. For the first time in an RSNA data challenge, the rules required competitors to submit and run their code in a standard shared environment, producing simpler, more readily usable models.

0 papers, 0 benchmarks | Biomedical, Images, Medical

Toronto NeuroFace Dataset

Toronto NeuroFace Dataset: A New Dataset for Facial Motion Analysis in Individuals with Neurological Disorders

0 papers, 0 benchmarks | Images, Medical, RGB-D, Videos

FALLMUD (FAscicle Lower Leg Muscle Ultrasound Dataset)

FAscicle Lower Leg Muscle Ultrasound Dataset is a dataset composed of 812 ultrasound images of lower leg muscles to analyze muscle weaknesses and prevent injuries. It combines the datasets provided by two articles, “Estimating Full Regional Skeletal Muscle Fibre Orientation from B-Mode Ultrasound Images Using Convolutional, Residual, and Deconvolutional Neural Networks” published by Ryan Cunningham et al. and “Automated Analysis of Musculoskeletal Ultrasound Images Using Deep Learning” published by Neil Cronin, with complementary annotations. The dataset has been introduced in this paper: Michard, H., Luvison, B., Pham, Q. C., Morales-Artacho, A. J., & Guilhem, G. (2021, August). AW-Net: automatic muscle structure analysis on B-mode ultrasound images for injury prevention. In Proceedings of the 12th ACM Conference on Bioinformatics, Computational Biology, and Health Informatics (pp. 1-9).

0 papers, 0 benchmarks | Images, Medical

Visual Fields (UWHVF: A real-world, open source dataset of Humphrey Visual Fields (HVF) from the University of Washington)

28,943 Humphrey Visual Field (HVF) tests from 3,871 patients and 7,428 eyes.

0 papers, 0 benchmarks | Images, Medical

CBCT Walnut (Cone-Beam X-Ray CT Data Collection Designed for Machine Learning)

The scans are performed using a custom-built, highly flexible X-ray CT scanner, the FleX-ray scanner, developed by XRE nv and located in the FleX-ray Lab at the Centrum Wiskunde & Informatica (CWI) in Amsterdam, Netherlands. The general purpose of the FleX-ray Lab is to conduct proof of concept experiments directly accessible to researchers in the field of mathematics and computer science. The scanner consists of a cone-beam microfocus X-ray point source that projects polychromatic X-rays onto a 1536-by-1944 pixel, 14-bit flat panel detector (Dexella 1512NDT) and a rotation stage in-between, upon which a sample is mounted. All three components are mounted on translation stages which allow them to move independently from one another.

0 papers, 0 benchmarks | 3D, Medical

PAXRay (PAXRay: A Projected dataset for the segmentation of Anatomical structures in X-Ray data)

A projection of the RibFrac CT dataset onto a 2D plane to imitate X-ray data, for a total of 880 images with multi-label segmentation masks. The dataset contains 92 fine-grained individual labels of anatomical structures, which, when super-classes are included, leads to a total of 166 labels in both lateral and frontal view.

0 papers, 0 benchmarks | Medical

FHRMA dataset for FS detection (FHRMA dataset for fetal heart rate false signal detection)

FHRMA is an open-source project for Fetal Heart Rate Morphological Analysis containing Matlab source code and datasets. As a sub-project, it includes a deep learning method and dataset for automatic identification of the maternal heart rate (MHR) and, more generally, false signals (FSs) on fetal heart rate (FHR) recordings. The challenge concerns particularly the FHR signal recorded with Doppler sensors, on which MHR interference and other FSs are particularly common, but the dataset also includes FHR recorded with scalp-ECG. The training and validation dataset contained 1030 expert-annotated periods (mean duration: 36 min) from 635 recordings. Each time sample is labelled 1 (false signal), 0 (true signal), or -1 (unknown or irrelevant).

0 papers, 0 benchmarks | Biomedical, Medical, Time series
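
The three-valued per-sample labelling described in the FHRMA entry (1, 0, -1) means that any summary statistic has to decide what to do with the -1 samples. The sketch below shows one reasonable handling, excluding them entirely. The function name and the example labels are hypothetical, and the project's own Matlab code may treat them differently.

```python
from typing import Iterable

# Label values as described in the dataset summary.
FALSE_SIGNAL = 1
TRUE_SIGNAL = 0
UNKNOWN = -1


def false_signal_fraction(labels: Iterable[int]) -> float:
    """Fraction of annotated samples marked as false signal.

    Samples labelled -1 (unknown or irrelevant) are dropped from both
    the numerator and the denominator.
    """
    known = [y for y in labels if y != UNKNOWN]
    if not known:
        raise ValueError("no annotated samples (all labels are -1)")
    false_count = sum(1 for y in known if y == FALSE_SIGNAL)
    return false_count / len(known)


# Hypothetical labels for a short period (illustrative, not real data).
example_labels = [0, 0, 1, 1, -1, 0, -1, 1]
print(false_signal_fraction(example_labels))
```

The same masking idea applies when computing a training loss or an evaluation metric: samples labelled -1 would typically contribute nothing to either.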

EyePACS-light (v1) (EyePACS-AIROGS-light-v1)

This is a machine-learning-ready glaucoma dataset using a balanced subset of standardized fundus images from the Rotterdam EyePACS AIROGS train set. This dataset is split into training, validation, and test folders which contain 2500, 270, and 500 fundus images in each class respectively. Each training set has a folder for each class: referable glaucoma (RG) and non-referable glaucoma (NRG).

0 papers, 0 benchmarks | Images, Medical