Papers With Code 2

A community resource for machine learning research: papers, code, benchmarks, and state-of-the-art results.


Data sourced from the PWC Archive (CC-BY-SA 4.0). Built by the community, for the community.

Datasets

19,997 machine learning datasets

Filter by Modality

  • Images (3,275)
  • Texts (3,148)
  • Videos (1,019)
  • Audio (486)
  • Medical (395)
  • 3D (383)
  • Time series (298)
  • Graphs (285)
  • Tabular (271)
  • Speech (199)
  • RGB-D (192)
  • Environment (148)
  • Point cloud (135)
  • Biomedical (123)
  • LiDAR (95)
  • RGB Video (87)
  • Tracking (78)
  • Biology (71)
  • Actions (68)
  • 3D meshes (65)
  • Tables (52)
  • Music (48)
  • EEG (45)
  • Hyperspectral images (45)
  • Stereo (44)
  • MRI (39)
  • Physics (32)
  • Interactive (29)
  • Dialog (25)
  • MIDI (22)
  • 6D (17)
  • Replay data (11)
  • Financial (10)
  • Ranking (10)
  • CAD (9)
  • fMRI (7)
  • Parallel (6)
  • Lyrics (2)
  • PSG (2)

19,997 dataset results

DivEMT (Post-Editing Effort Across Typologically-diverse Languages)

DivEMT is the first publicly available post-editing study of Neural Machine Translation (NMT) over a typologically diverse set of target languages. Using a strictly controlled setup, 18 professional translators were instructed to translate or post-edit the same set of English documents into Arabic, Dutch, Italian, Turkish, Ukrainian, and Vietnamese. During the process, their edits, keystrokes, editing times, and pauses were recorded, enabling an in-depth, cross-lingual evaluation of NMT quality and post-editing effectiveness. Using this new dataset, we assess the impact of two state-of-the-art NMT systems, Google Translate and the multilingual mBART-50 model, on translation productivity.

3 papers · 0 benchmarks · Texts, Tracking

WorldView-2 PairMax

This dataset consists of two images acquired by the WorldView-2 satellite, depicting Miami.

3 papers · 4 benchmarks

GeoEye-1 PairMax

This dataset consists of two images acquired by the GeoEye-1 satellite, depicting London and Trenton, respectively.

3 papers · 4 benchmarks

VA (Virtual Apartment)

A synthetic benchmark dataset for depth estimation, rendered from a high-quality CAD indoor environment.

3 papers · 8 benchmarks · RGB-D, Stereo

PubChemQA

PubChemQA consists of molecules and their corresponding textual descriptions from PubChem. It contains a single type of question, i.e., please describe the molecule. We remove molecules that cannot be processed by RDKit [Landrum et al., 2021] to generate 2D molecular graphs. We also remove texts with fewer than 4 words, and crop descriptions with more than 256 words. Finally, we obtain 325,754 unique molecules and 365,129 molecule-text pairs. On average, each text description contains 17 words.

3 papers · 6 benchmarks
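The text-side filtering described above (drop descriptions with fewer than 4 words, crop those longer than 256 words) can be sketched in a few lines. This is a minimal illustration of the stated thresholds, not the authors' actual code; the function name `filter_and_crop` is hypothetical:

```python
# Hypothetical sketch of PubChemQA's text filtering: descriptions with
# fewer than `min_words` words are dropped, longer ones are cropped.
def filter_and_crop(descriptions, min_words=4, max_words=256):
    kept = []
    for text in descriptions:
        words = text.split()
        if len(words) < min_words:
            continue  # too short: discard entirely
        kept.append(" ".join(words[:max_words]))  # crop to max_words
    return kept

texts = ["too short", "a valid molecule description with enough words"]
print(filter_and_crop(texts))  # the two-word text is dropped
```

The molecule-side check (discarding anything RDKit cannot parse into a 2D graph) would sit alongside this, but is omitted here since it depends on the RDKit dependency.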

WebVid-CoVR

The WebVid-CoVR dataset is a collection of video-text-video triplets that can be used for the task of composed video retrieval (CoVR). CoVR is a task that involves searching for videos that match both a query image and a query text. The text typically specifies the desired modification to the query image.

3 papers · 2 benchmarks · Images, Texts, Videos

RemFX (RemFX Evaluation Datasets)

Audio samples processed with sound effects, used to evaluate effect-removal models. The applied effects are drawn from the set (Distortion, Delay, Dynamic Range Compressor, Phaser, Reverb) and randomly sampled without replacement for each example; the targets are the original, unprocessed audio.

3 papers · 0 benchmarks · Audio
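Sampling effects "without replacement for each example" means no effect repeats within a single example's chain. A minimal sketch of that sampling scheme, assuming the effect set listed above (the function name `sample_effect_chain` is hypothetical):

```python
import random

# Effect set as listed in the dataset description.
EFFECTS = ["Distortion", "Delay", "Dynamic Range Compressor",
           "Phaser", "Reverb"]

def sample_effect_chain(n_effects, rng=random):
    """Draw `n_effects` distinct effects for one example.

    random.sample draws without replacement, so a chain never
    applies the same effect twice to one example.
    """
    return rng.sample(EFFECTS, n_effects)

chain = sample_effect_chain(3)
assert len(set(chain)) == 3  # all three effects are distinct
```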

Sound-Dr (Sound-Dr: Reliable Sound Dataset and Baseline Artificial Intelligence System for Respiratory Illnesses)

As the burden of respiratory diseases continues to fall on society worldwide, this paper proposes a high-quality and reliable dataset of human sounds for studying respiratory illnesses, including pneumonia and COVID-19. It consists of coughing, mouth breathing, and nose breathing sounds together with metadata on related clinical characteristics. We also develop a proof-of-concept system for establishing baselines and benchmarking against multiple datasets, such as Coswara and COUGHVID. Our comprehensive experiments show that the Sound-Dr dataset has richer features, better performance, and is more robust to dataset shifts in various machine learning tasks. It is promising for a wide range of real-time applications on mobile devices. The proposed dataset and system will serve as practical tools to support healthcare professionals in diagnosing respiratory disorders. The dataset and code are publicly available here: https://github.com/ReML-AI/Sound-Dr/.

3 papers · 0 benchmarks

maze-dataset

This package provides utilities for generating, filtering, solving, visualizing, and processing mazes for training ML systems. It was primarily built for the maze-transformer interpretability project. You can find our paper on it here: http://arxiv.org/abs/2309.10498

3 papers · 0 benchmarks · Actions, Environment, Graphs, Images

EgoPAT3D-DT


3 papers · 0 benchmarks · 3D, Videos

Br35H :: Brain Tumor Detection 2020

Abstract: A brain tumor is considered one of the most aggressive diseases among children and adults. Brain tumors account for 85 to 90 percent of all primary Central Nervous System (CNS) tumors. Every year, around 11,700 people are diagnosed with a brain tumor. The 5-year survival rate for people with a cancerous brain or CNS tumor is approximately 34 percent for men and 36 percent for women. Brain tumors are classified as benign, malignant, pituitary, etc. Proper treatment planning and accurate diagnostics should be implemented to improve the life expectancy of patients. The best technique to detect brain tumors is Magnetic Resonance Imaging (MRI). A huge amount of image data is generated through the scans, and these images are examined by the radiologist. Manual examination can be error-prone due to the complexities involved in brain tumors and their properties. Application of automated classification techniques using Machine Learning (ML) and Artificial Intelligence…

3 papers · 0 benchmarks · Images, Medical

GlotScript (GlotScript Resource)

GlotScript-R is a resource that provides the attested writing systems for more than 7,000 languages.

3 papers · 0 benchmarks · Texts

SEER


3 papers · 0 benchmarks

LLeQA (Long-form Legal Question Answering)

LLeQA is a native French dataset for studying information retrieval and long-form question answering in the legal domain. It consists of a knowledge corpus of 27,941 statutory articles collected from Belgian legislation, and 1,868 legal questions posed by Belgian citizens and labeled by experienced jurists with a comprehensive answer rooted in relevant articles from the corpus.

3 papers · 0 benchmarks · Texts

LSA16 (Lengua de Señas Argentina - 16 Handshapes classes)

This database contains images of 16 handshapes of the Argentinian Sign Language (LSA), each performed 5 times by 10 different subjects, for a total of 800 images. The subjects wore colored gloves and dark clothes.

3 papers · 2 benchmarks · Images

RWTH-PHOENIX Handshapes dev set (RWTH-PHOENIX-Weather 2014 MS Handshapes dev set)

We manually labelled 3359 images from the RWTH-PHOENIX-Weather 2014 Development set.

3 papers · 2 benchmarks · Images

PDFVQA

PDFVQA: A New Dataset for Real-World VQA on PDF Documents

3 papers · 0 benchmarks · Images, Texts

BLEFF (Blender Forward Facing Dataset)

A synthetic (Blender-rendered) dataset of forward-facing scenes.

3 papers · 1 benchmark

Multi-Atlas Labeling Beyond the Cranial Vault


3 papers · 0 benchmarks · Images

GroOT (Grounded Multiple Object Tracking)

One of the recent trends in vision problems is to use natural language captions to describe the objects of interest. This approach can overcome some limitations of traditional methods that rely on bounding boxes or category annotations. This paper introduces a novel paradigm for Multiple Object Tracking called Type-to-Track, which allows users to track objects in videos by typing natural language descriptions. We present a new dataset for that Grounded Multiple Object Tracking task, called GroOT, containing videos with various types of objects and their corresponding textual captions of 256K words describing their appearance and action in detail. To cover a diverse range of scenes, GroOT was created using official videos and bounding box annotations from the MOT17, TAO, and MOT20 datasets.

3 papers · 0 benchmarks · Texts, Tracking, Videos
Page 286 of 1000