Papers With Code 2

A community resource for machine learning research: papers, code, benchmarks, and state-of-the-art results.


Data sourced from the PWC Archive (CC-BY-SA 4.0). Built by the community, for the community.

Datasets

19,997 machine learning datasets

Filter by Modality

  • Images: 3,275
  • Texts: 3,148
  • Videos: 1,019
  • Audio: 486
  • Medical: 395
  • 3D: 383
  • Time series: 298
  • Graphs: 285
  • Tabular: 271
  • Speech: 199
  • RGB-D: 192
  • Environment: 148
  • Point cloud: 135
  • Biomedical: 123
  • LiDAR: 95
  • RGB Video: 87
  • Tracking: 78
  • Biology: 71
  • Actions: 68
  • 3d meshes: 65
  • Tables: 52
  • Music: 48
  • EEG: 45
  • Hyperspectral images: 45
  • Stereo: 44
  • MRI: 39
  • Physics: 32
  • Interactive: 29
  • Dialog: 25
  • Midi: 22
  • 6D: 17
  • Replay data: 11
  • Financial: 10
  • Ranking: 10
  • Cad: 9
  • fMRI: 7
  • Parallel: 6
  • Lyrics: 2
  • PSG: 2


CochlScene

CochlScene is a dataset for acoustic scene classification. The dataset consists of 76k samples collected from 831 participants in 13 acoustic scenes.

7 papers · 1 benchmark · Audio

Demetr

Demetr is a diagnostic dataset with 31K English examples (translated from 10 source languages) for evaluating the sensitivity of MT evaluation metrics to 35 different linguistic perturbations spanning semantic, syntactic, and morphological error categories.

7 papers · 0 benchmarks · Texts

KITTI360-EX

KITTI360-EX is a dataset for outer- and inner-FoV expansion. It contains 76k pinhole images as well as 76k spherical images and is used for beyond-FoV estimation.

7 papers · 1 benchmark · Images

HyperRED (Hyper-Relational Extraction Dataset)

HyperRED is a dataset for the new task of hyper-relational extraction, which extracts relation triplets together with qualifier information such as time, quantity or location. For example, the relation triplet (Leonard Parker, Educated At, Harvard University) can be factually enriched by including the qualifier (End Time, 1967). HyperRED contains 44k sentences with 62 relation types and 44 qualifier types.

7 papers · 1 benchmark · Texts
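
The hyper-relational fact described in the HyperRED entry above can be sketched as a small data structure. This is only an illustrative Python sketch, not HyperRED's actual file format: the class and field names (`Qualifier`, `HyperRelationalFact`, `head`, `relation`, `tail`, `qualifiers`) are assumptions chosen for readability, and only the example values come from the description.

```python
from dataclasses import dataclass, field
from typing import List, Tuple


@dataclass(frozen=True)
class Qualifier:
    """A (label, value) pair that enriches a relation triplet (hypothetical name)."""
    label: str
    value: str


@dataclass
class HyperRelationalFact:
    """A relation triplet plus zero or more qualifiers (hypothetical layout)."""
    head: str
    relation: str
    tail: str
    qualifiers: List[Qualifier] = field(default_factory=list)

    def triplet(self) -> Tuple[str, str, str]:
        # The plain (head, relation, tail) triplet, ignoring qualifiers.
        return (self.head, self.relation, self.tail)


# The example from the dataset description: a triplet with one qualifier.
fact = HyperRelationalFact(
    head="Leonard Parker",
    relation="Educated At",
    tail="Harvard University",
    qualifiers=[Qualifier(label="End Time", value="1967")],
)
```

Keeping qualifiers in a list attached to the triplet is one possible design; it lets a fact carry several qualifiers (for example time and location) without changing the triplet itself.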

MSU BASED (MSU BASED Video Deblurring Dataset and Benchmark)

A qualitative dataset of real blurred videos, created using a beam-splitter setup in a lab environment.

7 papers · 24 benchmarks · Videos

CEFR-SP

CEFR-SP contains 17k English sentences annotated with the levels based on the Common European Framework of Reference for Languages assigned by English-education professionals.

7 papers · 0 benchmarks · Texts

DPM (Don’t Patronize Me!)

Don’t Patronize Me! (DPM) is an annotated dataset with Patronizing and Condescending Language towards vulnerable communities.

7 papers · 7 benchmarks

FPv1

FPv1 (formerly FAUST-partial) is a 3D registration benchmark dataset created to address the lack of data variability in existing 3D registration benchmarks such as 3DMatch, ETH, and KITTI.

7 papers · 6 benchmarks · 3D, Point cloud

LEVEN (Legal Event Detection Dataset)

LEVEN is the largest Legal Event Detection dataset as well as the largest Chinese Event Detection dataset.

7 papers · 0 benchmarks · Texts

MMBody

The MMBody dataset provides human body data with motion capture, GT mesh, Kinect RGBD, and millimeter wave sensor data. See homepage for more details.

7 papers · 0 benchmarks · 3D, 3d meshes, Images, Point cloud, RGB-D

HBW (Human Bodies in the Wild)

Human Bodies in the Wild (HBW) is a validation and test set for body shape estimation. It consists of images taken in the wild and ground truth 3D body scans in SMPL-X topology. To create HBW, we collect body scans of 35 participants and register the SMPL-X model to the scans. Further, each participant is photographed in various outfits and poses in front of a white background and uploads full-body photos of themselves taken in the wild. The validation and test set images are released. The ground truth shape is released only for the validation set.

7 papers · 0 benchmarks · 3D, Images, Texts

JGLUE

JGLUE, Japanese General Language Understanding Evaluation, is built to measure the general NLU ability in Japanese.

7 papers · 0 benchmarks

USPTO-190

A chemical synthesis route dataset constructed from the USPTO reaction dataset (1976 to Sep 2016) and a list of commercially available building blocks from eMolecules (~23.1M molecules). After processing, the dataset has 299,202 training routes, 65,274 validation routes, 190 test routes, and the corresponding target molecules.

7 papers · 2 benchmarks

GeoDE

GeoDE is a geographically diverse dataset with 61,940 images from 40 classes and 6 world regions, and no personally identifiable information, collected through crowd-sourcing.

7 papers · 0 benchmarks · Images

Aachen Day-Night v1.1 Benchmark

The Aachen Day-Night v1.1 dataset is an extended version of the original Aachen Day-Night dataset. Besides the original query images, the Aachen Day-Night v1.1 dataset contains an additional 93 nighttime queries. In addition, it uses a larger 3D model containing additional images. These additional images were extracted from video sequences captured with different cameras. Please refer to "Reference Pose Generation for Long-term Visual Localization via Learned Features and View Synthesis" for more information.

7 papers · 3 benchmarks · Images

OMMO

OMMO is a new benchmark for several outdoor NeRF-based tasks, such as novel view synthesis, surface reconstruction, and multi-modal NeRF. It contains complex objects and scenes with calibrated images, point clouds and prompt annotations.

7 papers · 0 benchmarks · Images, Point cloud

BANDON

BANDON is a dataset for building change detection with off-nadir aerial images, composed of off-nadir image pairs of urban and rural areas. Overall, the BANDON dataset contains 2283 pairs of images, 2283 change labels, 1891 BT-flows labels, 1891 pairs of segmentation labels, and 1891 pairs of ST-offsets labels (test sets do not provide auxiliary annotations).

7 papers · 0 benchmarks · Images

GHOSTS

GHOSTS is the first natural-language dataset made and curated by working researchers in mathematics that (1) aims to cover graduate-level mathematics and (2) provides a holistic overview of the mathematical capabilities of language models. It is a collection of multiple datasets of prompts, totalling 728 prompts, for which ChatGPT’s output was manually rated by experts.

7 papers · 0 benchmarks · Texts

TACRED-Revisited

The TACRED-Revisited dataset improves the crowd-sourced TACRED dataset for relation extraction by relabeling the dev and test sets using expert linguistic annotators. Relabeling focuses on the 5K most challenging instances in dev and test; in total, 51.2% of these are corrected. Published at ACL 2020.

7 papers · 1 benchmark · Texts

Weather2K

A multivariate spatio-temporal benchmark dataset for meteorological forecasting based on real-time observation data from ground weather stations.

7 papers · 0 benchmarks · Environment, Time series