Papers With Code 2

A community resource for machine learning research: papers, code, benchmarks, and state-of-the-art results.

Data sourced from the PWC Archive (CC-BY-SA 4.0). Built by the community, for the community.

Datasets

19,997 machine learning datasets

Filter by Modality

  • Images: 3,275
  • Texts: 3,148
  • Videos: 1,019
  • Audio: 486
  • Medical: 395
  • 3D: 383
  • Time series: 298
  • Graphs: 285
  • Tabular: 271
  • Speech: 199
  • RGB-D: 192
  • Environment: 148
  • Point cloud: 135
  • Biomedical: 123
  • LiDAR: 95
  • RGB Video: 87
  • Tracking: 78
  • Biology: 71
  • Actions: 68
  • 3d meshes: 65
  • Tables: 52
  • Music: 48
  • EEG: 45
  • Hyperspectral images: 45
  • Stereo: 44
  • MRI: 39
  • Physics: 32
  • Interactive: 29
  • Dialog: 25
  • Midi: 22
  • 6D: 17
  • Replay data: 11
  • Financial: 10
  • Ranking: 10
  • Cad: 9
  • fMRI: 7
  • Parallel: 6
  • Lyrics: 2
  • PSG: 2

19,997 dataset results

AIM-500 (Automatic Image Matting-500)

AIM-500 is the first natural image matting test set. It contains 500 high-resolution real-world natural images spanning three types (salient opaque foregrounds, salient transparent/meticulous foregrounds, and non-salient foregrounds) and multiple categories. The number of images in each category is shown in the following table.

23 papers | 5 benchmarks

EMOPIA (A Multi-Modal Pop Piano Dataset For Emotion Recognition and Emotion-based Music Generation)

The EMOPIA (pronounced ‘yee-mò-pi-uh’) dataset is a shared multi-modal (audio and MIDI) database focusing on perceived emotion in pop piano music, intended to facilitate research on various tasks related to music emotion. The dataset contains 1,087 music clips from 387 songs, with clip-level emotion labels annotated by four dedicated annotators.

23 papers | 0 benchmarks | Audio, Midi

VisEvent

VisEvent (Visible-Event benchmark) is a dataset constructed for the evaluation of tracking by combining visible and event cameras. VisEvent is featured in:

23 papers | 1 benchmark

mMARCO

mMARCO is a multilingual version of the MS MARCO passage ranking dataset. It comprises 8 languages and was created using machine translation.

23 papers | 0 benchmarks | Texts

HPS (Human POSEitioning System Dataset)

The HPS Dataset is a collection of 3D humans interacting with large 3D scenes (300-1000 $m^2$, up to 2500 $m^2$). The dataset contains images captured from a head-mounted camera coupled with the reference 3D pose and location of the person in a pre-scanned 3D scene. Seven people are captured in 8 large scenes performing activities such as exercising, reading, eating, lecturing, using a computer, making coffee, and dancing. The dataset provides more than 300K synchronized RGB images coupled with the reference 3D pose and location.

23 papers | 0 benchmarks | 3D, Images, Point cloud

SituatedQA

SituatedQA is an open-retrieval QA dataset where systems must produce the correct answer to a question given the temporal or geographical context. Answers to the same question may change depending on the extralinguistic contexts (when and where the question was asked).

23 papers | 0 benchmarks | Texts

OpenImages-v6

OpenImages V6 is a large-scale dataset consisting of 9 million training images, 41,620 validation samples, and 125,456 test samples. It is a partially annotated dataset with 9,600 trainable classes.

23 papers | 6 benchmarks | Images

ATOM3D

ATOM3D is a unified collection of datasets concerning the three-dimensional structure of biomolecules, including proteins, small molecules, and nucleic acids. These datasets are specifically designed to provide a benchmark for machine learning methods which operate on 3D molecular structure, and represent a variety of important structural, functional, and engineering tasks. All datasets are provided in a standardized format along with a Python package containing processing code, utilities, models, and dataloaders for common machine learning frameworks such as PyTorch. ATOM3D is designed to be a living database, where datasets are updated and tasks are added as the field progresses.

23 papers | 0 benchmarks | Biomedical

ELD (Extreme Low-light Denoising dataset)

The Extreme Low-light Denoising (ELD) dataset covers 10 indoor scenes and 4 camera devices from multiple brands (SonyA7S2, NikonD850, CanonEOS70D, CanonEOS700D). It has three ISO levels (800, 1600, 3200) and two low-light factors (100, 200) for noisy images, resulting in 240 (3×2×10×4) raw image pairs in total.

23 papers | 0 benchmarks
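The ELD pair count quoted above (3×2×10×4 = 240) can be checked by enumerating every combination of level, low-light factor, scene, and camera. This is only a sketch of the arithmetic: the variable names and the scene numbering are illustrative assumptions, not part of the dataset's actual file layout.

```python
from itertools import product

# Values taken from the ELD description; the names are illustrative only.
levels = [800, 1600, 3200]
low_light_factors = [100, 200]
scenes = range(1, 11)  # 10 indoor scenes, numbered here purely for illustration
cameras = ["SonyA7S2", "NikonD850", "CanonEOS70D", "CanonEOS700D"]

# One raw image pair per (level, factor, scene, camera) combination.
configurations = list(product(levels, low_light_factors, scenes, cameras))
print(len(configurations))  # 240
```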

HRSOD (High-Resolution Salient Object Detection)

Several datasets exist for saliency detection, but none of them is specifically designed for high-resolution salient object detection. The High-Resolution Salient Object Detection (HRSOD) dataset addresses this, containing 1610 training images and 400 test images. The 2010 images in total are collected from Flickr under Creative Commons licenses. Pixel-level ground truths are manually annotated by 40 subjects. The shortest edge of each image in HRSOD is more than 1200 pixels.

23 papers | 24 benchmarks | Images

HOI4D

HOI4D is a large-scale 4D egocentric dataset with rich annotations, intended to catalyze research on category-level human-object interaction. It consists of 2.4M RGB-D egocentric video frames over 4000 sequences, collected by 4 participants interacting with 800 different object instances from 16 categories across 610 different indoor rooms.

23 papers | 0 benchmarks | Videos

ARCTIC (Articulated Objects in Free-form Hand Interaction)

ARCTIC is a dataset of free-form interactions of hands and articulated objects. ARCTIC has 1.2M images paired with accurate 3D meshes for both hands and for objects that move and deform over time. The dataset also provides hand-object contact information.

23 papers | 0 benchmarks | 3d meshes, Images

PointCloud-C

PointCloud-C is the very first test-suite for point cloud robustness analysis under corruptions.

23 papers | 2 benchmarks | 3D

Weibo

This dataset is from DeepHawkes: Bridging the Gap between Prediction and Understanding of Information Cascades, CIKM 2017. It includes Weibo tweets and their retweets posted in a day.

23 papers | 0 benchmarks | Time series

K-Radar (KAIST-Radar)

KAIST-Radar (K-Radar) is a novel large-scale object detection dataset and benchmark that contains 35K frames of 4D Radar tensor (4DRT) data with power measurements along the Doppler, range, azimuth, and elevation dimensions, together with carefully annotated 3D bounding box labels of objects on the roads. K-Radar includes challenging driving conditions such as adverse weather (fog, rain, and snow) on various road structures (urban roads, suburban roads, alleyways, and highways). In addition to the 4DRT, we provide auxiliary measurements from carefully calibrated high-resolution LiDARs, surround stereo cameras, and RTK-GPS.

23 papers | 0 benchmarks

RiddleSense

Question: I have five fingers but I am not alive. What am I? Answer: a glove.

23 papers | 1 benchmark

ImageNet-W (ImageNet-Watermark)

ImageNet-W(atermark) is a test set to evaluate models’ reliance on the newly found watermark shortcut in ImageNet, which is used to predict the carton class. ImageNet-W is created by overlaying transparent watermarks on the ImageNet validation set. Two metrics are used to evaluate watermark shortcut reliance: (1) IN-W Gap: the top-1 accuracy drop from ImageNet to ImageNet-W, (2) Carton Gap: carton class accuracy increase from ImageNet to ImageNet-W. Combining ImageNet-W with previous out-of-distribution variants of ImageNet (e.g., Stylized ImageNet, ImageNet-R, ImageNet-9) forms a comprehensive suite of multi-shortcut evaluation on ImageNet.

23 papers | 0 benchmarks | Images
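The two ImageNet-W metrics described above are simple differences between accuracies measured on ImageNet and on ImageNet-W. The sketch below assumes the accuracies have already been computed and are given in percent. The function names, the positive-drop sign convention, and the example numbers are assumptions made for illustration, not values or code from the ImageNet-W authors.

```python
def in_w_gap(top1_imagenet: float, top1_imagenet_w: float) -> float:
    """Top-1 accuracy drop from ImageNet to ImageNet-W (positive = accuracy fell)."""
    return top1_imagenet - top1_imagenet_w


def carton_gap(carton_acc_imagenet: float, carton_acc_imagenet_w: float) -> float:
    """Carton class accuracy increase from ImageNet to ImageNet-W."""
    return carton_acc_imagenet_w - carton_acc_imagenet


# Placeholder accuracies in percent, invented for this example.
print(in_w_gap(76.0, 50.5))    # 25.5
print(carton_gap(40.0, 80.0))  # 40.0
```

Under this convention, a model that relies heavily on the watermark shortcut would show both a large IN-W Gap and a large Carton Gap.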

DigestPath

Introduced by Da et al. in DigestPath: a Benchmark Dataset with Challenge Review for the Pathological Detection and Segmentation of Digestive-System

23 papers | 2 benchmarks | Images, Medical

MSRA CN NER (MSRA CN NER Dataset)

Simplified Chinese dataset for NER in The Third International Chinese Language Processing Bakeoff (2006), provided by Microsoft Research Asia (MSRA).

23 papers | 0 benchmarks | Texts

WeatherBench 2

WeatherBench 2 is an update to the global, medium-range (1–14 day) weather forecasting benchmark proposed by Rasp et al. (2020), designed to accelerate progress in data-driven weather modeling. WeatherBench 2 consists of an open-source evaluation framework; publicly available training, ground truth, and baseline data; and a continuously updated website with the latest metrics and state-of-the-art models.

23 papers | 0 benchmarks | Time series