Papers With Code 2

A community resource for machine learning research: papers, code, benchmarks, and state-of-the-art results.


Data sourced from the PWC Archive (CC-BY-SA 4.0). Built by the community, for the community.

Datasets

19,997 machine learning datasets

Filter by Modality

  • Images (3,275)
  • Texts (3,148)
  • Videos (1,019)
  • Audio (486)
  • Medical (395)
  • 3D (383)
  • Time series (298)
  • Graphs (285)
  • Tabular (271)
  • Speech (199)
  • RGB-D (192)
  • Environment (148)
  • Point cloud (135)
  • Biomedical (123)
  • LiDAR (95)
  • RGB Video (87)
  • Tracking (78)
  • Biology (71)
  • Actions (68)
  • 3D meshes (65)
  • Tables (52)
  • Music (48)
  • EEG (45)
  • Hyperspectral images (45)
  • Stereo (44)
  • MRI (39)
  • Physics (32)
  • Interactive (29)
  • Dialog (25)
  • MIDI (22)
  • 6D (17)
  • Replay data (11)
  • Financial (10)
  • Ranking (10)
  • CAD (9)
  • fMRI (7)
  • Parallel (6)
  • Lyrics (2)
  • PSG (2)

19,997 dataset results

MMC4 (Multimodal C4)

Multimodal C4 (MMC4) is an augmentation of the popular text-only C4 corpus with interleaved images. The corpus contains 103M documents with 585M images interleaved among 43B English tokens.

31 papers · 0 benchmarks · Images, Texts

CAVE

Multispectral imaging using multiplexed illumination.

31 papers · 4 benchmarks

AGIQA-3K

AGIQA-3K is a fine-grained AI-generated image (AGI) subjective quality assessment database. It was created to address the need for quality models that are consistent with human subjective ratings, given the large quality variance among different AGIs. The database covers a variety of popular AGI models, generates AGIs with different prompts and model parameters, and collects subjective scores at both the perceptual-quality and text-to-image-alignment levels.

31 papers · 0 benchmarks

Weather (Max-Planck-Institut Weather Dataset for Long-term Time Series Forecasting)

Weather data is recorded every 10 minutes throughout the whole of 2020 and contains 21 meteorological indicators, such as air temperature and humidity. The dataset, in CSV format, can be downloaded at https://drive.google.com/file/d/1Tc7GeVN7DLEl-RAs-JVwG9yFMf--S8dy/view?usp=share_link.

31 papers · 8 benchmarks
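A dataset like this is typically loaded with pandas and indexed by its timestamp column. A minimal sketch, where the column names are illustrative stand-ins rather than the file's real header:

```python
# Hedged sketch: loading a Weather-style 10-minute CSV with pandas.
# The column names below are hypothetical; the real header may differ.
import io

import pandas as pd

csv_text = """date,T (degC),rh (%)
2020-01-01 00:00:00,-1.2,85.0
2020-01-01 00:10:00,-1.3,86.0
2020-01-01 00:20:00,-1.1,84.5
"""

df = pd.read_csv(io.StringIO(csv_text), parse_dates=["date"])
df = df.set_index("date")  # one row per 10-minute sampling step
print(df.shape)  # (3, 2)
```

With the timestamp as the index, standard pandas resampling and windowing can then be used to build the input/forecast horizons for long-term time-series forecasting.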

Tox21

The Tox21 dataset comprises 12,060 training samples and 647 test samples that represent chemical compounds. There are 801 "dense features" that represent chemical descriptors, such as molecular weight, solubility, or surface area, and 272,776 "sparse features" that represent chemical substructures (ECFP10, DFS6, DFS8; stored in Matrix Market format). Machine learning methods can use either the sparse or the dense data, or combine them. For each sample there are 12 binary labels that represent the outcome (active/inactive) of 12 different toxicological experiments. Note that the label matrix contains many missing values (NAs). The original data source and Tox21 challenge site is https://tripod.nih.gov/tox21/challenge/.

30 papers · 5 benchmarks
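Matrix Market files can be read with `scipy.io.mmread`, and the NA-laden label matrix handled by masking per assay. A minimal sketch with toy stand-in data (the matrix contents and shapes are illustrative, not the real Tox21 files):

```python
# Hedged sketch of the Tox21 layout: sparse substructure features in
# Matrix Market format plus a per-assay label matrix with missing values.
# The data below is a toy stand-in, not the real challenge files.
import io

import numpy as np
from scipy.io import mmread

mtx_text = """%%MatrixMarket matrix coordinate real general
3 4 5
1 1 1.0
1 3 2.0
2 2 1.0
3 1 3.0
3 4 1.0
"""

X_sparse = mmread(io.StringIO(mtx_text)).tocsr()  # samples x sparse features
print(X_sparse.shape)  # (3, 4)

# Labels: one column per toxicological assay, NaN where the outcome is missing.
y = np.array([[1.0, np.nan],
              [0.0, 1.0],
              [np.nan, 0.0]])
mask = ~np.isnan(y)  # train/evaluate each assay only on observed labels
print(int(mask.sum()))  # 4
```

Keeping the features sparse matters here: with 272,776 substructure columns, a dense representation would be far too large, while CSR storage keeps only the nonzero entries.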

N-UCLA (Northwestern-UCLA Multiview Action 3D Dataset)

The Multiview 3D event dataset was captured at UCLA by the dataset's creators (with Xiaohan Nie). It contains RGB, depth, and human skeleton data captured simultaneously by three Kinect cameras. The dataset includes 10 action categories: pick up with one hand, pick up with two hands, drop trash, walk around, sit down, stand up, donning, doffing, throw, and carry. Each action is performed by 10 actors and captured from a variety of viewpoints. The dataset is distributed in 16 parts; a version containing only the RGB videos is also available.

30 papers · 20 benchmarks

FakeNewsNet

FakeNewsNet is collected from two fact-checking websites, GossipCop and PolitiFact. It contains news content with labels annotated by professional journalists and experts, along with social context information.

30 papers · 0 benchmarks · Texts

QuaRTz (QuaRTz Dataset)

QuaRTz is a crowdsourced dataset of 3,864 multiple-choice questions about open-domain qualitative relationships. Each question is paired with one of 405 different background sentences (sometimes short paragraphs).

30 papers · 0 benchmarks · Texts

DeepFashion2

DeepFashion2 is a versatile benchmark of four tasks including clothes detection, pose estimation, segmentation, and retrieval. It has 801K clothing items, where each item has rich annotations such as style, scale, viewpoint, occlusion, bounding box, dense landmarks, and masks. There are also 873K Commercial-Consumer clothes pairs.

30 papers · 0 benchmarks · Images

General-100

The General-100 dataset is a dataset for image super-resolution. It contains 100 BMP-format images with no compression. The image sizes range from 710 x 704 (large) to 131 x 112 (small).

30 papers · 0 benchmarks · Images

HELP

The HELP dataset is an automatically created natural language inference (NLI) dataset that embodies the combination of lexical and logical inferences, focusing on monotonicity (i.e., phrase-replacement-based reasoning). HELP (Ver. 1.0) has 36K inference pairs covering upward monotone, downward monotone, non-monotone, conjunction, and disjunction.

30 papers · 0 benchmarks · Texts

HUMBI

HUMBI is a large multiview dataset for human body expressions with natural clothing. Its goal is to facilitate modeling the view-specific appearance and geometry of gaze, face, hand, body, and garment from assorted people. 107 synchronized HD cameras are used to capture 772 distinctive subjects across gender, ethnicity, age, and physical condition.

30 papers · 0 benchmarks

InteriorNet

InteriorNet is an RGB-D dataset for large-scale interior scene understanding and mapping. The dataset contains 20M images created by an automated pipeline.

30 papers · 0 benchmarks · 3D, Images, RGB-D

ShoeV2

ShoeV2 is a dataset of 2,000 photos and 6,648 sketches of shoes. The dataset is designed for fine-grained sketch-based image retrieval.

30 papers · 0 benchmarks · Images

StreetLearn

StreetLearn is an interactive, first-person, partially observed visual environment that uses Google Street View for its photographic content and broad coverage, and provides performance baselines for a challenging goal-driven navigation task.

30 papers · 0 benchmarks

Wiki-40B

Wiki-40B is a multilingual language-model benchmark composed of 40+ languages spanning several scripts and linguistic families, containing around 40 billion characters, and is aimed at accelerating research on multilingual modeling.

30 papers · 5 benchmarks

Holl-E

Holl-E is a dataset containing movie chats wherein each response is explicitly generated by copying and/or modifying sentences from unstructured background knowledge such as plots, comments and reviews about the movie.

30 papers · 0 benchmarks · Texts

ECtHR (European Court of Human Rights Cases)

ECtHR is a dataset comprising European Court of Human Rights cases, including annotations for paragraph-level rationales. It contains 11k ECtHR cases and can be viewed as an enriched version of the ECtHR dataset of Chalkidis et al. (2019), which did not provide ground truth for alleged article violations (articles discussed) or rationales. It is released with silver rationales obtained from references in court decisions and gold rationales provided by ECHR-experienced lawyers.

30 papers · 0 benchmarks · Texts

AGQA (Action Genome Question Answering)

Action Genome Question Answering (AGQA) is a benchmark for compositional spatio-temporal reasoning. AGQA contains 192M unbalanced question-answer pairs for 9.6K videos. It also contains a balanced subset of 3.9M question-answer pairs, 3 orders of magnitude larger than existing benchmarks, that minimizes bias by balancing the answer distributions and the types of question structures.

30 papers · 0 benchmarks · Texts, Videos

VGG-SS (VGG-Sound Source)

VGG-SS (VGG-Sound Source) is a benchmark for evaluating sound source localisation in videos. The dataset consists of a new set of annotations for the recently introduced VGG-Sound dataset, where the sound sources visible in each video clip are explicitly marked with bounding-box annotations. This dataset is 20 times larger than analogous existing ones, contains 5K videos spanning over 200 categories, and, unlike Flickr SoundNet, is video-based.

30 papers · 0 benchmarks · Videos
Page 80 of 1000