Papers With Code 2

A community resource for machine learning research: papers, code, benchmarks, and state-of-the-art results.

Data sourced from the PWC Archive (CC-BY-SA 4.0). Built by the community, for the community.

Datasets

19,997 machine learning datasets

Filter by Modality

  • Images: 3,275
  • Texts: 3,148
  • Videos: 1,019
  • Audio: 486
  • Medical: 395
  • 3D: 383
  • Time series: 298
  • Graphs: 285
  • Tabular: 271
  • Speech: 199
  • RGB-D: 192
  • Environment: 148
  • Point cloud: 135
  • Biomedical: 123
  • LiDAR: 95
  • RGB Video: 87
  • Tracking: 78
  • Biology: 71
  • Actions: 68
  • 3d meshes: 65
  • Tables: 52
  • Music: 48
  • EEG: 45
  • Hyperspectral images: 45
  • Stereo: 44
  • MRI: 39
  • Physics: 32
  • Interactive: 29
  • Dialog: 25
  • Midi: 22
  • 6D: 17
  • Replay data: 11
  • Financial: 10
  • Ranking: 10
  • Cad: 9
  • fMRI: 7
  • Parallel: 6
  • Lyrics: 2
  • PSG: 2

ShapeStacks

A simulation-based dataset featuring 20,000 stack configurations composed of a variety of elementary geometric primitives richly annotated regarding semantics and structural stability.

22 papers | 3 benchmarks

WIDER (Web Image Dataset for Event Recognition)

WIDER is a dataset for complex event recognition from static images. As of v0.1, it contains 61 event categories and around 50574 images annotated with event class labels.

22 papers | 1 benchmark | Images

Snopes

A fact-checking (FC) dataset containing pairs of a multimodal tweet and an FC article from snopes.com.

22 papers | 1 benchmark | Texts

MINOS

MINOS is a simulator designed to support the development of multisensory models for goal-directed navigation in complex indoor environments. MINOS leverages large datasets of complex 3D environments and supports flexible configuration of multimodal sensor suites.

22 papers | 0 benchmarks | 3D, Environment

ReDWeb (Relative Depth from Web)

The ReDWeb dataset consists of 3600 RGB-RD image pairs collected from the Web. This dataset covers a wide range of scenes and features various non-rigid objects.

22 papers | 0 benchmarks | RGB-D

ManySStuBs4J

The ManySStuBs4J corpus is a collection of simple fixes to Java bugs, designed for evaluating program repair techniques. We collect all bug-fixing changes using the SZZ heuristic, and then filter these to obtain a data set of small bug-fix changes. These are single-statement fixes, classified where possible into one of 16 syntactic templates which we call SStuBs. The dataset contains simple statement bugs mined from open-source Java projects hosted on GitHub. There are two variants of the dataset: one mined from the 100 Java Maven Projects and one mined from the top 1000 Java Projects.

22 papers | 0 benchmarks

CxC (Crisscrossed Captions)

Crisscrossed Captions (CxC) contains 247,315 human-labeled annotations including positive and negative associations between image pairs, caption pairs and image-caption pairs.

22 papers | 1 benchmark | Images, Texts

PhotoShape

The PhotoShape dataset consists of photorealistic, relightable 3D shapes produced by the method proposed in the work of Park et al. (2021).

22 papers | 2 benchmarks | Images

seeds

The examined group comprised kernels belonging to three different varieties of wheat: Kama, Rosa and Canadian, 70 elements each, randomly selected for the experiment. High-quality visualization of the internal kernel structure was obtained using a soft X-ray technique. It is non-destructive and considerably cheaper than other more sophisticated imaging techniques like scanning microscopy or laser technology. The images were recorded on 13x18 cm X-ray KODAK plates. Studies were conducted using combine-harvested wheat grain originating from experimental fields, explored at the Institute of Agrophysics of the Polish Academy of Sciences in Lublin.

22 papers | 1 benchmark

XGLUE

XGLUE is an evaluation benchmark composed of 11 tasks that span 19 languages. For each task, the training data is only available in English. This means that to succeed at XGLUE, a model must have a strong zero-shot cross-lingual transfer capability to learn from the English data of a specific task and transfer what it learned to other languages. Compared to its concurrent work XTREME, XGLUE has two characteristics: first, it includes cross-lingual NLU and cross-lingual NLG tasks at the same time; second, besides including 5 existing cross-lingual tasks (i.e. NER, POS, MLQA, PAWS-X and XNLI), XGLUE selects 6 new tasks from Bing scenarios as well, including News Classification (NC), Query-Ad Matching (QADSM), Web Page Ranking (WPR), QA Matching (QAM), Question Generation (QG) and News Title Generation (NTG). Such diversity of languages, tasks and task origins provides a comprehensive benchmark for quantifying the quality of a pre-trained model on cross-lingual natural lan

22 papers | 2 benchmarks | Texts

EasyCom

The Easy Communications (EasyCom) dataset is a world-first dataset designed to help mitigate the cocktail party effect from an augmented-reality (AR)-motivated multi-sensor egocentric world view. The dataset contains AR glasses egocentric multi-channel microphone array audio, wide field-of-view RGB video, speech source pose, headset microphone audio, annotated voice activity, speech transcriptions, head and face bounding boxes and source identification labels. We have created and are releasing this dataset to facilitate research in multi-modal AR solutions to the cocktail party problem.

22 papers | 15 benchmarks | Audio, Dialog, Images, RGB Video, Speech, Time series, Videos

2017 Robotic Instrument Segmentation Challenge

Segmentation of robotic instruments is an important problem for robotic-assisted minimally invasive surgery. It can be used for simple 2D applications such as overlay masking or 2D tracking but also for more complex 3D tasks such as pose estimation. In this challenge we invite applicants to participate in 3 different tasks: binary segmentation, multi-label segmentation and instrument recognition. Binary segmentation involves just separating the image into instruments and background, whereas multi-label segmentation requires the user to also recognize which parts of the instrument body correspond to the different articulated parts of a da Vinci robotic instrument. The final recognition task tests whether the user can recognize which segmentation corresponds to which da Vinci instrument type.

22 papers | 4 benchmarks

Market-1501-C

Market-1501-C is an evaluation set that consists of algorithmically generated corruptions applied to the Market-1501 test-set. These corruptions consist of Noise: Gaussian, shot, impulse, and speckle; Blur: defocus, frosted glass, motion, zoom, and Gaussian; Weather: snow, frost, fog, brightness, spatter, and rain; Digital: contrast, elastic, pixel, JPEG compression, and saturate. Each corruption has five severity levels, resulting in 100 distinct corruptions.

22 papers | 6 benchmarks | Images
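
The count of 100 in the Market-1501-C description follows from 20 corruption types times five severity levels. A minimal sketch of that enumeration is below; the identifier names and the 1 to 5 severity numbering are assumptions made for illustration, not the dataset's actual naming scheme.

```python
# Corruption families and types as listed in the Market-1501-C description.
# The snake_case identifiers are illustrative, not the dataset's official names.
CORRUPTIONS = {
    "noise": ["gaussian", "shot", "impulse", "speckle"],
    "blur": ["defocus", "frosted_glass", "motion", "zoom", "gaussian"],
    "weather": ["snow", "frost", "fog", "brightness", "spatter", "rain"],
    "digital": ["contrast", "elastic", "pixel", "jpeg_compression", "saturate"],
}

# Five severity levels; numbering them 1..5 is an assumption.
SEVERITIES = range(1, 6)


def enumerate_corruptions():
    """Return every (family, corruption, severity) combination."""
    return [
        (family, name, severity)
        for family, names in CORRUPTIONS.items()
        for name in names
        for severity in SEVERITIES
    ]


variants = enumerate_corruptions()
num_types = sum(len(names) for names in CORRUPTIONS.values())
print(num_types, len(variants))  # 20 corruption types, 100 variants
```

Keeping the family in each tuple distinguishes Gaussian noise from Gaussian blur, which share a name in the description.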

Douban Conversation Corpus

We release the Douban Conversation Corpus, comprising a training set, a development set and a test set for retrieval-based chatbots. The statistics of the Douban Conversation Corpus are shown in the following table.

22 papers | 0 benchmarks | Texts

SLUE (Spoken Language Understanding Evaluation)

Spoken Language Understanding Evaluation (SLUE) is a suite of benchmark tasks for spoken language understanding evaluation. It consists of limited-size labeled training sets and corresponding evaluation sets. This resource would allow the research community to track progress, evaluate pre-trained representations for higher-level tasks, and study open questions such as the utility of pipeline versus end-to-end approaches. The first phase of the SLUE benchmark suite consists of named entity recognition (NER), sentiment analysis (SA), and ASR on the corresponding datasets.

22 papers | 10 benchmarks | Speech

TopiOCQA

TopiOCQA (pronounced Tapioca) is an open-domain conversational dataset with topic switches on Wikipedia. TopiOCQA contains 3,920 conversations with information-seeking questions and free-form answers. On average, a conversation in the dataset spans 13 question-answer turns and involves four topics (documents). TopiOCQA poses a challenging test-bed for models, where efficient retrieval is required on multiple turns of the same conversation, in conjunction with constructing valid responses using conversational history.

22 papers | 0 benchmarks | Texts

3D Cars

Car CAD models from "3d object detection and viewpoint estimation with a deformable 3d cuboid model" were used to generate the dataset. For each of the 199 car models, the authors generated $64\times64$ color renderings from 24 rotation angles each offset by 15 degrees, as well as from 4 different camera elevations.

22 papers | 0 benchmarks | Images
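
The viewpoint grid in the 3D Cars description can be worked out with a short calculation. The sketch below assumes the rotation angles start at 0 degrees and that every model is rendered from every angle and elevation; the description states neither the starting angle nor the total image count, so both are assumptions.

```python
# Figures taken from the 3D Cars description.
NUM_MODELS = 199
NUM_ROTATIONS = 24
ROTATION_STEP_DEG = 15
NUM_ELEVATIONS = 4

# Assumption: rotations start at 0 degrees, so 24 steps of 15 degrees
# cover the full circle without repeating the first view.
rotation_angles = [i * ROTATION_STEP_DEG for i in range(NUM_ROTATIONS)]

# Assumption: each model is rendered at every (rotation, elevation) pair.
views_per_model = NUM_ROTATIONS * NUM_ELEVATIONS
total_renderings = NUM_MODELS * views_per_model

print(rotation_angles[-1])   # 345
print(views_per_model)       # 96
print(total_renderings)      # 19104
```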

MIMIC-IV

Retrospectively collected medical data has the opportunity to improve patient care through knowledge discovery and algorithm development. Broad reuse of medical data is desirable for the greatest public good, but data sharing must be done in a manner which protects patient privacy.

22 papers | 0 benchmarks | Medical, Tabular

BurstSR

BurstSR is a dataset consisting of smartphone bursts and high-resolution DSLR ground truth.

22 papers | 12 benchmarks

BACE (β-secretase enzyme)

The BACE dataset focuses on inhibitors of human beta-secretase 1 (BACE-1). It includes both quantitative (IC50 values) and qualitative (binary labels) binding results. The dataset comprises small-molecule inhibitors across a wide range of affinities, spanning three orders of magnitude (from nanomolar to micromolar IC50 values). Specifically, it provides 154 BACE inhibitors for affinity prediction, 20 for pose prediction, and 34 for free energy prediction.

22 papers | 1 benchmark