Papers With Code 2

A community resource for machine learning research: papers, code, benchmarks, and state-of-the-art results.


Data sourced from the PWC Archive (CC-BY-SA 4.0). Built by the community, for the community.

Datasets

19,997 machine learning datasets

Filter by Modality

  • Images (3,275)
  • Texts (3,148)
  • Videos (1,019)
  • Audio (486)
  • Medical (395)
  • 3D (383)
  • Time series (298)
  • Graphs (285)
  • Tabular (271)
  • Speech (199)
  • RGB-D (192)
  • Environment (148)
  • Point cloud (135)
  • Biomedical (123)
  • LiDAR (95)
  • RGB Video (87)
  • Tracking (78)
  • Biology (71)
  • Actions (68)
  • 3D meshes (65)
  • Tables (52)
  • Music (48)
  • EEG (45)
  • Hyperspectral images (45)
  • Stereo (44)
  • MRI (39)
  • Physics (32)
  • Interactive (29)
  • Dialog (25)
  • MIDI (22)
  • 6D (17)
  • Replay data (11)
  • Financial (10)
  • Ranking (10)
  • CAD (9)
  • fMRI (7)
  • Parallel (6)
  • Lyrics (2)
  • PSG (2)

19,997 dataset results

Shellcode_IA32

Shellcode_IA32 is a dataset containing 20 years of shellcodes from a variety of sources; it is the largest collection of shellcodes in assembly available to date.

6 papers · 2 benchmarks

SimJEB (Simulated Jet Engine Bracket)

Simulated Jet Engine Bracket Dataset (SimJEB) is a public collection of crowdsourced mechanical brackets and high-fidelity structural simulations designed specifically for surrogate modeling. SimJEB models are more complex, diverse, and realistic than the synthetically generated datasets commonly used in parametric surrogate model evaluation. In contrast to existing engineering shape collections, SimJEB's models are all designed for the same engineering function and thus have consistent structural loads and support conditions. The models in SimJEB were collected from the original submissions to the GrabCAD Jet Engine Bracket Challenge: an open engineering design competition with over 700 hand-designed CAD entries from 320 designers representing 56 countries. Each model has been cleaned, categorized, meshed, and simulated with finite element analysis according to the original competition specifications. The result is a collection of diverse, high-quality, and application-focused designs.

6 papers · 0 benchmarks

VideoLT

VideoLT is a large-scale long-tailed video recognition dataset that contains 256,218 untrimmed videos, annotated into 1,004 classes with a long-tailed distribution.

6 papers · 0 benchmarks · Videos

XL-BEL

XL-BEL is a benchmark for cross-lingual biomedical entity linking. The benchmark spans 10 typologically diverse languages.

6 papers · 0 benchmarks · Texts

OTTers

OTTers is a dataset of human one-turn topic transitions. In this task, models must connect two topics in a cooperative and coherent manner by generating a "bridging" utterance connecting the new topic to the topic of the previous conversation turn.

6 papers · 0 benchmarks · Texts

ConvoSumm

ConvoSumm is a suite of four datasets to evaluate a model’s performance on a broad spectrum of conversation data.

6 papers · 0 benchmarks · Texts

BAAI-VANJEE

BAAI-VANJEE is a dataset for benchmarking and training various computer vision tasks such as 2D/3D object detection and multi-sensor fusion. The BAAI-VANJEE roadside dataset consists of LiDAR data and RGB images collected by a VANJEE smart base station placed on the roadside about 4.5 m above the ground. The dataset contains 2500 frames of LiDAR data and 5000 frames of RGB images, of which 20% were collected at the same time. It also contains 12 classes of objects, 74K 3D object annotations, and 105K 2D object annotations.

6 papers · 0 benchmarks · Images, LiDAR

Swords (Stanford Word Substitution benchmark)

Swords (Stanford Word Substitution) is a benchmark for lexical substitution, the task of finding appropriate substitutes for a target word in a context. Swords is composed of context, target word, and substitute triples (c, w, w'), each of which has a score that indicates the appropriateness of the substitute.

6 papers · 0 benchmarks · Texts
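The (c, w, w') triples with appropriateness scores described above can be sketched as a small record type. This is an illustrative sketch with made-up example data, not the actual Swords file format or field names:

```python
# Hypothetical sketch of a Swords-style scored substitution triple.
# The field names and example sentence are illustrative assumptions,
# not taken from the actual benchmark release.
from dataclasses import dataclass


@dataclass
class ScoredSubstitution:
    context: str      # c: the sentence containing the target word
    target: str       # w: the word to be replaced
    substitute: str   # w': a candidate replacement word
    score: float      # appropriateness of w' for w in context c


# A single made-up example triple.
example = ScoredSubstitution(
    context="The movie was surprisingly good.",
    target="good",
    substitute="enjoyable",
    score=0.9,
)
```

A real benchmark entry would pair one (context, target) with many scored substitutes, so a full loader would group such records by context and target.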

BANKING77-OOS

A single-domain (banking) dataset that includes both general out-of-scope (OOD-OOS) queries and in-domain but out-of-scope (ID-OOS) queries, where ID-OOS queries are semantically similar to in-scope intents. BANKING77 originally includes 77 intents; BANKING77-OOS keeps 50 of them in scope, and the ID-OOS queries are built from the 27 held-out in-scope intents.

6 papers · 1 benchmark
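The ID-OOS construction described above (partition the 77 BANKING77 intents into 50 in-scope and 27 held-out intents, then treat queries from held-out intents as out-of-scope) can be sketched as follows. The function names and the random partition are illustrative assumptions; the published dataset uses a fixed, curated split:

```python
# Illustrative sketch of a BANKING77-OOS-style intent partition.
# make_id_oos_split and label_query are hypothetical helpers, not the
# dataset's actual tooling; the real held-out set is fixed, not random.
import random


def make_id_oos_split(intents, n_in_scope=50, seed=0):
    """Partition intent labels into in-scope and held-out sets."""
    rng = random.Random(seed)
    intents = sorted(intents)      # deterministic base order
    rng.shuffle(intents)
    in_scope = set(intents[:n_in_scope])
    held_out = set(intents[n_in_scope:])  # source of ID-OOS queries
    return in_scope, held_out


def label_query(query_intent, in_scope):
    """A query whose intent was held out is in-domain but out-of-scope."""
    return "in-scope" if query_intent in in_scope else "ID-OOS"
```

The point of the construction is that ID-OOS queries come from the same banking domain as the in-scope intents, which makes them much harder to reject than general OOD-OOS queries.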

PHASE (PHysically-grounded Abstract Social Events)

PHASE is a dataset of physically-grounded abstract social events that resemble a wide range of real-life social interactions by including social concepts such as helping another agent. PHASE consists of 2D animations of pairs of agents moving in a continuous space, generated procedurally using a physics engine and a hierarchical planner. Agents have a limited field of view and can interact with multiple objects in an environment that has multiple landmarks and obstacles. Using PHASE, we design a social recognition task and a social prediction task. PHASE is validated with human experiments demonstrating that humans perceive rich interactions in the social events, and that the simulated agents behave similarly to humans.

6 papers · 0 benchmarks

WNUT 2020 (WNUT-2020 Task 1 Overview: Extracting Entities and Relations from Wet Lab Protocols)

The training and development data for this task were taken from previous work on the wet lab corpus (Kulkarni et al., 2018), which consists of 623 protocols. We excluded the eight duplicate protocols from this dataset and then re-annotated the 615 unique protocols in BRAT (Stenetorp et al., 2012).

6 papers · 6 benchmarks

JerichoWorld

JerichoWorld is a dataset that enables the creation of learning agents that can build knowledge graph-based world models of interactive narratives. Interactive narratives -- or text-adventure games -- are partially observable environments structured as long puzzles or quests in which an agent perceives and interacts with the world purely through textual natural language. Each individual game typically contains hundreds of locations, characters, and objects -- each with their own unique descriptions -- providing an opportunity to study the problem of giving language-based agents the structured memory necessary to operate in such worlds.

6 papers · 5 benchmarks · Texts

MuSeRC (Russian Multi-Sentence Reading Comprehension)

We present a reading comprehension challenge in which questions can only be answered by taking into account information from multiple sentences. The dataset is the first to study multi-sentence inference at scale, with an open-ended set of question types that requires reasoning skills.

6 papers · 2 benchmarks · Texts

RCB (Russian Commitment Bank)

The Russian Commitment Bank is a corpus of naturally occurring discourses whose final sentence contains a clause-embedding predicate under an entailment cancelling operator (question, modal, negation, antecedent of conditional).

6 papers · 2 benchmarks · Texts

Red MiniImageNet 20% label noise

Part of the Controlled Noisy Web Labels Dataset.

6 papers · 3 benchmarks

Red MiniImageNet 40% label noise

Part of the Controlled Noisy Web Labels Dataset.

6 papers · 3 benchmarks

Red MiniImageNet 80% label noise

Part of the Controlled Noisy Web Labels Dataset.

6 papers · 3 benchmarks

OpenEA Benchmark

Version 1.0 of the OpenEA benchmark datasets. Please use the updated 2.0 version, which has since been released.

6 papers · 0 benchmarks · Graphs

ZS-F-VQA

The ZS-F-VQA dataset is a new split of the F-VQA dataset for the zero-shot setting. First, we take the original train/test splits of the F-VQA dataset and combine them, filtering for triples whose answers appear in the top 500 by occurrence frequency. Next, we randomly divide this set of answers into a new training (a.k.a. seen) split $\mathcal{A}_s$ and testing (a.k.a. unseen) split $\mathcal{A}_u$ at a ratio of 1:1. Following the F-VQA standard dataset, the division process is repeated 5 times. Each $(i,q,a)$ triple in the original F-VQA dataset is assigned to the training set if $a \in \mathcal{A}_s$, and to the testing set otherwise. The training and testing sets overlap in $2565$ answer instances in F-VQA, compared to $0$ in ZS-F-VQA.

6 papers · 1 benchmark · Graphs, Images, Texts
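The answer-based split described above (keep triples whose answers are among the most frequent, partition those answers 1:1 into seen/unseen, and route each triple by its answer) can be sketched in a few lines. This is a hedged reconstruction under stated assumptions; the function name and the toy data are illustrative, not the dataset's actual build script:

```python
# Hypothetical sketch of the ZS-F-VQA answer-split procedure.
# zero_shot_split is an illustrative helper, not the official tooling;
# the real split is repeated 5 times with fixed released partitions.
import random
from collections import Counter


def zero_shot_split(triples, top_k=500, seed=0):
    """triples: list of (image, question, answer) tuples.

    Keeps triples whose answer is among the top_k most frequent answers,
    splits those answers 1:1 into seen/unseen sets, and assigns each
    triple to train (answer seen) or test (answer unseen)."""
    freq = Counter(a for _, _, a in triples)
    top_answers = [a for a, _ in freq.most_common(top_k)]
    rng = random.Random(seed)
    rng.shuffle(top_answers)
    half = len(top_answers) // 2
    seen = set(top_answers[:half])      # A_s: training answers
    unseen = set(top_answers[half:])    # A_u: testing answers
    train = [t for t in triples if t[2] in seen]
    test = [t for t in triples if t[2] in unseen]
    return train, test, seen, unseen
```

Because every triple is routed by whether its answer falls in the seen or unseen set, the train and test answer vocabularies are disjoint by construction, which is exactly the zero-answer-overlap property the entry reports.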

UDD

UDD is an underwater open-sea farm object detection dataset. UDD consists of 3 categories (seacucumber, seaurchin, and scallop) with 2,227 images. It's the first dataset collected in a real open-sea farm for underwater robot picking.

6 papers · 0 benchmarks
Page 199 of 1000