Papers With Code 2

A community resource for machine learning research: papers, code, benchmarks, and state-of-the-art results.


Data sourced from the PWC Archive (CC-BY-SA 4.0). Built by the community, for the community.

Datasets

19,997 machine learning datasets

Filter by Modality

  • Images (3,275)
  • Texts (3,148)
  • Videos (1,019)
  • Audio (486)
  • Medical (395)
  • 3D (383)
  • Time series (298)
  • Graphs (285)
  • Tabular (271)
  • Speech (199)
  • RGB-D (192)
  • Environment (148)
  • Point cloud (135)
  • Biomedical (123)
  • LiDAR (95)
  • RGB Video (87)
  • Tracking (78)
  • Biology (71)
  • Actions (68)
  • 3D meshes (65)
  • Tables (52)
  • Music (48)
  • EEG (45)
  • Hyperspectral images (45)
  • Stereo (44)
  • MRI (39)
  • Physics (32)
  • Interactive (29)
  • Dialog (25)
  • MIDI (22)
  • 6D (17)
  • Replay data (11)
  • Financial (10)
  • Ranking (10)
  • CAD (9)
  • fMRI (7)
  • Parallel (6)
  • Lyrics (2)
  • PSG (2)

19,997 dataset results

CurveLanes

CurveLanes is a new benchmark dataset for lane detection, with 150K images of difficult scenarios such as curves and multiple lanes. It was collected in real urban and highway scenes across multiple cities in China. It is the largest lane detection dataset to date and establishes a more challenging benchmark for the community.

19 papers · 10 benchmarks

FoCus (Call for Customized Conversation: Customized Conversation Grounding Persona and Knowledge)

We introduce a new dataset, called FoCus, that supports knowledge-grounded answers reflecting a user's persona. One situation in which people need different kinds of knowledge, depending on their preferences, is when they travel around the world.

19 papers · 0 benchmarks · Texts

Distinctions-646

Distinctions-646 is composed of 646 foreground images with manually annotated alpha mattes.

19 papers · 5 benchmarks

Argoverse-HD

Argoverse-HD is a dataset built for streaming object detection, which encompasses real-time object detection, video object detection, tracking, and short-term forecasting. It contains the video data from Argoverse 1.1 with MS COCO-style bounding box annotations and track IDs. The annotations are backward-compatible with COCO: one can directly evaluate COCO pre-trained models on this dataset to estimate their efficiency or cross-dataset generalization. The dataset contains high-quality, temporally dense annotations for high-resolution videos (1920 × 1200 @ 30 FPS). Overall, there are 70,000 image frames and 1.3 million bounding boxes.

19 papers · 0 benchmarks · Images, Videos
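Since the annotations above are backward-compatible with COCO, each record follows the standard COCO detection schema (`images`, `annotations`, `categories`, with `bbox` as `[x, y, width, height]`), extended with a track identity per box. A minimal sketch of that structure; all concrete values are made up, and the `track_id` field name is an assumption for illustration, not taken from the dataset's documentation:

```python
import json

# Minimal COCO-style detection annotation file, extended with a
# per-box track identity for streaming/tracking evaluation.
annotation_file = {
    "images": [
        {"id": 1, "file_name": "frame_000001.jpg",
         "width": 1920, "height": 1200},
    ],
    "annotations": [
        {"id": 1, "image_id": 1, "category_id": 3,
         "bbox": [710.0, 530.0, 120.0, 80.0],  # [x, y, width, height]
         "area": 9600.0, "iscrowd": 0,
         "track_id": 42},  # hypothetical extension field
    ],
    "categories": [
        {"id": 3, "name": "car"},
    ],
}

# COCO-format files are plain JSON; standard COCO tooling simply
# ignores unknown extra keys such as the track field.
serialized = json.dumps(annotation_file)
```

Because extra keys are ignored by COCO evaluators, the same file can serve both plain detection metrics and track-aware streaming metrics.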

BCI (Breast Cancer Immunohistochemical Image Generation)

The evaluation of human epidermal growth factor receptor 2 (HER2) expression is essential to formulating a precise treatment for breast cancer. HER2 is routinely evaluated with immunohistochemical (IHC) techniques, which are very expensive. We therefore propose a breast cancer immunohistochemical (BCI) benchmark that attempts to synthesize IHC data directly from paired hematoxylin and eosin (HE) stained images. The dataset contains 4,870 registered image pairs covering a variety of HER2 expression levels (0, 1+, 2+, 3+).

19 papers · 6 benchmarks · Biomedical, Images, Medical

ScribbleKITTI

ScribbleKITTI is a scribble-annotated dataset for LiDAR semantic segmentation.

19 papers · 14 benchmarks

STARSS22 (Sony-TAu Realistic Spatial Soundscapes 2022)

The Sony-TAu Realistic Spatial Soundscapes 2022 (STARSS22) dataset consists of recordings of real scenes captured with a high-channel-count spherical microphone array (SMA). The recordings were made by two different teams at two sites: Tampere University in Tampere, Finland, and Sony facilities in Tokyo, Japan. Recordings at both sites share the same capture and annotation process and a similar organization. They are organized in sessions corresponding to distinct rooms, human participants, and sound-making props, with a few exceptions.

19 papers · 5 benchmarks · Audio

VideoAttentionTarget

A dataset with fully annotated attention targets in video for attention target estimation.

19 papers · 3 benchmarks · Videos

QAMPARI

QAMPARI is an open-domain question answering (ODQA) benchmark in which the answer to a question is a list of entities spread across many paragraphs. It was created by (a) generating questions with multiple answers from Wikipedia's knowledge graph and tables, (b) automatically pairing answers with supporting evidence in Wikipedia paragraphs, and (c) manually paraphrasing questions and validating each answer.

19 papers · 0 benchmarks · Texts

Film (60%/20%/20% random splits)

Node classification on Film with 60%/20%/20% random splits for training/validation/test.

19 papers · 1 benchmark · Graphs

Squirrel (60%/20%/20% random splits)

Node classification on Squirrel with 60%/20%/20% random splits for training/validation/test.

19 papers · 1 benchmark · Graphs
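The random 60%/20%/20% protocol used by entries like these is simply a partition of the graph's node indices into disjoint train/validation/test sets. A minimal sketch of how such splits are typically generated; the function name and seed handling are illustrative, not taken from any particular benchmark's code:

```python
import numpy as np

def random_node_splits(num_nodes, train=0.6, val=0.2, seed=0):
    """Shuffle node indices and cut them into train/val/test subsets."""
    rng = np.random.default_rng(seed)
    perm = rng.permutation(num_nodes)       # random order over all nodes
    n_train = int(train * num_nodes)
    n_val = int(val * num_nodes)
    train_idx = perm[:n_train]              # first 60%
    val_idx = perm[n_train:n_train + n_val] # next 20%
    test_idx = perm[n_train + n_val:]       # remaining ~20%
    return train_idx, val_idx, test_idx

train_idx, val_idx, test_idx = random_node_splits(1000)
print(len(train_idx), len(val_idx), len(test_idx))  # 600 200 200
```

Papers usually report results averaged over several such seeds, since accuracy on small graphs can vary noticeably between random splits.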

Deezer-Europe

Node classification on Deezer Europe with 50%/25%/25% random splits for training/validation/test.

19 papers · 1 benchmark · Graphs

Squirrel (48%/32%/20% fixed splits)

Node classification on Squirrel with the fixed 48%/32%/20% splits provided by Geom-GCN.

19 papers · 2 benchmarks · Graphs

LVOS

LVOS is a dataset for long-term video object segmentation (VOS). It consists of 220 videos with a total duration of 421 minutes. The videos in LVOS last 1.59 minutes on average, about 20 times longer than videos in existing VOS datasets. Each video carries various attributes, especially challenges arising in the wild, such as long-term reappearance and cross-temporally similar objects.

19 papers · 0 benchmarks · Videos

DELIVER

DELIVER is an arbitrary-modal segmentation benchmark covering Depth, LiDAR, multiple Views, Events, and RGB. It also includes four severe weather conditions and five sensor-failure cases, to exploit modal complementarity and to study robustness to partial sensor outages. It is designed for the task of arbitrary-modal semantic segmentation.

19 papers · 4 benchmarks · LiDAR

MCubeS (Multimodal Material Segmentation Dataset)

The multimodal material segmentation (MCubeS) dataset contains 500 sets of images from 42 street scenes. Each scene has images in four modalities: RGB, angle of linear polarization (AoLP), degree of linear polarization (DoLP), and near-infrared (NIR). The dataset provides annotated ground-truth labels for both material and semantic segmentation at every pixel. It is divided into a training set with 302 image sets, a validation set with 96, and a test set with 102. Each image is 1224 × 1024 pixels, and each pixel is assigned one of 20 class labels.

19 papers · 2 benchmarks · Hyperspectral images, Images

Hi4D

Hi4D contains 4D textured scans of 20 subject pairs, 100 sequences, and a total of more than 11K frames. It provides rich interaction-centric annotations in 2D and 3D, alongside accurately registered parametric body models.

19 papers · 0 benchmarks · 3D, 3D meshes

SHD (Spiking Heidelberg Digits)

The Spiking Heidelberg Digits (SHD) dataset is an audio-based classification dataset of roughly 10k recordings of the spoken digits zero through nine in English and German. The audio waveforms have been converted into spike trains using an artificial model of the inner ear and parts of the ascending auditory pathway. The SHD dataset has 8,156 training and 2,264 test samples. A full description of the dataset and how it was created can be found in the accompanying paper; please cite that paper if you make use of the dataset.

19 papers · 2 benchmarks

PKLot (A Robust Dataset for Parking Lot Classification)

The PKLot dataset contains 12,417 images of parking lots and 695,899 images of parking spaces segmented from them, all manually checked and labeled. All images were acquired at the parking lots of the Federal University of Parana (UFPR) and the Pontifical Catholic University of Parana (PUCPR), both located in Curitiba, Brazil.

19 papers · 2 benchmarks · Images

VC-Clothes

Person re-identification (ReID) is an active research topic for AI-based video surveillance applications such as searching for a specific person, but the practical issue that the target person may change clothes (the clothes-inconsistency problem) has long been overlooked. This paper systematically studies the problem for the first time. It first overcomes the lack of a suitable dataset by collecting a small yet representative real dataset for testing, while building a large, realistic synthetic dataset for training and deeper study. These new datasets enable a range of experiments on the influence of clothes inconsistency, showing that changing clothes makes ReID a much harder problem: it complicates learning effective representations and challenges the ability of previous ReID models to generalize to persons wearing unseen (new) clothes.

19 papers · 3 benchmarks

Page 107 of 1000