Papers With Code 2

A community resource for machine learning research: papers, code, benchmarks, and state-of-the-art results.

Data sourced from the PWC Archive (CC-BY-SA 4.0). Built by the community, for the community.

Datasets

3,275 machine learning datasets

Filter by Modality

  • Images: 3,275
  • Texts: 3,148
  • Videos: 1,019
  • Audio: 486
  • Medical: 395
  • 3D: 383
  • Time series: 298
  • Graphs: 285
  • Tabular: 271
  • Speech: 199
  • RGB-D: 192
  • Environment: 148
  • Point cloud: 135
  • Biomedical: 123
  • LiDAR: 95
  • RGB Video: 87
  • Tracking: 78
  • Biology: 71
  • Actions: 68
  • 3d meshes: 65
  • Tables: 52
  • Music: 48
  • EEG: 45
  • Hyperspectral images: 45
  • Stereo: 44
  • MRI: 39
  • Physics: 32
  • Interactive: 29
  • Dialog: 25
  • Midi: 22
  • 6D: 17
  • Replay data: 11
  • Financial: 10
  • Ranking: 10
  • Cad: 9
  • fMRI: 7
  • Parallel: 6
  • Lyrics: 2
  • PSG: 2

3,275 dataset results

WikiWeb2M (Wikipedia Webpage 2M)

Wikipedia Webpage 2M (WikiWeb2M) is a multimodal open-source dataset consisting of over 2 million English Wikipedia articles. It was created by rescraping the ∼2M English articles in WIT. Each webpage sample includes the page URL and title; section titles, text, and indices; and images with their captions.

1 paper · 0 benchmarks · Images, Texts

ParsVQA-Caps

Despite recent advances in vision-and-language tasks, most progress is still focused on resource-rich languages such as English. Furthermore, widespread vision-and-language datasets directly adopt images representative of American or European cultures, resulting in bias. Hence, we introduce ParsVQA-Caps, the first benchmark in Persian for Visual Question Answering and Image Captioning tasks. We collect data in two ways for each task: human-based and template-based for VQA, and human-based and web-based for image captioning. The image captioning dataset consists of over 7.5k images and about 9k captions. The VQA dataset consists of almost 11k images and 28.5k question-answer pairs with short and long answers, usable for both classification and generation VQA.

1 paper · 0 benchmarks · Images, Texts

A View From Somewhere (AVFS)

A View From Somewhere (AVFS) is a dataset of 638,180 face similarity judgments over 4,921 faces. Each judgment corresponds to the odd-one-out (i.e., least similar) face in a triplet of faces and is accompanied by both the identifier and demographic attributes of the annotator who made the judgment.

1 paper · 0 benchmarks · Images

COVIDx CXR-3

COVIDx CXR-3 is an open access benchmark dataset that we generated, comprising 30,882 CXR images across 17,026 patient cases. Images may be added over time to improve the dataset.

1 paper · 1 benchmark · Images, Medical

HAC (Hybrid Adverse Conditions)

HAC is a dataset for learning and benchmarking the restoration of arbitrary Hybrid Adverse Conditions. HAC contains 31 scenarios, each an arbitrary combination of five common weather conditions, with a total of 316K adverse-weather/clean pairs.

1 paper · 0 benchmarks · Images

Morphological Classification of Galaxies

This dataset can be used by anyone interested in performing morphological classification of galaxies. It was originally provided by Kaggle user Jay Lin (https://www.kaggle.com/jay1985) 4 years ago. The dataset was used in the conference paper "Morphological Classification of Galaxies Using SpinalNet".

1 paper · 0 benchmarks · Images

Honeycombs in Concrete (Honeycombs in Concrete Instance Segmentation)

The directory HiCIS contains two datasets for instance segmentation of honeycombs in concrete in COCO format. One dataset originates from images scraped from the internet, and the other is provided by Metis Systems AG. The directory HiCC/web contains the dataset using the images from the internet, and HICC/metis contains the dataset using the images provided by Metis Systems AG as part of the research project Smart Design and Construction (SDaC).

1 paper · 0 benchmarks · Images

Slovo: Russian Sign Language Dataset

We introduce Slovo, a large-scale video dataset for the Russian Sign Language task. The Slovo dataset is about 16 GB in size and contains 20,400 RGB videos of 1,000 sign language gestures from 194 signers. Each class has 20 samples. The dataset is divided into a training set and a test set by subject user_id. The training set includes 15,300 videos, and the test set includes 5,100 videos. The total video recording time is ~9.2 hours. About 35% of the videos are recorded in HD format, and 65% are in FullHD resolution. The average gesture length in a video is 50 frames.

1 paper · 1 benchmark · Images, Videos
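
The Slovo entry above mentions a train/test split made by subject user_id. The following is a minimal sketch of what a subject-disjoint split can look like in general. It is not the Slovo authors' procedure: the function name `split_by_subject`, the `(video_id, user_id)` tuple layout, the toy data, and the 25% subject fraction are all assumptions chosen for illustration.

```python
import random
from collections import defaultdict


def split_by_subject(samples, test_fraction=0.25, seed=0):
    """Split (video_id, user_id) pairs so that no user appears in both sets.

    Hypothetical helper: the field layout and the default fraction are
    assumptions, not taken from the Slovo release.
    """
    # Group video ids by the subject who recorded them.
    by_user = defaultdict(list)
    for video_id, user_id in samples:
        by_user[user_id].append(video_id)

    # Shuffle the subjects (not the videos) with a fixed seed.
    users = sorted(by_user)
    random.Random(seed).shuffle(users)

    # Hold out a fraction of the subjects for testing (at least one).
    n_test = max(1, round(len(users) * test_fraction))
    test_users = set(users[:n_test])

    train = [(v, u) for v, u in samples if u not in test_users]
    test = [(v, u) for v, u in samples if u in test_users]
    return train, test


# Toy data: 40 videos recorded by 8 hypothetical subjects, 5 videos each.
samples = [(f"video_{i}", f"user_{i % 8}") for i in range(40)]
train, test = split_by_subject(samples)
print(len(train), len(test))
```

Splitting on the subject rather than on individual videos keeps every signer entirely in one set, so test performance reflects generalization to unseen people rather than to unseen clips of familiar ones.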

DermSynth3D (3DBodyTex.DermSynth3D)

A dataset of 100K synthetic images of skin lesions, ground-truth (GT) segmentations of lesions and healthy skin, GT segmentations of seven body parts (head, torso, hips, legs, feet, arms and hands), and GT binary masks of non-skin regions in the texture maps of 215 scans from the 3DBodyTex.v1 dataset [2], [3] created using the framework described in [1]. The dataset is primarily intended to enable the development of skin lesion analysis methods. Synthetic image creation consisted of two main steps. First, skin lesions from the Fitzpatrick 17k dataset were blended onto skin regions of high-resolution three-dimensional human scans from the 3DBodyTex dataset [2], [3]. Second, two-dimensional renders of the modified scans were generated.

1 paper · 0 benchmarks · 3D, 3d meshes, Images, Medical

Stained mice brain blood vessels. Confocal-LFM

3D confocal stacks with corresponding 2D Light-field microscope images

1 paper · 0 benchmarks · 3D, Biology, Images

PopulationGrowthDataset_Kigali

This dataset contains annual Sentinel-2 MSI composites (wet and dry season) for Kigali for the period 2016-2020. In addition, a metadata file containing population count at the grid level (100 x 100 m) for 2020 and at the census level (administrative units) for 2016 and 2020 is provided. Ancillary data such as the administrative boundaries of Kigali are also available.

1 paper · 0 benchmarks · Images

Iran's Built Heritage Binary Image Classification Dataset

Iran's Built Heritage Binary Image Classification Dataset contains approximately 10,500 CHB images gathered from four different sources:

1 paper · 0 benchmarks · Images

HighwayPavementCrackDetection

The images come from the CCD camera of a highway measurement vehicle. Cracks and sealed cracks have been labeled. The labels differ from traditional block annotations in that they use redundant, dense annotation boxes. Some of the data is manually annotated, while the rest consists of model-generated annotations that have undergone careful manual inspection.

1 paper · 0 benchmarks · Images

PTCGA200 (Patch TCGA in 200 microns by 512 px)

PTCGA200 is a public pathological H&E image dataset from Patch TCGA in 200 microns by 512 px.

1 paper · 0 benchmarks · Images

ACCT Data Repository (ACCT is a fast and accessible automatic cell counting tool using machine learning for 2D image segmentation)

This dataset is a collection of fluorescent images from mice, assembled to test an automatic cell counting tool that we developed. It shows 62 images viewed from 2 or 3 different fields of view. In brief, the dataset was derived from brain sections of a model for HIV-induced brain injury (HIVgp120tg), which expresses soluble gp120 envelope protein in astrocytes under the control of a modified GFAP promoter. The mice were in a mixed C57BL/6.129/SJL genetic background, and two genotypes of 9-month-old male mice were selected: wild type controls (Resting, n = 3) and transgenic littermates (HIVgp120tg, Activated, n = 3). No randomization was performed. HIVgp120tg mice show, among other hallmarks of human HIV neuropathology, an increase in microglia numbers, which indicates activation of the cells compared to non-transgenic littermate controls.

1 paper · 0 benchmarks · Biology, Biomedical, Images, Medical

ISOD (Indoor Small Object Dataset)

ISOD contains 2,000 manually labelled RGB-D images from 20 diverse sites, each featuring over 30 types of small objects randomly placed amidst the items already present in the scenes. These objects, typically ≤3cm in height, include LEGO blocks, rags, slippers, gloves, shoes, cables, crayons, chalk, glasses, smartphones (and their cases), fake banana peels, fake pet waste, and piles of toilet paper, among others. These items were chosen because they either threaten the safe operation of indoor mobile robots or create messes if run over.

1 paper · 0 benchmarks · Images, Time series

Belfort (The Belfort dataset: Handwritten Text Recognition from Crowdsourced Annotations)

This dataset includes minutes of the Belfort municipal council drawn up between 1790 and 1946. Documents include deliberations, lists of councillors, convocations, and agendas. It includes 24,105 text-line images that were automatically detected from pages. Up to 4 transcriptions are available for each line image: two from humans and two from automatic models.

1 paper · 4 benchmarks · Images, Texts

GenPlot (GenPlot: 500k pre-generated plots)

This dataset contains the pre-generated dataset referenced in the GenPlot Paper.

1 paper · 0 benchmarks · Images, Texts

MI-Motion (Multi-Person Interaction Motion)

The Multi-Person Interaction Motion (MI-Motion) dataset includes skeleton sequences of multiple individuals, collected by motion capture systems and then refined and synthesized using a game engine. The dataset contains 167k frames of interacting people's skeleton poses and is categorized into 5 different activity scenes.

1 paper · 0 benchmarks · 3D, Images

DeepGraviLens

DeepGraviLens is a dataset of simulated gravitational lenses consisting of images associated with brightness variation time series. In this dataset, both non-transient and transient phenomena (supernova explosions) are simulated.

1 paper · 0 benchmarks · Images, Time series