Papers With Code 2

A community resource for machine learning research: papers, code, benchmarks, and state-of-the-art results.


Data sourced from the PWC Archive (CC-BY-SA 4.0). Built by the community, for the community.

Datasets

3,275 machine learning datasets

Filter by Modality

  • Images (3,275)
  • Texts (3,148)
  • Videos (1,019)
  • Audio (486)
  • Medical (395)
  • 3D (383)
  • Time series (298)
  • Graphs (285)
  • Tabular (271)
  • Speech (199)
  • RGB-D (192)
  • Environment (148)
  • Point cloud (135)
  • Biomedical (123)
  • LiDAR (95)
  • RGB Video (87)
  • Tracking (78)
  • Biology (71)
  • Actions (68)
  • 3D meshes (65)
  • Tables (52)
  • Music (48)
  • EEG (45)
  • Hyperspectral images (45)
  • Stereo (44)
  • MRI (39)
  • Physics (32)
  • Interactive (29)
  • Dialog (25)
  • MIDI (22)
  • 6D (17)
  • Replay data (11)
  • Financial (10)
  • Ranking (10)
  • CAD (9)
  • fMRI (7)
  • Parallel (6)
  • Lyrics (2)
  • PSG (2)

3,275 dataset results

MALF (Multi-Attribute Labelled Faces)

The MALF dataset is a large dataset of 5,250 images annotated with multiple facial attributes, constructed specifically for fine-grained evaluation.

15 papers · 0 benchmarks · Images

Washington RGB-D

Washington RGB-D is a widely used testbed in the robotics community, consisting of 41,877 RGB-D images organized into 300 instances divided into 51 classes of common indoor objects (e.g., scissors, cereal boxes, keyboards). Each object instance was positioned on a turntable and captured from three different viewpoints while rotating.

15 papers · 0 benchmarks · Images, RGB-D

MSRA Hand

MSRA Hands is a dataset for hand tracking. The right hands of 6 subjects are captured using Intel's Creative Interactive Gesture Camera. Each subject is asked to make various rapid gestures in a 400-frame video sequence. To account for different hand sizes, a global hand model scale is specified per subject: 1.1, 1.0, 0.9, 0.95, 1.1, and 1.0 for subjects 1–6, respectively. The camera intrinsics are: principal point at the image center (160, 120), focal length 241.42. Each depth image is 320×240; each .bin file stores the depth pixel values in row-scanning order as 320×240 floats, in millimeters. The .bin files are binary and must be opened with the std::ios::binary flag. The joint.txt file stores 400 frames × 21 hand joints per frame; each line holds 3 × 21 = 63 floats, i.e., 21 3D points in (x, y, z) coordinates. The 21 hand joints are: wrist, index_mcp, index_pip, index_dip, index_tip, middle_mcp, middle_pip, middle_dip, middle_tip, ring_mcp, ring_pip, ring_dip, ring_tip, and the corresponding four joints of the little finger.

15 papers · 0 benchmarks · Images

CASIA V2

CASIA V2 is a dataset for forgery classification. It contains 4,975 images: 1,701 authentic and 3,274 forged.

15 papers · 0 benchmarks · Images

ISIC 2017 Task 1

The ISIC 2017 dataset was published by the International Skin Imaging Collaboration (ISIC) as a large-scale dataset of dermoscopy images. The Task 1 challenge dataset for lesion segmentation contains 2,000 images for training with ground truth segmentations (2000 binary mask images).

15 papers · 0 benchmarks · Images, Medical

MSK

The MSK dataset is a dataset for lesion recognition from the Memorial Sloan-Kettering Cancer Center. It is used as part of the ISIC lesion recognition challenges.

15 papers · 0 benchmarks · Images, Medical

4Seasons

4Seasons is a dataset covering seasonal and challenging perceptual conditions for autonomous driving.

15 papers · 0 benchmarks · Images

Fakeddit

Fakeddit is a novel multimodal dataset for fake news detection consisting of over 1 million samples from multiple categories of fake news. After being processed through several stages of review, the samples are labeled according to 2-way, 3-way, and 6-way classification categories through distant supervision.

15 papers · 0 benchmarks · Images, Texts

Paris-Lille-3D

Paris-Lille-3D is a benchmark for point cloud classification. The point cloud has been labeled entirely by hand with 50 different classes. The dataset consists of around 2 km of Mobile Laser System point cloud data acquired in two French cities (Paris and Lille).

15 papers · 1 benchmark · Images

SYSU-30k

SYSU-30k contains 30k categories of persons, about 20 times more than CUHK03 (1.3k categories) and Market1501 (1.5k categories), and 30 times more than ImageNet (1k categories), for a total of 29,606,918 images. SYSU-30k provides not only a large platform for the weakly supervised ReID problem but also a more challenging test set, consistent with realistic settings, for standard evaluation.

15 papers · 2 benchmarks · Images

ACRE (Abstract Causal REasoning)

Abstract Causal REasoning (ACRE) is a dataset for the systematic evaluation of current vision systems in causal induction, i.e., identifying unobservable mechanisms that lead to the observable relations among variables.

15 papers · 0 benchmarks · Images

KolektorSDD2 (Kolektor Surface-Defect Dataset 2)

KolektorSDD2 is a surface-defect detection dataset with over 3000 images containing several types of defects, obtained while addressing a real-world industrial problem.

15 papers · 9 benchmarks · Images

RoadAnomaly21

RoadAnomaly21 is a dataset for anomaly segmentation, the task of identifying the image regions containing objects that have never been seen during training. It consists of an evaluation set of 100 images with pixel-level annotations. Each image contains at least one anomalous object, e.g., animals or unknown vehicles. The anomalies can appear anywhere in the image and differ widely in size, covering from 0.5% to 40% of the image area.

15 papers · 0 benchmarks · Images

MIAP (More Inclusive Annotations for People)

MIAP is a dataset created by obtaining a new set of annotations on a subset of the Open Images dataset: bounding boxes and attributes for all of the people visible in those images. The original Open Images annotations are not exhaustive, providing bounding boxes and attribute labels for only a subset of the classes in each image.

15 papers · 0 benchmarks · Images

UFPR-ALPR

This dataset includes 4,500 fully annotated images (over 30,000 license plate characters) from 150 vehicles in real-world scenarios where both the vehicle and the camera (inside another vehicle) are moving.

15 papers · 1 benchmark · Images

PubTables-1M (PubMed Tables One Million)

The goal of PubTables-1M is to create a large, detailed, high-quality dataset for training and evaluating a wide variety of models for the tasks of table detection, table structure recognition, and functional analysis.

15 papers · 0 benchmarks · Images, Texts

HaGRID (HaGRID - HAnd Gesture Recognition Image Dataset)

We introduce HaGRID (HAnd Gesture Recognition Image Dataset), a large image dataset for hand gesture recognition (HGR) systems. It can be used for image classification or detection tasks. The proposed dataset makes it possible to build HGR systems for use in video conferencing services (Zoom, Skype, Discord, Jazz, etc.), home automation systems, the automotive sector, and more.

15 papers · 0 benchmarks · Images

MMHS150k (Multimodal Hate Speech)

Existing hate speech datasets contain only textual data. We create a new manually annotated multimodal hate speech dataset of 150,000 tweets, each containing text and an image. We call the dataset MMHS150K.

15 papers · 0 benchmarks · Images, Texts

OCTID (Optical Coherence Tomography Image Retinal Database)

OCTID is an open-access Optical Coherence Tomography image database containing over 500 high-resolution retinal OCT images categorized by pathological condition. The image classes are Normal (NO), Macular Hole (MH), Age-related Macular Degeneration (AMD), Central Serous Retinopathy (CSR), and Diabetic Retinopathy (DR).

15 papers · 0 benchmarks · Images

GoodsAD

The GoodsAD dataset contains 6,124 images spanning 6 categories of common supermarket goods, with multiple goods per category. All images are acquired at a high resolution of 3000 × 3000 pixels. The object locations in the images are not aligned; each image contains a single object, usually near the center. Most anomalies occupy only a small fraction of the image pixels. Both image-level and pixel-level annotations are provided.

15 papers · 6 benchmarks · Images
Page 43 of 164