Papers With Code 2

A community resource for machine learning research: papers, code, benchmarks, and state-of-the-art results.

Data sourced from the PWC Archive (CC-BY-SA 4.0). Built by the community, for the community.

Datasets

3,275 machine learning datasets

Filter by Modality

  • Images (3,275)
  • Texts (3,148)
  • Videos (1,019)
  • Audio (486)
  • Medical (395)
  • 3D (383)
  • Time series (298)
  • Graphs (285)
  • Tabular (271)
  • Speech (199)
  • RGB-D (192)
  • Environment (148)
  • Point cloud (135)
  • Biomedical (123)
  • LiDAR (95)
  • RGB Video (87)
  • Tracking (78)
  • Biology (71)
  • Actions (68)
  • 3D meshes (65)
  • Tables (52)
  • Music (48)
  • EEG (45)
  • Hyperspectral images (45)
  • Stereo (44)
  • MRI (39)
  • Physics (32)
  • Interactive (29)
  • Dialog (25)
  • MIDI (22)
  • 6D (17)
  • Replay data (11)
  • Financial (10)
  • Ranking (10)
  • CAD (9)
  • fMRI (7)
  • Parallel (6)
  • Lyrics (2)
  • PSG (2)

3,275 dataset results

DLO Instance Segmentation dataset (DLO Instance Segmentation dataset generated by Blender)

Contains ~60,000 HD images of Deformable Linear Objects (DLOs) generated using Blender. The dataset features a variety of industrial-looking backgrounds and includes instance segmentation masks. Its main task is DLO instance segmentation.

1 paper | 0 benchmarks | Images

SOMPT22 (Surveillance Oriented Multi-Pedestrian Tracking Dataset (SOMPT22))

SOMPT22 is a multi-object tracking (MOT) benchmark focused on surveillance-style pedestrian tracking.

1 paper | 0 benchmarks | Images, Tracking, Videos

BlurRF-Synth

The first large-scale dataset for training and evaluating novel-view synthesis from blurred images.

1 paper | 0 benchmarks | 3D, Images

BlurRF-Real

A real-world low-light camera motion blur dataset for evaluating deblurring radiance fields methods.

1 paper | 0 benchmarks | 3D, Images

Wider-Test-200

The Wider-Test-200 dataset was introduced in the paper "Towards Unsupervised Blind Face Restoration using Diffusion Prior".

1 paper | 0 benchmarks | Images

PsOCR (Pashto OCR Dataset)

PsOCR is a large-scale synthetic dataset for Optical Character Recognition in the low-resource Pashto language.

1 paper | 0 benchmarks | Images, Tabular, Texts

MalVis (MalVis: A Large-Scale Android Malware Visualization Dataset and Framework for Improved Classification)

1 paper | 0 benchmarks | Images

synRailObs

synRailObs contains the following categories: Person, Rocks, Vehicles, Moto-cars, and Animals.

1 paper | 0 benchmarks | Images

MVTec-AC

MVTec-AC is a curated refinement of the widely-used MVTec-AD dataset, specifically designed for anomaly classification—distinguishing between different types of anomalies rather than merely detecting if an image is anomalous. While MVTec-AD focuses on binary detection and suffers from mislabeled or ambiguous samples, MVTec-AC introduces manually corrected labels and reorganized anomaly categories to enable robust multi-class evaluation. Key improvements include the correction of 36 misclassified samples, merging of 4 overlapping classes, removal of 4 ambiguous ‘combined’ classes, and exclusion of the toothbrush category, which contains only a single trivial anomaly type. These changes support consistent, fine-grained assessment of classification models in industrial visual inspection contexts.

1 paper | 3 benchmarks | Images

VisA-AC

VisA-AC is a refined benchmark based on the VisA dataset, tailored for the task of anomaly classification—distinguishing between different types of anomalies rather than simply detecting whether an image is anomalous. While the original VisA provides anomaly type information in an Excel file, it includes numerous under-sampled and ambiguous classes. VisA-AC addresses these issues by removing classes with fewer than 10 samples, merging visually similar categories, and manually correcting mislabeled samples. Additionally, anomaly classes in VisA-AC are organized into separate folders—following the structure of MVTec-AC—for easier integration and usage. The resulting dataset ensures both statistical robustness and semantic clarity, supporting rigorous evaluation of multi-class anomaly classification methods in real-world industrial settings.

1 paper | 3 benchmarks | Images
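
The VisA-AC description mentions removing anomaly classes with fewer than 10 samples. A minimal sketch of that kind of filtering is shown below. It is not the authors' code: the `(path, label)` sample layout, the function names, and the example file names are assumptions made for illustration only.

```python
from collections import Counter


def classes_with_enough_samples(labels, min_samples=10):
    """Return the set of class names that occur at least `min_samples` times."""
    counts = Counter(labels)
    return {cls for cls, n in counts.items() if n >= min_samples}


def filter_samples(samples, min_samples=10):
    """Keep only (path, label) pairs whose label meets the sample threshold.

    `samples` is a hypothetical list of (image_path, anomaly_class) tuples.
    """
    kept = classes_with_enough_samples(
        [label for _, label in samples], min_samples
    )
    return [(path, label) for path, label in samples if label in kept]


# Hypothetical example: one class with 12 samples, one with only 3.
example = [(f"scratch_{i}.png", "scratch") for i in range(12)]
example += [(f"dent_{i}.png", "dent") for i in range(3)]
filtered = filter_samples(example)
```

With the hypothetical data above, only the 12 "scratch" samples remain, since "dent" falls below the threshold of 10.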

OpenS2V-5M

We create the first open-source large-scale Subject-to-Video (S2V) generation dataset, OpenS2V-5M, which consists of five million high-quality 720P subject-text-video triples. To ensure subject-information diversity in the dataset, we (1) segment subjects and build pairing information via cross-video associations and (2) prompt GPT-Image on raw frames to synthesize multi-view representations. The dataset supports both Subject-to-Video and Text-to-Video generation tasks.

1 paper | 0 benchmarks | Images, Texts, Videos

ITDD (Industrial Textile Defect Detection)

The Industrial Textile Defect Detection (ITDD) dataset includes 1,885 industrial textile images in 4 categories: cotton fabric, dyed fabric, hemp fabric, and plaid fabric. The images were collected from the industrial production sites of WEIQIAO Textile. ITDD is an upgraded version of WFDD that reorganizes three original classes and adds one new class.

1 paper | 2 benchmarks | Images

WebGen-Bench

WebGen-Bench was created to benchmark the ability of LLM-based agents to generate websites from scratch. The dataset is introduced in "WebGen-Bench: Evaluating LLMs on Generating Interactive and Functional Websites from Scratch". It contains 101 instructions and 647 test cases, along with a training set of 6,667 instructions named WebGen-Instruct.

1 paper | 0 benchmarks | Images, Texts

Znaki

The first and only open dataset for Russian fingerspelling, containing 1,593 annotated phrases and over 37 thousand HD+ videos.

1 paper | 1 benchmark | Images, Texts, Videos

VME & CDSI (Vehicles in the Middle East (VME) & Car Detection in Satellite Imagery (CDSI) datasets)

The Vehicles in the Middle East (VME) dataset is designed explicitly for vehicle detection in high-resolution satellite images from Middle Eastern countries. Sourced from Maxar, the VME dataset spans 54 cities across 12 countries, comprising over 4,000 image tiles and more than 100,000 vehicles, annotated using both manual and semi-automated methods. We also introduce the largest benchmark dataset for Car Detection in Satellite Imagery (CDSI), combining images from multiple sources to enhance global car detection.

1 paper | 5 benchmarks | Images

Matador (Matador: A Material Image Dataset)

The Matador dataset is a material image dataset with hierarchical labels derived from a new taxonomy. For each sample of a material, we collect a local appearance image, a local surface structure LiDAR scan, and a global context image, and record any camera motion that takes place during the capture sequence. The dataset is intended to grow over time. To date, Matador contains 57 different material categories and a total of ~7,200 images, averaging 126 samples per category to capture intraclass variance.

1 paper | 0 benchmarks | Images, LiDAR, RGB-D

ILSP Greek Evaluation Suite

A collection of test sets for evaluating base and chat LLMs (incl. VLMs) on Greek generation and understanding capabilities.

1 paper | 0 benchmarks | Images, Texts

U2-Bench

U2-BENCH is the first large-scale benchmark for evaluating Large Vision-Language Models (LVLMs) on ultrasound imaging understanding. It provides a diverse, multi-task dataset curated from 40 licensed sources, covering 15 anatomical regions and 8 clinically inspired tasks across classification, detection, regression, and text generation.

1 paper | 0 benchmarks | Images, Medical

CUTS

This is the dataset released along with the publication:

1 paper | 0 benchmarks | Images

Upper body thermal images and associated clinical data from a pilot cohort study of COVID-19

The prospective upper body thermal images SARS-CoV2 association study was designed to test the hypothesis that thermal videos may aid in the early diagnosis of COVID-19. The study recorded a set of measurements from 252 participants covering PCR results, demographics, vital signs, participant activities, medications, and respiratory symptoms, along with a thermal video session in which the volunteers performed a simple breath-hold in four different positions. The acquired data may be used to test clinical association questions regarding temperature patterns, demographics, and vital signs. It could also be valuable for developing new computer algorithms that extract useful scientific information from thermal videos.

1 paper | 0 benchmarks | Images, Tabular