Papers With Code 2

A community resource for machine learning research: papers, code, benchmarks, and state-of-the-art results.


Data sourced from the PWC Archive (CC-BY-SA 4.0). Built by the community, for the community.

Datasets

3,275 machine learning datasets

Filter by Modality

  • Images (3,275)
  • Texts (3,148)
  • Videos (1,019)
  • Audio (486)
  • Medical (395)
  • 3D (383)
  • Time series (298)
  • Graphs (285)
  • Tabular (271)
  • Speech (199)
  • RGB-D (192)
  • Environment (148)
  • Point cloud (135)
  • Biomedical (123)
  • LiDAR (95)
  • RGB Video (87)
  • Tracking (78)
  • Biology (71)
  • Actions (68)
  • 3D meshes (65)
  • Tables (52)
  • Music (48)
  • EEG (45)
  • Hyperspectral images (45)
  • Stereo (44)
  • MRI (39)
  • Physics (32)
  • Interactive (29)
  • Dialog (25)
  • MIDI (22)
  • 6D (17)
  • Replay data (11)
  • Financial (10)
  • Ranking (10)
  • CAD (9)
  • fMRI (7)
  • Parallel (6)
  • Lyrics (2)
  • PSG (2)

3,275 dataset results

WHOOPS!

WHOOPS! is a dataset and benchmark for visual commonsense. It comprises purposefully commonsense-defying images created by designers using publicly available image-generation tools such as Midjourney. The images defy commonsense for a wide range of reasons, including deviations from expected social norms and everyday knowledge.

14 papers · 7 benchmarks · Images, Texts

PGPS9K

PGPS9K is a new large-scale plane geometry problem-solving dataset, labeled with both fine-grained diagram annotations and interpretable solution programs.

14 papers · 1 benchmark · Images, Texts

SynthRAD2023

Medical imaging has become increasingly important in diagnosing and treating oncological patients, particularly in radiotherapy. Recent advances in synthetic computed tomography (sCT) generation have increased interest in public challenges that openly provide data and evaluation metrics for comparing different approaches. SynthRAD2023 is a dataset of brain and pelvis computed tomography (CT) images with rigidly registered cone-beam CT (CBCT) and magnetic resonance imaging (MRI) images, intended to facilitate the development and evaluation of sCT generation for radiotherapy planning.

14 papers · 0 benchmarks · 3D, Images, Medical

Open-Platypus

Open-Platypus is the curated dataset used to fine-tune Platypus, a family of fine-tuned and merged Large Language Models (LLMs) that, at the time of release, placed first on HuggingFace's Open LLM Leaderboard.

14 papers · 0 benchmarks · Images

MMPD (Multi-Domain Mobile Video Physiology Dataset)

The Multi-domain Mobile Video Physiology Dataset (MMPD) comprises 11 hours (1,152K frames) of mobile-phone recordings of 33 subjects. The dataset was designed to capture videos with greater representation across skin tone, body motion, and lighting conditions. MMPD is comprehensive, with eight descriptive labels, and can be used in conjunction with the rPPG-toolbox and PhysBench. It is widely used for rPPG tasks and remote heart rate estimation. To access the dataset, download the data release agreement and request access by email.

14 papers · 0 benchmarks · Images, Medical, Time series, Videos

PolyMNIST

The dataset is based on the original MNIST dataset. Compared to the original, the digits are scaled down by a factor of $0.75$ so that there is more space for random translation. PolyMNIST consists of 5 different modalities.

14 papers · 0 benchmarks · Images
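As an illustration of the construction described above (a minimal NumPy sketch, not the dataset's official generation code; the function name and nearest-neighbour downscaling are assumptions for brevity):

```python
import numpy as np

def scale_and_translate(digit, canvas_size=28, scale=0.75, rng=None):
    """Scale a square digit image down (e.g. 28x28 -> 21x21) and paste it
    at a random offset on a blank canvas, leaving room for translation."""
    rng = rng or np.random.default_rng()
    side = int(round(digit.shape[0] * scale))  # 28 * 0.75 = 21
    # Nearest-neighbour downscaling (illustrative; a real pipeline would
    # typically use proper interpolation).
    idx = (np.arange(side) / scale).astype(int)
    small = digit[np.ix_(idx, idx)]
    canvas = np.zeros((canvas_size, canvas_size), dtype=digit.dtype)
    max_off = canvas_size - side
    dy, dx = rng.integers(0, max_off + 1, size=2)
    canvas[dy:dy + side, dx:dx + side] = small
    return canvas
```

The shrink-then-paste step is what creates room for the random translation: a 21×21 digit can land anywhere in an 8×8 grid of offsets on the 28×28 canvas.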

SIBR (SIBR Dataset for VIE in the Wild)

SIBR is a dataset for visual information extraction (VIE) in natural scenes.

14 papers · 1 benchmark · Images, Texts

Visual Madlibs

Visual Madlibs is a dataset consisting of 360,001 focused natural language descriptions for 10,738 images. This dataset is collected using automatically produced fill-in-the-blank templates designed to gather targeted descriptions about: people and objects, their appearances, activities, and interactions, as well as inferences about the general scene or its broader context.

13 papers · 0 benchmarks · Images, Texts

VQA-CP

The VQA-CP dataset was constructed by reorganizing VQA v2 such that the correlation between the question type and correct answer differs in the training and test splits. For example, the most common answer to questions starting with What sport… is tennis in the training set, but skiing in the test set. A model that guesses an answer primarily from the question will perform poorly.

13 papers · 1 benchmark · Images, Texts
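The prior shift that VQA-CP is built around can be illustrated with a small sketch (a toy example, not the dataset's actual construction code; the function name and the tiny in-line splits are invented for illustration): compute the most frequent answer per question type in each split and flag the types where it changes.

```python
from collections import Counter, defaultdict

def top_answer_per_type(examples):
    """Map each question type to its most frequent answer.
    `examples` is an iterable of (question_type, answer) pairs."""
    counts = defaultdict(Counter)
    for qtype, ans in examples:
        counts[qtype][ans] += 1
    return {q: c.most_common(1)[0][0] for q, c in counts.items()}

# Toy splits mirroring the "What sport..." example: the dominant answer
# flips between train and test, so question-only guessing fails.
train = [("what sport", "tennis")] * 3 + [("what sport", "skiing")]
test = [("what sport", "skiing")] * 3 + [("what sport", "tennis")]

train_top = top_answer_per_type(train)
test_top = top_answer_per_type(test)
shifted = {q for q in train_top if train_top[q] != test_top.get(q)}
```

A model that memorizes `train_top` as its answer table would get the `shifted` question types systematically wrong at test time, which is exactly the bias VQA-CP is designed to expose.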

SEN12MS-CR

SEN12MS-CR is a multi-modal and mono-temporal dataset for cloud removal. It contains observations covering 175 globally distributed Regions of Interest, each recorded in one of the four seasons of 2018. For each region, paired and co-registered synthetic aperture radar (SAR) Sentinel-1 measurements as well as cloudy and cloud-free optical multi-spectral Sentinel-2 observations from the European Space Agency's Copernicus mission are provided. The Sentinel satellites provide public-access data and are among the most prominent satellites in Earth observation.

13 papers · 8 benchmarks · Hyperspectral images, Images

Dayton

The Dayton dataset is a dataset for ground-to-aerial (or aerial-to-ground) image translation, or cross-view image synthesis. It contains images of road views and aerial views of roads. There are 76,048 images in total and the train/test split is 55,000/21,048. The images in the original dataset have 354×354 resolution.

13 papers · 0 benchmarks · Images

Watch-n-Patch

The Watch-n-Patch dataset was created with a focus on modeling human activities comprising multiple actions in a completely unsupervised setting. It was collected with a Microsoft Kinect One sensor for a total length of about 230 minutes, divided into 458 videos. 7 subjects perform daily human activities in 8 offices and 5 kitchens with complex backgrounds. Skeleton data are also provided as ground-truth annotations.

13 papers · 0 benchmarks · Images, Interactive

EgoDexter

The EgoDexter dataset provides both 2D and 3D pose annotations for 4 testing video sequences with 3,190 frames. The videos are recorded with a body-mounted camera from egocentric viewpoints and contain cluttered backgrounds, fast camera motion, and complex interactions with various objects. Fingertip positions were manually annotated for 1,485 of the 3,190 frames.

13 papers · 0 benchmarks · Images, RGB-D, Videos

BraTS 2016

BRATS 2016 is a brain tumor segmentation dataset. It shares the same training set as BRATS 2015, which consists of 220 HGG (high-grade glioma) and 54 LGG (low-grade glioma) cases. Its testing set consists of 191 cases with unknown grades. Image source: https://sites.google.com/site/braintumorsegmentation/home/brats_2016

13 papers · 0 benchmarks · Images, MRI, Medical

Logo-2K+

Logo-2K+: A Large-Scale Logo Dataset for Scalable Logo Classification. The Logo-2K+ dataset contains a diverse range of logo classes from real-world logo images. It contains 167,140 images spanning 10 root categories and 2,341 leaf categories. The 10 root categories are: Food, Clothes, Institution, Accessories, Transportation, Electronic, Necessities, Cosmetic, Leisure, and Medical.

13 papers · 0 benchmarks · Images

TTPLA (Transmission Towers and Power Lines (TTPLA))

TTPLA is a public dataset of aerial images of Transmission Towers (TTs) and Power Lines (PLs). It can be used for detection and segmentation of transmission towers and power lines. It consists of 1,100 images at a resolution of 3,840×2,160 pixels, with 8,987 manually labeled instances of TTs and PLs.

13 papers · 0 benchmarks · Images

UFDD (Unconstrained Face Detection Dataset)

Unconstrained Face Detection Dataset (UFDD) aims to fuel further research in unconstrained face detection.

13 papers · 0 benchmarks · Images

BrixIA (BrixIA Covid-19)

BrixIA Covid-19 is a large dataset of CXR images comprising all images taken for triage and patient monitoring in the sub-intensive and intensive care units of ASST Spedali Civili di Brescia during one month of the pandemic peak (March 4th to April 4th, 2020), and it contains all the variability originating from a real clinical scenario. It includes 4,707 CXR images of COVID-19 subjects, acquired with both CR and DX modalities in AP or PA projection, retrieved from the facility's RIS-PACS system.

13 papers · 0 benchmarks · Images, Medical

PISC (People in Social Context)

The People in Social Context (PISC) dataset focuses on social relationships. It consists of 22,670 images covering 9 types of social relationships, with annotations for the bounding boxes of all people as well as the social relationship between every pair of people in each image. It also contains occupation annotations.

13 papers · 2 benchmarks · Images

ecoset

Ecoset is an ecologically motivated, large-scale image dataset designed for human visual neuroscience, consisting of over 1.5 million images from 565 basic-level categories. Category selection was based on the English nouns that occur most frequently in spoken language (estimated from a set of 51 million words obtained from American television and film subtitles) and on concreteness ratings from human observers. Ecoset's basic-level categories (including the human categories man, woman, and child) describe physical things in the world (rather than abstract concepts) that are important to humans.

13 papers · 0 benchmarks · Images
Page 45 of 164