TasksSotADatasetsPapersMethodsSubmitAbout
Papers With Code 2

A community resource for machine learning research: papers, code, benchmarks, and state-of-the-art results.

Explore

Notable BenchmarksAll SotADatasetsPapersMethods

Community

Submit ResultsAbout

Data sourced from the PWC Archive (CC-BY-SA 4.0). Built by the community, for the community.

Datasets

3,275 machine learning datasets

Filter by Modality

  • Images3,275
  • Texts3,148
  • Videos1,019
  • Audio486
  • Medical395
  • 3D383
  • Time series298
  • Graphs285
  • Tabular271
  • Speech199
  • RGB-D192
  • Environment148
  • Point cloud135
  • Biomedical123
  • LiDAR95
  • RGB Video87
  • Tracking78
  • Biology71
  • Actions68
  • 3d meshes65
  • Tables52
  • Music48
  • EEG45
  • Hyperspectral images45
  • Stereo44
  • MRI39
  • Physics32
  • Interactive29
  • Dialog25
  • Midi22
  • 6D17
  • Replay data11
  • Financial10
  • Ranking10
  • Cad9
  • fMRI7
  • Parallel6
  • Lyrics2
  • PSG2
Clear filter

3,275 dataset results

SECOND (SEmantic Change detectiON Dataset)

SECOND is a well-annotated semantic change detection dataset. To ensure data diversity, we firstly collect 4662 pairs of aerial images from several platforms and sensors. These pairs of images are distributed over the cities such as Hangzhou, Chengdu, and Shanghai. Each image has size 512 x 512 and is annotated at the pixel level. The annotation of SECOND is carried out by an expert group of earth vision applications, which guarantees high label accuracy. For the change category in the SECOND dataset, we focus on 6 main land-cover classes, i.e. , non-vegetated ground surface, tree, low vegetation, water, buildings and playgrounds , that are frequently involved in natural and man-made geographical changes. It is worth noticing that, in the new dataset, non-vegetated ground surface ( n.v.g. surface for short) mainly corresponds to impervious surface and bare land. In summary, these 6 selected land-cover categories result in 30 common change categories (including non-change ). Through the

17 papers3 benchmarksImages

ViViD++ (Vision for Visibility Dataset)

A dataset capturing diverse visual data formats that target varying luminance conditions, and was recorded from alternative vision sensors, by handheld or mounted on a car, repeatedly in the same space but in different conditions.

17 papers0 benchmarks3D, Images, LiDAR, RGB Video, RGB-D

Memento10k

Memorability dataset with 10000 3-second videos. Each video has upwards of 90 human annotations, and the split-half consistency of this dataset is 0.73 (best in class for video memorabilty datasets).

17 papers0 benchmarksImages, RGB Video, Videos

Laval Indoor HDR Dataset

This dataset contains 2100+ high resolution indoor panoramas, captured using a Canon 5D Mark III and a robotic panoramic tripod head. Each capture was multi-exposed (22 f-stops) and is fully HDR, without any saturation. Panoramas were stitched from 6 captures (60 degrees azimuth increment) and were captured in a wide variety of indoor environments.

17 papers0 benchmarksImages

QUILT-1M

Recent accelerations in multi-modal applications have been made possible with the plethora of image and text data available online. However, the scarcity of similar data in the medical field, specifically in histopathology, has halted similar progress. To enable similar representation learning for histopathology, we turn to YouTube, an untapped resource of videos, offering 1,087 hours of valuable educational histopathology videos from expert clinicians. From YouTube, we curate Quilt: a large-scale vision-language dataset consisting of 768,826 image and text pairs. Quilt was automatically curated using a mixture of models, including large language models), handcrafted algorithms, human knowledge databases, and automatic speech recognition. In comparison, the most comprehensive datasets curated for histopathology amass only around 200K samples. We combine Quilt with datasets, from other sources, including Twitter, research papers, and the internet in general, to create an even larger dat

17 papers0 benchmarksImages, Medical, Texts

NINCO (No ImageNet Class Objects)

The NINCO (No ImageNet Class Objects) dataset is introduced in the ICML 2023 paper In or Out? Fixing ImageNet Out-of-Distribution Detection Evaluation. The images in this dataset are free from objects that belong to any of the 1000 classes of ImageNet-1K (ILSVRC2012), which makes NINCO suitable for evaluating out-of-distribution detection on ImageNet-1K .

17 papers0 benchmarksImages

COCO-Noisy (Microsoft Common Objects in Context with 20% of Noisy Correspondence and 1K test data)

This dataset is based on MS COCO that have 20% of data randomly shuffled to simulate noisy correspondence.

17 papers21 benchmarksImages, Texts

MM-Vet v2

MM-Vet v2: A Challenging Benchmark to Evaluate Large Multimodal Models for Integrated Capabilities

17 papers4 benchmarksImages, Texts

NYU Hand

The NYU Hand pose dataset contains 8252 test-set and 72757 training-set frames of captured RGBD data with ground-truth hand-pose information. For each frame, the RGBD data from 3 Kinects is provided: a frontal view and 2 side views. The training set contains samples from a single user only (Jonathan Tompson), while the test set contains samples from two users (Murphy Stein and Jonathan Tompson). A synthetic re-creation (rendering) of the hand pose is also provided for each view.

16 papers0 benchmarksImages, Interactive

AgeDB

AgeDB contains 16, 488 images of various famous people, such as actors/actresses, writers, scientists, politicians, etc. Every image is annotated with respect to the identity, age and gender attribute. There exist a total of 568 distinct subjects. The average number of images per subject is 29. The minimum and maximum age is 1 and 101, respectively. The average age range for each subject is 50.3 years.

16 papers14 benchmarksImages

MinneApple

MinneApple is a benchmark dataset for apple detection and segmentation. The fruits are labelled using polygonal masks for each object instance to aid in precise object detection, localization, and segmentation. Additionally, the dataset also contains data for patch-based counting of clustered fruits. The dataset contains over 41, 000 annotated object instances in 1000 images.

16 papers0 benchmarksImages

SemArt

SemArt is a multi-modal dataset for semantic art understanding. SemArt is a collection of fine-art painting images in which each image is associated to a number of attributes and a textual artistic comment, such as those that appear in art catalogues or museum collections. It contains 21,384 samples that provides artistic comments along with fine-art paintings and their attributes for studying semantic art understanding.

16 papers0 benchmarksImages, Texts

SketchGraphs

SketchGraphs is a dataset of 15 million sketches extracted from real-world CAD models intended to facilitate research in both ML-aided design and geometric program induction. Each sketch is represented as a geometric constraint graph where edges denote designer-imposed geometric relationships between primitives, the nodes of the graph.

16 papers0 benchmarksGraphs, Images

SMHD (Self-reported Mental Health Diagnoses)

A novel large dataset of social media posts from users with one or multiple mental health conditions along with matched control users.

16 papers0 benchmarksImages, Texts

TextSeg

TextSeg is a large-scale fine-annotated and multi-purpose text detection and segmentation dataset, collecting scene and design text with six types of annotations: word- and character-wise bounding polygons, masks and transcriptions.

16 papers1 benchmarksImages

VehicleX

VehicleX is a large-scale synthetic dataset. Created in Unity, it contains 1,362 vehicles of various 3D models with fully editable attributes.

16 papers0 benchmarks3D, Images

CryoNuSeg

CryoNuSeg is a fully annotated FS-derived cryosectioned and H&E-stained nuclei instance segmentation dataset. The dataset contains images from 10 human organs that were not exploited in other publicly available datasets, and is provided with three manual mark-ups to allow measuring intra-observer and inter-observer variability.

16 papers0 benchmarksImages

DADA-2000

DADA-2000 is a large-scale benchmark with 2000 video sequences (named as DADA-2000) is contributed with laborious annotation for driver attention (fixation, saccade, focusing time), accident objects/intervals, as well as the accident categories, and superior performance to state-of-the-arts are provided by thorough evaluations.

16 papers0 benchmarksImages

BG-20k (Background Dataset - 20k)

BG-20k contains 20,000 high-resolution background images excluded salient objects, which can be used to help generate high quality synthetic data.

16 papers0 benchmarksImages

IDRiD (Indian Diabetic Retinopathy Image Dataset)

Indian Diabetic Retinopathy Image Dataset (IDRiD) dataset consists of typical diabetic retinopathy lesions and normal retinal structures annotated at a pixel level. This dataset also provides information on the disease severity of diabetic retinopathy and diabetic macular edema for each image. This dataset is perfect for the development and evaluation of image analysis algorithms for early detection of diabetic retinopathy.

16 papers6 benchmarksBiomedical, Images, Medical
PreviousPage 41 of 164Next