Papers With Code 2

A community resource for machine learning research: papers, code, benchmarks, and state-of-the-art results.


Data sourced from the PWC Archive (CC-BY-SA 4.0). Built by the community, for the community.

Datasets

3,275 machine learning datasets

Filter by Modality

  • Images: 3,275
  • Texts: 3,148
  • Videos: 1,019
  • Audio: 486
  • Medical: 395
  • 3D: 383
  • Time series: 298
  • Graphs: 285
  • Tabular: 271
  • Speech: 199
  • RGB-D: 192
  • Environment: 148
  • Point cloud: 135
  • Biomedical: 123
  • LiDAR: 95
  • RGB Video: 87
  • Tracking: 78
  • Biology: 71
  • Actions: 68
  • 3d meshes: 65
  • Tables: 52
  • Music: 48
  • EEG: 45
  • Hyperspectral images: 45
  • Stereo: 44
  • MRI: 39
  • Physics: 32
  • Interactive: 29
  • Dialog: 25
  • Midi: 22
  • 6D: 17
  • Replay data: 11
  • Financial: 10
  • Ranking: 10
  • Cad: 9
  • fMRI: 7
  • Parallel: 6
  • Lyrics: 2
  • PSG: 2

3,275 dataset results

Slim

This dataset consists of virtual scenes rendered in MuJoCo, with multiple views each presented in multiple modalities: images and synthetic or natural language descriptions. Each scene consists of two or three objects placed in a square walled room. For each of the 10 camera viewpoints, the authors rendered a 3D view of the scene as seen from that viewpoint, along with a synthetically generated description of the scene.

1 paper | 0 benchmarks | Images, Texts

ARIA (Automated Retinal Image Analysis (ARIA) Data Set)

This data set was collected between 2004 and 2006 in the United Kingdom. Subjects were adult males and females: some were healthy (control group), some had age-related macular degeneration (AMD group), and some were diabetic patients (diabetic group). Unfortunately, no other information about these subjects survives from that time.

1 paper | 0 benchmarks | Images

CLOUD (CLOUD Dataset)

The CLOUD dataset is a set of Anterior Segment Optical Coherence Tomography (AS-OCT) images used for the automatic identification and representation of the cornea-contact lens relationship. The dataset includes 112 AS-OCT images captured from 16 different patients. The images were obtained with an OCT Cirrus 500 scanner model of Carl Zeiss Meditec, using an anterior segment module, from users of scleral contact lenses (SCL).

1 paper | 0 benchmarks | Images, Medical

UDA-CH (Unsupervised Domain Adaptation on Cultural Heritage)

UDA-CH contains 16 objects that cover a variety of artworks found in a museum, such as sculptures, paintings, and books. The dataset was collected inside the cultural site “Galleria Regionale di Palazzo Bellomo” in Siracusa, Italy.

1 paper | 2 benchmarks | Images

Combinatorial 3D Shape Dataset

The combinatorial 3D shape dataset is composed of 406 instances of 14 classes. Each object in the dataset is treated as equivalent to a sequence of primitive placements.

1 paper | 0 benchmarks | 3D, Images

PixelShift200

Advanced pixel shift technology is employed to perform full color sampling of the image. Pixel shift takes four samples of the same image at nearly the same time, physically moving the camera sensor one pixel horizontally or vertically at each sampling so that all color information is captured at each pixel. This ensures that the sampled images follow the distribution of natural images sampled by the camera, and that the full color information (R, Gr, Gb, B channels) is obtained without any interpolation. The collected RGB images are therefore artifact-free, which leads to better training results for demosaicing-related tasks.

1 paper | 0 benchmarks | Images
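
The pixel shift idea described for PixelShift200 can be illustrated with a small simulation. This is only a sketch under stated assumptions, not the dataset authors' capture pipeline: the RGGB filter layout, the order of the four sensor positions, and the function names `channel_map`, `capture`, and `merge` are all hypothetical choices made for the example. It shows that four one-pixel shifts let every pixel see R, Gr, Gb, and B, so the full-color image is recovered without interpolation.

```python
import numpy as np

# Assumed 2x2 Bayer layout (RGGB): 0 = R, 1 = G, 2 = B.
CFA = np.array([[0, 1],
                [1, 2]])

# Assumed sensor positions; each step moves one pixel horizontally or vertically.
SHIFTS = [(0, 0), (0, 1), (1, 1), (1, 0)]


def channel_map(h, w, dy, dx):
    """Color channel seen by each pixel when the sensor is shifted by (dy, dx)."""
    rows = (np.arange(h)[:, None] + dy) % 2
    cols = (np.arange(w)[None, :] + dx) % 2
    return CFA[rows, cols]


def capture(scene, dy, dx):
    """Simulate one single-channel mosaic shot of a full-color scene (H, W, 3)."""
    h, w, _ = scene.shape
    ch = channel_map(h, w, dy, dx)
    return np.take_along_axis(scene, ch[..., None], axis=2)[..., 0]


def merge(shots):
    """Combine the four shifted shots into an RGB image, averaging the two G samples."""
    h, w = shots[0].shape
    total = np.zeros((h, w, 3))
    count = np.zeros((h, w, 3))
    for (dy, dx), shot in zip(SHIFTS, shots):
        ch = channel_map(h, w, dy, dx)
        for c in range(3):
            sel = ch == c
            total[sel, c] += shot[sel]
            count[sel, c] += 1
    return total / count


scene = np.random.default_rng(0).random((4, 6, 3))
shots = [capture(scene, dy, dx) for dy, dx in SHIFTS]
rgb = merge(shots)
```

In this noise-free simulation `rgb` matches `scene`, whereas a single mosaic shot would need demosaicing to estimate the two missing channels at each pixel.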

Satire Dataset

The satire dataset is a new multi-modal dataset of satirical and regular news articles. The satirical news is collected from four websites that explicitly declare themselves to be satire: The Babylon Bee, Clickhole, Waterford Whispers News, and The DailyER. The regular news is collected from six mainstream news websites: Reuters, The Hill, Politico, New York Post, Huffington Post, and Vice News. The headlines and thumbnail images of the latest 1000 articles from each publication are collected, giving a total of 4000 satirical and 6000 regular news articles.

1 paper | 0 benchmarks | Images, Texts

SMOT (Single sequence-Multi Objects Training)

The SMOT dataset (Single sequence-Multi Objects Training) represents a practical scenario of collecting training images of new objects in the real world: a mobile robot with an RGB-D camera collects a sequence of frames while driving around a table to learn multiple objects, and then tries to recognize the objects in different locations.

1 paper | 0 benchmarks | Images

Indoor and outdoor DFD dataset

The dfd_indoor dataset contains 110 images for training and 29 images for testing. The dfd_outdoor dataset contains 34 images for testing; no ground truth is provided for this dataset, as the depth sensor only works on indoor scenes.

1 paper | 0 benchmarks | Images

MLGESTURE DATASET

MlGesture is a dataset for hand gesture recognition tasks, recorded in a car with 5 different sensor types at two different viewpoints. The dataset contains over 1300 hand gesture videos from 24 participants and features 9 different hand gesture symbols. One sensor cluster with five different cameras is mounted in front of the driver in the center of the dashboard. A second sensor cluster is mounted on the ceiling looking straight down.

1 paper | 0 benchmarks | Images, Videos

SYNTHIA-PANO

SYNTHIA-PANO is the panoramic version of the SYNTHIA dataset. Five sequences are included: Seqs02-summer, Seqs02-fall, Seqs04-summer, Seqs04-fall and Seqs05-summer. It provides panoramic images with fine annotations for semantic segmentation.

1 paper | 0 benchmarks | Images

Pesteh-Set

Pesteh-Set is made of two parts. The first part includes 423 images with ground truth. The pistachios are sorted into two classes: open-mouth and closed-mouth. The ground truth is a CSV file containing the bounding boxes of the two classes of pistachios in the images. There are between 1 and 27 pistachios in each image, and 3927 pistachios in total. The second part includes 6 videos with a total length of 167 seconds and 561 moving pistachios.

1 paper | 0 benchmarks | Images

CelebAGaze

CelebAGaze consists of 25283 high-resolution celebrity images collected from CelebA and the Internet: 21832 face images with eyes staring at the camera and 3451 face images with eyes staring somewhere else. All images are cropped to 256 × 256, and the eye mask region is computed with dlib. Specifically, dlib is used to extract 68 facial landmarks, and the mean of 6 points near the eye region is taken as the center point of the mask. The size of the mask is fixed to 30 × 50. For the test set, 300 samples from domain Y and 100 samples from domain X are randomly selected; the remaining images form the training set. Note that this dataset is unpaired and is not labeled with specific eye angle or head pose information.

1 paper | 0 benchmarks | Images
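
The eye mask construction described for CelebAGaze can be sketched roughly as follows. This is a minimal illustration, not the authors' code. It assumes the 68 landmarks are already available as an array of (x, y) points (for example from dlib's 68-point shape predictor), that indices 36 to 41 outline one eye as in the common 68-point layout, and that 30 × 50 means height × width. The function name `eye_mask` and its parameters are hypothetical.

```python
import numpy as np


def eye_mask(landmarks, eye_idx=range(36, 42), image_shape=(256, 256), mask_size=(30, 50)):
    """Build a fixed-size binary mask centered on the mean of the eye landmarks.

    landmarks   : array-like of shape (68, 2) holding (x, y) points
    eye_idx     : indices of the 6 points around one eye (assumed 36..41)
    image_shape : (height, width) of the cropped face image
    mask_size   : (height, width) of the mask (assumed reading of 30 x 50)
    """
    pts = np.asarray(landmarks, dtype=float)[list(eye_idx)]
    cx, cy = pts.mean(axis=0)  # mask center = mean of the 6 eye points

    h, w = mask_size
    top = int(round(cy - h / 2))
    left = int(round(cx - w / 2))
    # Clamp so the mask stays fully inside the image.
    top = min(max(top, 0), image_shape[0] - h)
    left = min(max(left, 0), image_shape[1] - w)

    mask = np.zeros(image_shape, dtype=np.uint8)
    mask[top:top + h, left:left + w] = 1
    return mask, (cx, cy)
```

Passing `eye_idx=range(42, 48)` would select the other eye under the same assumed landmark layout.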

EXPO-HD

The EXPO-HD Dataset is a dataset of Expo whiteboard markers for the purpose of instance segmentation. The dataset contains two subsets (both include instance segmentation labels):

1 paper | 0 benchmarks | Images

Short Text Font Dataset

The proposed dataset includes 1,309 short text instances from Adobe Spark. The dataset is a collection of publicly available sample texts created by different designers. It covers a variety of topics found in posters, flyers, motivational quotes and advertisements.

1 paper | 0 benchmarks | Images, Texts

SPHERE-calorie

The dataset contains RGB and depth images and data from two accelerometers, together with ground truth calorie values from a calorimeter, for calorie expenditure estimation in home environments.

1 paper | 0 benchmarks | Images, RGB-D, Time series

PolarRR

PolarRR is a new dataset with more than 100 types of glass, in which the obtained transmission images are perfectly aligned with the input mixed images.

1 paper | 0 benchmarks | Images

PVDN (Provident Vehicle Detection at Night)

PVDN is a dataset for detecting vehicles at night using the light reflections caused by their headlamps. It contains 59,746 annotated grayscale images from 346 different scenes in a rural environment at night. In these images, all oncoming vehicles, their corresponding light objects (e.g., headlamps), and their respective light reflections (e.g., light reflections on guardrails) are labeled. This information enables research into new methods of detecting oncoming vehicles based on the light reflections they cause, long before the vehicles are directly visible.

1 paper | 0 benchmarks | Images

Doc3DShade

Doc3DShade extends Doc3D with realistic lighting and shading. It follows a similar synthetic rendering procedure using captured document 3D shapes, but the final image generation step combines real shading of different types of paper materials under numerous illumination conditions.

1 paper | 0 benchmarks | Images

MessyTable

MessyTable features a large number of scenes with messy tables captured from multiple camera views. Each scene is highly complex, containing multiple object instances that may be identical, stacked, or occluded by other instances. The key challenge is to associate all instances given the RGB images from all views. Many popular methods and heuristics surprisingly fail at this seemingly simple task. The dataset challenges existing methods in mining subtle appearance differences, reasoning based on context, and fusing appearance with geometric cues to establish an association.

1 paper | 0 benchmarks | Images