Papers With Code 2

A community resource for machine learning research: papers, code, benchmarks, and state-of-the-art results.


Data sourced from the PWC Archive (CC-BY-SA 4.0). Built by the community, for the community.

Datasets

19,997 machine learning datasets

Filter by Modality

  • Images: 3,275
  • Texts: 3,148
  • Videos: 1,019
  • Audio: 486
  • Medical: 395
  • 3D: 383
  • Time series: 298
  • Graphs: 285
  • Tabular: 271
  • Speech: 199
  • RGB-D: 192
  • Environment: 148
  • Point cloud: 135
  • Biomedical: 123
  • LiDAR: 95
  • RGB video: 87
  • Tracking: 78
  • Biology: 71
  • Actions: 68
  • 3D meshes: 65
  • Tables: 52
  • Music: 48
  • EEG: 45
  • Hyperspectral images: 45
  • Stereo: 44
  • MRI: 39
  • Physics: 32
  • Interactive: 29
  • Dialog: 25
  • MIDI: 22
  • 6D: 17
  • Replay data: 11
  • Financial: 10
  • Ranking: 10
  • CAD: 9
  • fMRI: 7
  • Parallel: 6
  • Lyrics: 2
  • PSG: 2

19,997 dataset results

COVIDx (COVIDx CXR-2)

An open-access benchmark dataset comprising 13,975 CXR images across 13,870 patient cases, with the largest number of publicly available COVID-19-positive cases to the best of the authors' knowledge.

96 papers · 2 benchmarks

RAVEN

RAVEN consists of 1,120,000 images and 70,000 RPM (Raven's Progressive Matrices) problems, equally distributed in 7 distinct figure configurations.

96 papers · 0 benchmarks · Images, Texts

UAVDT (Unmanned Aerial Vehicle Benchmark Object Detection and Tracking)

UAVDT is a large-scale, challenging UAV detection and tracking benchmark (about 80,000 representative frames from 10 hours of raw video) for three fundamental tasks: object detection (DET), single object tracking (SOT), and multiple object tracking (MOT).

96 papers · 9 benchmarks · Videos

HM3D (Habitat-Matterport 3D)

Habitat-Matterport 3D (HM3D) is a large-scale dataset of 1,000 building-scale 3D reconstructions from a diverse set of real-world locations. Each scene in the dataset consists of a textured 3D mesh reconstruction of interiors such as multi-floor residences, stores, and other private indoor spaces.

96 papers · 0 benchmarks · 3D

Kinetics-700

Kinetics-700 is a video dataset of 650,000 clips that covers 700 human action classes. The videos include human-object interactions such as playing instruments, as well as human-human interactions such as shaking hands and hugging. Each action class has at least 700 video clips. Each clip is annotated with an action class and lasts approximately 10 seconds.

95 papers · 7 benchmarks · Videos

Math23K (Math23K for Math Word Problem Solving)

Math23K is a dataset created for math word problem solving, containing 23,162 Chinese problems crawled from the Internet. The dataset was originally introduced in the paper Deep Neural Solver for Math Word Problems. The original files are split into train/test, while other research efforts (https://github.com/2003pro/Graph2Tree) perform a train/dev/test split.

95 papers · 12 benchmarks · Texts

HIDE

HIDE consists of 8,422 pairs of blurry and sharp images with 65,784 densely annotated foreground (FG) human bounding boxes.

95 papers · 11 benchmarks · Images

T-LESS

T-LESS is a dataset for estimating the 6D pose, i.e. translation and rotation, of texture-less rigid objects. The dataset features thirty industry-relevant objects with no significant texture and no discriminative color or reflectance properties. The objects exhibit symmetries and mutual similarities in shape and/or size. Compared to other datasets, a unique property is that some of the objects are parts of others. The dataset includes training and test images that were captured with three synchronized sensors: a structured-light RGB-D sensor, a time-of-flight RGB-D sensor, and a high-resolution RGB camera. There are approximately 39K training and 10K test images from each sensor. Additionally, two types of 3D models are provided for each object: a manually created CAD model and a semi-automatically reconstructed one. Training images depict individual objects against a black background. Test images originate from twenty test scenes of varying complexity, ranging from simple scenes with isolated objects to challenging ones with multiple object instances and a high degree of clutter and occlusion.

94 papers · 6 benchmarks · 3D, Images, RGB-D

Sleep-EDF (Sleep-EDF Expanded)

The Sleep-EDF Expanded database contains 197 whole-night polysomnographic sleep recordings, containing EEG, EOG, chin EMG, and event markers. Some records also contain respiration and body temperature. Corresponding hypnograms (sleep stage annotations) were manually scored by well-trained technicians according to the Rechtschaffen and Kales manual and are also available.

94 papers · 9 benchmarks · Audio, EEG, Medical

VGG Face

The VGG Face dataset is a face identity recognition dataset consisting of 2,622 identities and over 2.6 million images.

94 papers · 0 benchmarks · Images

DexYCB

DexYCB is a dataset for capturing hand grasping of objects. It can be used for three relevant tasks: 2D object and keypoint detection, 6D object pose estimation, and 3D hand pose estimation.

94 papers · 59 benchmarks · Videos

NC4K

Before NC4K, COD10K was the only large camouflaged object testing dataset; other testing datasets contain fewer than 300 images. NC4K is another camouflaged object testing dataset, comprising 4,121 images downloaded from the Internet. It can be used to evaluate the generalization ability of existing models.

94 papers · 21 benchmarks · Images

JAFFE (Japanese Female Facial Expression)

The JAFFE dataset consists of 213 images of different facial expressions from 10 different Japanese female subjects. Each subject was asked to do 7 facial expressions (6 basic facial expressions and neutral) and the images were annotated with average semantic ratings on each facial expression by 60 annotators.

93 papers · 9 benchmarks · Images

SUN360 (Scene UNderstanding 360° panorama)

The SUN360 panorama database aims to provide academic researchers in computer vision, computer graphics, computational photography, cognition and neuroscience, human perception, and machine learning and data mining with a comprehensive collection of annotated panoramas covering a full 360x180-degree view of a large variety of environmental scenes, places, and the objects within them. To build the core of the dataset, the authors downloaded a large number of high-resolution panorama images from the Internet and grouped them into different place categories. They then designed a WebGL annotation tool for annotating the polygons and cuboids of objects in each scene.

93 papers · 3 benchmarks · Images

CUHK-PEDES

The CUHK-PEDES dataset is a caption-annotated pedestrian dataset. It contains 40,206 images of 13,003 persons. Images are collected from five existing person re-identification datasets (CUHK03, Market-1501, SSM, VIPeR, and CUHK01), and each image is annotated with two text descriptions by crowd-sourcing workers. The sentences incorporate rich details about person appearance, actions, and poses.

93 papers · 15 benchmarks · Images, Texts

Aachen Day-Night

Aachen Day-Night is a dataset designed for benchmarking 6DOF outdoor visual localization in changing conditions. It focuses on localizing high-quality night-time images against a day-time 3D model. There are 14,607 images with changing conditions of weather, season and day-night cycles.

93 papers · 0 benchmarks · 3D, Images

xView

xView is one of the largest publicly available datasets of overhead imagery. It contains images from complex scenes around the world, annotated using bounding boxes. It contains over 1M object instances from 60 different classes.

93 papers · 5 benchmarks · Images

VOC 2012 (The PASCAL Visual Object Classes Challenge 2012)

The PASCAL Visual Object Classes Challenge 2012 (VOC 2012) dataset. See the code implementation of the paper 'Tell Me Where To Look: Guided Attention Inference Networks' for a detailed use case.

93 papers · 0 benchmarks · Images

UCSD Ped2 (UCSD Anomaly Detection Dataset)

The UCSD Anomaly Detection Dataset was acquired with a stationary camera mounted at an elevation, overlooking pedestrian walkways. The crowd density in the walkways was variable, ranging from sparse to very crowded. In the normal setting, the video contains only pedestrians. Abnormal events are due to either the circulation of non-pedestrian entities in the walkways or anomalous pedestrian motion patterns. Commonly occurring anomalies include bikers, skaters, small carts, and people walking across a walkway or in the grass that surrounds it. A few instances of people in wheelchairs were also recorded. All abnormalities occur naturally, i.e. they were not staged for the purposes of assembling the dataset. The data was split into two subsets, each corresponding to a different scene. The video footage recorded from each scene was split into clips of around 200 frames.

92 papers · 5 benchmarks · Images, Videos

Hopkins155

The Hopkins 155 dataset consists of 156 video sequences of two or three motions. Each motion in a video sequence corresponds to a low-dimensional subspace. There are 39 to 550 data vectors drawn from two or three motions for each video sequence.

92 papers · 1 benchmark · Videos

Page 36 of 1000