Papers With Code 2

A community resource for machine learning research: papers, code, benchmarks, and state-of-the-art results.

Data sourced from the PWC Archive (CC-BY-SA 4.0). Built by the community, for the community.

Datasets

1,019 machine learning datasets

Filter by Modality

  • Images (3,275)
  • Texts (3,148)
  • Videos (1,019)
  • Audio (486)
  • Medical (395)
  • 3D (383)
  • Time series (298)
  • Graphs (285)
  • Tabular (271)
  • Speech (199)
  • RGB-D (192)
  • Environment (148)
  • Point cloud (135)
  • Biomedical (123)
  • LiDAR (95)
  • RGB Video (87)
  • Tracking (78)
  • Biology (71)
  • Actions (68)
  • 3D meshes (65)
  • Tables (52)
  • Music (48)
  • EEG (45)
  • Hyperspectral images (45)
  • Stereo (44)
  • MRI (39)
  • Physics (32)
  • Interactive (29)
  • Dialog (25)
  • MIDI (22)
  • 6D (17)
  • Replay data (11)
  • Financial (10)
  • Ranking (10)
  • CAD (9)
  • fMRI (7)
  • Parallel (6)
  • Lyrics (2)
  • PSG (2)

1,019 dataset results

Metaphorics

Metaphorics is a newly introduced non-contextual skeleton action dataset. All datasets introduced so far in skeleton-based human action recognition have categories based only on verb-based actions.

1 paper · 0 benchmarks · Videos

MIDAS-KIKI

Consists of manually annotated dangerous and non-dangerous Kiki challenge videos.

1 paper · 0 benchmarks · Videos

Mouse Reach

A large, annotated video dataset of mice performing a sequence of actions. The dataset was collected and labeled by experts for the purpose of neuroscience research.

1 paper · 0 benchmarks · Videos

MVS1K

Contains about 1,000 videos from 10 queries, together with their video tags, manual annotations, and associated web images.

1 paper · 0 benchmarks · Videos

SFU-Store-Nav

A dataset collected in a set of experiments that involves human participants and a robot.

1 paper · 0 benchmarks · Videos

STAIR Actions Captions

A large-scale Japanese video caption dataset consisting of 79,822 videos and 399,233 captions. Each caption in the dataset describes a video in the form of "who does what and where."

1 paper · 0 benchmarks · Texts, Videos

Surveillance Camera Fight Dataset

The dataset is collected from YouTube videos that contain fight instances, together with some non-fight sequences from regular surveillance camera videos.

  • There are 300 videos in total: 150 fight + 150 non-fight
  • Videos are 2 seconds long
  • Only the fight-related parts are included in the samples

1 paper · 0 benchmarks · Videos

TREK-100

The dataset is composed of 100 video sequences densely annotated with 60K bounding boxes, 17 sequence attributes, 13 action verb attributes and 29 target object attributes.

1 paper · 0 benchmarks · Videos

UG^2

Contains three difficult real-world scenarios: uncontrolled videos taken by UAVs and manned gliders, as well as controlled videos taken on the ground. Over 160,000 annotated frames for hundreds of ImageNet classes are available; these are used for baseline experiments that assess the impact of known and unknown image artifacts and other conditions on common deep learning-based object classification approaches.

1 paper · 0 benchmarks · Videos

Simulated Flying Shapes

The dataset consists of 90,000 grayscale videos showing two objects of equal shape and size, in which one object approaches the other. The speed of the approaching object is modelled by a proportional-derivative controller. Three different shapes (rectangle, triangle, and circle) are provided. The initial configuration of the objects, such as position and color, was randomly sampled. Unlike the Moving MNIST dataset, the samples comprise a goal-oriented task: one object has to fully cover the other rather than move randomly, making the dataset more suitable for testing the prediction capabilities of an ML model. For instance, it can serve as a toy dataset to investigate the capacity and output behavior of a deep neural network before testing it on real-world data.

1 paper · 0 benchmarks · Videos

i3-video (is-it-instructional-video)

The i3-video dataset contains "is-it-instructional" annotations for 6.4k videos from YouTube-8M. A video is considered instructional if it focuses on real-world human actions accompanied by procedural language that explains what is happening on screen in reasonable detail.

1 paper · 0 benchmarks · Videos

IISc VINE (Indian Institute of Science VIdeo Naturalness Evaluation)

Indian Institute of Science VIdeo Naturalness Evaluation (IISc VINE) is a database consisting of 300 videos, obtained by applying different prediction models to different datasets, together with accompanying human opinion scores.

1 paper · 0 benchmarks · Videos

CASR (Cyclist Arm Signal Recognition)

CASR is a dataset for cyclist arm signal recognition in videos. It contains 219 annotated arm signal actions in videos of approximately 10 seconds each, with one or two actions per video.

1 paper · 0 benchmarks · Videos

Driving Event Camera Dataset

This dataset consists of a number of sequences that were recorded with a VGA (640x480) event camera (Samsung DVS Gen3) and a conventional RGB camera (Huawei P20 Pro) placed on the windshield of a car driving through Zurich.

1 paper · 0 benchmarks · Videos

MIRACL-VC1

MIRACL-VC1 is a lip-reading dataset including both depth and color images. It can be used in diverse research fields such as visual speech recognition, face detection, and biometrics. Fifteen speakers (five men and ten women), positioned in the frustum of an MS Kinect sensor, each uttered a set of ten words and ten phrases ten times. Each instance of the dataset consists of a synchronized sequence of color and depth images (both 640x480 pixels). The MIRACL-VC1 dataset contains a total of 3,000 instances.

1 paper · 1 benchmark · Videos

Sintel 4D LFV (Sintel 4D Light Field Video Dataset)

A medium-scale synthetic 4D Light Field video dataset for depth (disparity) estimation, derived from the open-source movie Sintel. The dataset consists of 24 synthetic 4D LFVs with 1,024x436 pixels, 9x9 views, and 20–50 frames, and has ground-truth disparity values, so it can be used for training deep learning-based methods. Each scene was rendered with a clean pass after modifying the production file of Sintel with reference to the MPI Sintel dataset.

1 paper · 0 benchmarks · Videos

Bee4Exp Honeybee Detection

A dataset for flying honeybee detection introduced in "A Method for Detection of Small Moving Objects in UAV Videos".

1 paper · 6 benchmarks · Environment, Videos

SARA motion (Synthetic Actors and Real Actions)

SARA motion (Synthetic Actors and Real Actions) is a 3D motion dataset for training a model to produce motion embeddings suitable for reasoning about motion similarity.

1 paper · 0 benchmarks · 3D, Videos

NTU RGB+D 120 motion similarity

Motion similarity annotations for the NTU RGB+D 120 dataset, used to evaluate motion similarity in real-world settings.

1 paper · 0 benchmarks · Images, Videos

POTUS Corpus

The POTUS Corpus is a database of weekly presidential addresses for the study of stance in politics and virtual agents.

1 paper · 0 benchmarks · Audio, Videos
Page 38 of 51