Papers With Code 2

A community resource for machine learning research: papers, code, benchmarks, and state-of-the-art results.


Data sourced from the PWC Archive (CC-BY-SA 4.0). Built by the community, for the community.

Datasets

87 machine learning datasets (filtered by the RGB Video modality)

Filter by Modality

  • Images (3,275)
  • Texts (3,148)
  • Videos (1,019)
  • Audio (486)
  • Medical (395)
  • 3D (383)
  • Time series (298)
  • Graphs (285)
  • Tabular (271)
  • Speech (199)
  • RGB-D (192)
  • Environment (148)
  • Point cloud (135)
  • Biomedical (123)
  • LiDAR (95)
  • RGB Video (87)
  • Tracking (78)
  • Biology (71)
  • Actions (68)
  • 3d meshes (65)
  • Tables (52)
  • Music (48)
  • EEG (45)
  • Hyperspectral images (45)
  • Stereo (44)
  • MRI (39)
  • Physics (32)
  • Interactive (29)
  • Dialog (25)
  • Midi (22)
  • 6D (17)
  • Replay data (11)
  • Financial (10)
  • Ranking (10)
  • Cad (9)
  • fMRI (7)
  • Parallel (6)
  • Lyrics (2)
  • PSG (2)

87 dataset results

CholecT40 (Cholecystectomy Action Triplet)

CholecT40 is the first endoscopic dataset introduced to enable research on fine-grained action recognition in laparoscopic surgery.

2 papers · 2 benchmarks · Images, RGB Video, Videos

RLU (RL Unplugged)

RL Unplugged is a suite of benchmarks for offline reinforcement learning. The suite is designed with ease of use in mind: the datasets are provided with a unified API that makes it easy for practitioners to work with all data in the suite once a general pipeline has been established. This dataset accompanies the paper RL Unplugged: Benchmarks for Offline Reinforcement Learning.

2 papers · 0 benchmarks · Actions, Environment, Images, Physics, RGB Video, Replay data

TI1K Dataset (Thumb Index 1000 Hand & Fingertip Detection Dataset)

Thumb Index 1000 (TI1K) is a dataset of 1,000 hand images annotated with the hand bounding box and the thumb and index fingertip positions. The dataset captures natural movement of the thumb and index fingers, making it suitable for mixed reality (MR) applications.

2 papers · 0 benchmarks · Actions, Environment, Images, RGB Video

SuperCaustics

SuperCaustics is a simulation tool made in Unreal Engine for generating massive computer vision datasets that include transparent objects.

2 papers · 0 benchmarks · Environment, Images, Interactive, LiDAR, Physics, RGB Video, RGB-D

PHANTOM (Physical Anomalous Trajectory or Motion)

To evaluate the presented approaches, we created the Physical Anomalous Trajectory or Motion (PHANTOM) dataset, consisting of six classes featuring everyday objects or physical setups and showing nine different kinds of anomalies. We designed the classes to evaluate detection of various modes of video abnormality that are generally excluded from video anomaly detection (AD) settings.

2 papers · 6 benchmarks · RGB Video

Dynamic OLAT Dataset (ShanghaiTech MARS Dynamic OLAT Dataset)

To provide ground-truth supervision for video consistency modeling, we built a high-quality dynamic OLAT dataset. Our capture system consists of a light stage with 114 LED light sources and a Phantom Flex4K-GS camera (a stationary, global-shutter, 4K ultra-high-speed camera at 1000 fps), yielding dynamic OLAT image sets recorded at 25 fps using the overlapping method. Our dynamic OLAT dataset provides sufficient semantic, temporal, and lighting consistency supervision to train our neural video portrait relighting scheme, which generalizes to in-the-wild scenarios.

2 papers · 0 benchmarks · Images, RGB Video, Videos

IBISCape

A simulated benchmark for evaluating multi-modal SLAM systems in large-scale dynamic environments.

2 papers · 0 benchmarks · Environment, Images, Point cloud, RGB Video, RGB-D, Stereo, Videos

FetReg: Largescale Multi-centre Fetoscopy Placenta Dataset

The Fetoscopic Placental Vessel Segmentation and Registration (FetReg2021) challenge was organized as part of the MICCAI 2021 Endoscopic Vision (EndoVis) challenge. Through the FetReg2021 challenge, we released the first large-scale multi-centre dataset of the fetoscopy laser photocoagulation procedure. The dataset contains 2,718 pixel-wise annotated images (background, vessel, fetus, and tool classes) from 24 different in vivo TTTS fetoscopic surgeries, and 24 unannotated video clips containing 9,616 frames for training and testing. The dataset is useful for developing generalized and robust semantic segmentation and video mosaicking algorithms for long-duration fetoscopy videos.

2 papers · 0 benchmarks · Biomedical, Images, RGB Video

PUMaVOS (Partial and Unusual Masks for Video Object Segmentation)

PUMaVOS is a dataset of challenging and practical use cases inspired by the movie production industry.

2 papers · 0 benchmarks · Images, RGB Video

SB20 (Sugar Beet 2020 University of Bonn)

Video sequences captured at a field on Campus Kleinaltendorf (CKA), University of Bonn, by BonBot-I, an autonomous weeding robot. The data was recorded with an Intel RealSense D435i sensor mounted with a nadir view of the ground.

2 papers · 0 benchmarks · Images, RGB Video, RGB-D, Videos

3DYoga90 (3DYoga90: A Hierarchical Video Dataset for Yoga Pose Understanding)

3DYoga90 is organized within a three-level label hierarchy. It stands out as one of the most comprehensive open datasets, featuring the largest collection of RGB videos and 3D skeleton sequences among publicly available resources.

2 papers · 0 benchmarks · 3D, Actions, RGB Video, Videos

LuViRA (Lund University Vision, Radio, and Audio)

The Lund University Vision, Radio, and Audio (LuViRA) positioning dataset consists of 89 trajectories recorded in the Lund University Humanities Lab's Motion Capture (Mocap) Studio with a MIR200 robot as the target platform. Each trajectory contains data from four systems: vision, radio, audio, and a ground truth system that provides localization accuracy to within 0.5 mm. The Mocap system in the environment serves as the ground truth, providing 3D or 6DoF tracking of a camera, a single antenna, and a speaker; these targets are mounted on top of the MIR200 robot and put in motion. The 3D positions of the 11 static microphones are also provided.

2 papers · 0 benchmarks · Audio, RGB Video

Aria Digital Twin Dataset

A real-world dataset with a hyper-accurate digital counterpart and comprehensive ground-truth annotation.

2 papers · 6 benchmarks · 3D, 3d meshes, Point cloud, RGB Video, Videos

RHM (Robot House Multi-View Human Activity Recognition Dataset)

The Robot House Multi-View (RHM) dataset contains four views: Front, Back, Ceiling, and Robot. There are 14 classes with 6,701 video clips per view, for a total of 26,804 video clips across the four views. Clip lengths range from 1 to 5 seconds. Clips with the same index and class are synchronized across the different views.

2 papers · 3 benchmarks · Actions, Images, RGB Video, Videos
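The per-view and total clip counts stated for RHM are internally consistent; a quick arithmetic check (all numbers taken from the description above):

```python
# RHM clip counts as stated in the dataset description.
views = ["Front", "Back", "Ceiling", "Robot"]
clips_per_view = 6701  # video clips recorded for each view

total_clips = clips_per_view * len(views)
print(total_clips)  # 26804, matching the stated four-view total
```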

LoTE-Animal (LoTE-Animal: A Long Time-span Dataset for Endangered Animal Behavior Understanding)

Understanding and analyzing animal behavior is increasingly essential to protecting endangered animal species. However, advanced computer vision techniques have seen minimal application in this area, largely because large and diverse datasets for training deep models have been lacking.

2 papers · 2 benchmarks · Images, RGB Video, Videos

JHU CoSTAR Block Stacking Dataset

Contains data in which a robot interacts with 5.1 cm colored blocks to complete an order-fulfillment-style block-stacking task. It features dynamic scenes and real time-series data in a less constrained environment than comparable datasets, with nearly 12,000 stacking attempts and over 2 million frames of real data.

1 paper · 0 benchmarks · 3D, Images, Point cloud, RGB Video, RGB-D

SoccerTrack Dataset

The SoccerTrack dataset comprises top-view and wide-view video footage annotated with bounding boxes. GNSS coordinates of each player are also provided. We hope that the SoccerTrack dataset will help advance the state of the art in multi-object tracking, especially in team sports.

1 paper · 0 benchmarks · RGB Video, Tracking, Videos

Baxter-UR5_95-Objects

In this dataset, two robots, Baxter and UR5, perform 8 behaviors (look, grasp, pick, hold, shake, lower, drop, and push) on 95 objects that vary by 5 colors (blue, green, red, white, and yellow), 6 contents (wooden buttons, plastic dice, glass marbles, nuts & bolts, pasta, and rice), and 4 weights (empty, 50 g, 100 g, and 150 g). There are 90 objects with contents (5 colors x 3 weights x 6 contents) and 5 objects without any content that vary only by color. Both robots perform 5 trials on each object, resulting in 7,600 interactions (2 robots x 8 behaviors x 95 objects x 5 trials).

1 paper · 0 benchmarks · Actions, Audio, Images, Interactive, RGB Video, RGB-D, Time series, Videos
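The object and interaction counts in the Baxter-UR5 entry follow directly from the stated factors; a small sanity check (all numbers copied from the description above):

```python
# Object inventory: 90 objects with contents plus 5 empty objects.
colors, filled_weights, contents = 5, 3, 6
objects_with_contents = colors * filled_weights * contents  # 90
empty_objects = 5  # vary by color only
total_objects = objects_with_contents + empty_objects  # 95

# Interactions: 2 robots x 8 behaviors x 95 objects x 5 trials.
robots, behaviors, trials = 2, 8, 5
interactions = robots * behaviors * total_objects * trials
print(total_objects, interactions)  # 95 7600
```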

WMT-SLT

We provide separate training, development, and test data. The training data is available right away; the development and test data will be released in several stages, starting with a release of the development sources only.

1 paper · 0 benchmarks · RGB Video, Texts

ConsInv Dataset

ConsInv is a stereo RGB + IMU dataset designed for Dynamic SLAM testing and contains two subsets:

1 paper · 0 benchmarks · Images, RGB Video, Stereo
Page 3 of 5