Papers With Code 2

A community resource for machine learning research: papers, code, benchmarks, and state-of-the-art results.


Data sourced from the PWC Archive (CC-BY-SA 4.0). Built by the community, for the community.

Datasets

192 machine learning datasets

Filter by Modality

  • Images (3,275)
  • Texts (3,148)
  • Videos (1,019)
  • Audio (486)
  • Medical (395)
  • 3D (383)
  • Time series (298)
  • Graphs (285)
  • Tabular (271)
  • Speech (199)
  • RGB-D (192)
  • Environment (148)
  • Point cloud (135)
  • Biomedical (123)
  • LiDAR (95)
  • RGB Video (87)
  • Tracking (78)
  • Biology (71)
  • Actions (68)
  • 3D meshes (65)
  • Tables (52)
  • Music (48)
  • EEG (45)
  • Hyperspectral images (45)
  • Stereo (44)
  • MRI (39)
  • Physics (32)
  • Interactive (29)
  • Dialog (25)
  • MIDI (22)
  • 6D (17)
  • Replay data (11)
  • Financial (10)
  • Ranking (10)
  • CAD (9)
  • fMRI (7)
  • Parallel (6)
  • Lyrics (2)
  • PSG (2)

192 dataset results

Mila Simulated Floods

The Mila Simulated Floods dataset is a 1.5 km² virtual world built with the Unity3D game engine, including urban, suburban, and rural areas.

2 papers · 2 benchmarks · Images, RGB-D

SuperCaustics

SuperCaustics is a simulation tool built in Unreal Engine for generating massive computer vision datasets that include transparent objects.

2 papers · 0 benchmarks · Environment, Images, Interactive, LiDAR, Physics, RGB Video, RGB-D

3D Datasets of Broccoli in the Field

This work was undertaken by members of the Lincoln Centre for Autonomous Systems, University of Lincoln, UK. The four data collection sessions were conducted at three sites in Lincolnshire, UK, and one in Murcia, Spain (see Fig. 1). The UK sessions took place at the beginning and towards the end of the harvesting season, and the Spanish session at the end of the harvest. The broccoli variety grown in the UK is called Iron Man, while the variety grown in Spain is called Titanium. Weather during the UK data capture ranged from sunny to overcast to rainy, with broccoli maturity varying from small to large to already harvested; the Spanish capture took place under strong sunlight, with mature plants at the very end of the harvesting season. The tractor was driven through the broccoli field at a slow walking speed, with two rows of broccoli plants imaged by the RGB-D sensor.

2 papers · 0 benchmarks · Biology, Images, RGB-D

TorWIC (The Toronto Warehouse Incremental Change Dataset)

TorWIC is the dataset discussed in POCD: Probabilistic Object-Level Change Detection and Volumetric Mapping in Semi-Static Scenes. Its purpose is to evaluate map maintenance capabilities in a warehouse environment undergoing incremental changes. The dataset was collected in a Clearpath Robotics facility.

2 papers · 0 benchmarks · LiDAR, RGB-D

DRACO20K

The DRACO20K dataset is used for evaluating object canonicalization in methods that estimate a canonical frame from a monocular input image.

2 papers · 0 benchmarks · 3D, Images, RGB-D

TBBR (Thermal Bridges on Building Rooftops)

The dataset of Thermal Bridges on Building Rooftops (TBBR dataset) consists of annotated combined RGB and thermal drone images with a height map. All images were converted to a uniform format of 3000×4000 pixels, aligned, and cropped to 2400×3400 to remove empty borders.

2 papers · 6 benchmarks · Hyperspectral images, Images, RGB-D

IBISCape

IBISCape is a simulated benchmark for evaluating multi-modal SLAM systems in large-scale dynamic environments.

2 papers · 0 benchmarks · Environment, Images, Point cloud, RGB Video, RGB-D, Stereo, Videos

Plittersdorf

A set of 221 stereo videos captured by the SOCRATES stereo camera trap in a wildlife park in Bonn, Germany between February and July of 2022. A subset of frames is labeled with instance annotations in the COCO format.

2 papers · 0 benchmarks · Images, RGB-D, Videos

RGBD1K (A Large-scale Dataset and Benchmark for RGB-D Object Tracking)

RGBD1K is a benchmark for RGB-D object tracking containing 1,050 sequences with about 2.5M frames in total.

2 papers · 0 benchmarks · RGB-D, Videos

Lindenthal Camera Traps

This dataset contains 775 video sequences, captured in the wildlife park Lindenthal (Cologne, Germany) as part of the AMMOD project, using an Intel RealSense D435 stereo camera. In addition to color and infrared images, the D435 can infer the distance (or "depth") to objects in the scene using stereo vision. Observed animals include various birds (at daytime) and mammals such as deer, goats, sheep, donkeys, and foxes (primarily at nighttime). A subset of 412 images is annotated with a total of 1,038 individual animal annotations, including instance masks, bounding boxes, class labels, and corresponding track IDs to identify the same individual over the entire video.

2 papers · 0 benchmarks · Images, RGB-D, Stereo, Videos

DTTD

The Digital-Twin Tracking Dataset (DTTD) is a novel RGB-D dataset intended to enable further research on digital-twin tracking and to extend potential solutions towards longer ranges and millimeter-level localization accuracy. In total, 103 scenes of 10 common off-the-shelf objects with rich textures are recorded, with each frame annotated with a per-pixel semantic segmentation and ground-truth object poses provided by a commercial motion capture system.

2 papers · 0 benchmarks · RGB-D

ContactArt

ContactArt is a dataset for learning hand-object interaction priors for hand and articulated object pose estimation. The dataset was created using visual teleoperation, where a human operator plays directly within a physics simulator to manipulate the articulated objects. All object models come from the PartNet dataset for ease of scaling up. ContactArt provides accurate annotations, rich hand-object interaction, and contact information.

2 papers · 0 benchmarks · 3D, RGB-D

Drunkard's Dataset

Estimating camera motion in deformable scenes poses a complex and open research challenge. Most existing non-rigid structure-from-motion techniques assume that static scene parts are observed alongside deforming ones in order to establish an anchoring reference. However, this assumption does not hold in certain relevant application cases such as endoscopies. To tackle this issue with a common benchmark, the Drunkard's Dataset is introduced: a challenging collection of synthetic data targeting visual navigation and reconstruction in deformable environments. It is the first large set of exploratory camera trajectories with ground truth inside 3D scenes where every surface exhibits non-rigid deformations over time. Simulations in realistic 3D buildings yield a vast amount of data and ground-truth labels, including camera poses, RGB images, depth, optical flow, and normal maps at high resolution and quality.

2 papers · 0 benchmarks · 3D, Medical, RGB-D, Videos

SB20 (Sugar Beet 2020 University of Bonn)

Video sequences recorded at a field on Campus Kleinaltendorf (CKA), University of Bonn, by BonBot-I, an autonomous weeding robot. The data was captured by mounting an Intel RealSense D435i sensor with a nadir view of the ground.

2 papers · 0 benchmarks · Images, RGB Video, RGB-D, Videos

3D-Point Cloud dataset of various geometrical terrains (3D-Point Cloud dataset of various geometrical terrains in urban environments recorded during human locomotion)

Depth vision has recently been used in many locomotion devices with the aim of easing the lives of disabled people and supporting a more ecological lifestyle, since such cameras are cheap, compact, and provide rich information about the environment. This dataset provides many recordings of point clouds and other types of data during different locomotion modes in an urban context. If you use this data, please cite the following papers: 1. Depth Vision based Terrain Detection Algorithm during Human Locomotion; 2. Using Depth Vision for Terrain Detection during Active Locomotion.

2 papers · 0 benchmarks · 3D, Images, Point cloud, RGB-D

InSpaceType (Indoor Space Type Dataset for Monocular Depth Analysis)

A high-quality indoor monocular depth estimation dataset with a focus on performance variation across space types.

2 papers · 0 benchmarks · 3D, Images, RGB-D

HOI-Synth (HOI-Synth benchmark)

The HOI-Synth benchmark extends three egocentric datasets designed to study hand-object interaction detection (EPIC-KITCHENS VISOR, EgoHOS, and ENIGMA-51) with automatically labeled synthetic data obtained through a novel HOI generation pipeline.

2 papers · 0 benchmarks · Images, RGB-D

WeatherKITTI

WeatherKITTI is currently the most realistic all-weather simulated enhancement of the KITTI dataset. It simulates the three weather conditions that most affect visual perception in real-world scenarios: rain, snow, and fog, each at two intensity levels (severe and extremely severe). Together with clear weather, this yields a weather-enhanced dataset with three intensity levels and seven weather scenarios.

2 papers · 0 benchmarks · Images, LiDAR, RGB-D

BASEPROD (The Bardenas Semi-Desert Planetary Rover Dataset)

BASEPROD provides comprehensive rover sensor data collected over a 1.7 km traverse, accompanied by high-resolution 2D and 3D drone maps of the terrain. The dataset also includes laser-induced breakdown spectroscopy (LIBS) measurements from key sampling sites along the rover's path, as well as weather station data to contextualize environmental conditions.

2 papers · 0 benchmarks · 3D, Environment, Images, Point cloud, RGB-D, Stereo, Tabular, Time series

SceneNet RGB-D

SceneNet RGB-D is a synthetic dataset containing large-scale photorealistic renderings of indoor scene trajectories with pixel-level annotations. Random sampling permits virtually unlimited scene configurations, and the dataset creators provide a set of 5M rendered RGB-D images from over 15K trajectories in synthetic layouts with random but physically simulated object poses. Each layout also has random lighting, camera trajectories, and textures. The scale of this dataset is well suited for pre-training data-driven computer vision techniques from scratch with RGB-D inputs, which previously has been limited by relatively small labelled datasets such as NYUv2 and SUN RGB-D. It also provides a basis for investigating 3D scene labelling tasks by providing perfect camera poses and depth data as a proxy for a SLAM system.

1 paper · 0 benchmarks · Images, RGB-D
Page 6 of 10