Papers With Code 2

A community resource for machine learning research: papers, code, benchmarks, and state-of-the-art results.


Data sourced from the PWC Archive (CC-BY-SA 4.0). Built by the community, for the community.

Datasets

3,275 machine learning datasets

Filter by Modality

  • Images (3,275)
  • Texts (3,148)
  • Videos (1,019)
  • Audio (486)
  • Medical (395)
  • 3D (383)
  • Time series (298)
  • Graphs (285)
  • Tabular (271)
  • Speech (199)
  • RGB-D (192)
  • Environment (148)
  • Point cloud (135)
  • Biomedical (123)
  • LiDAR (95)
  • RGB Video (87)
  • Tracking (78)
  • Biology (71)
  • Actions (68)
  • 3d meshes (65)
  • Tables (52)
  • Music (48)
  • EEG (45)
  • Hyperspectral images (45)
  • Stereo (44)
  • MRI (39)
  • Physics (32)
  • Interactive (29)
  • Dialog (25)
  • Midi (22)
  • 6D (17)
  • Replay data (11)
  • Financial (10)
  • Ranking (10)
  • Cad (9)
  • fMRI (7)
  • Parallel (6)
  • Lyrics (2)
  • PSG (2)

3,275 dataset results

S-COCO (Synthetic COCO)

Synthetic COCO (S-COCO) is a synthetically generated dataset for learning homography estimation, introduced by DeTone et al. The source and target images are created by duplicating the same COCO image. The source patch $I_S$ is obtained by randomly cropping a source candidate at position $p$ with a size of 128 × 128 pixels. The patch's corners are then randomly perturbed vertically and horizontally by values within the range $[-\rho, \rho]$, and the four correspondences define a homography $H_{ST}$. The inverse of this homography, $H_{TS} = (H_{ST})^{-1}$, is applied to the target candidate, and from the resulting warped image a target patch $I_T$ is cropped at the same location $p$. $I_S$ and $I_T$ form the input, with the homography $H_{ST}$ as ground truth.

7 papers · 1 benchmark · Images
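The S-COCO generation procedure described above can be sketched in plain NumPy. This is a hypothetical re-implementation for illustration, not the authors' code; the function and parameter names (`make_scoco_pair`, `patch_size`, `rho`) are ours, and the warp uses simple nearest-neighbor sampling rather than a production-grade interpolator:

```python
import numpy as np

def homography_from_points(src, dst):
    """Solve for the 8-DoF homography H mapping src -> dst (4 point pairs, DLT)."""
    A, b = [], []
    for (x, y), (u, v) in zip(src, dst):
        A.append([x, y, 1, 0, 0, 0, -u * x, -u * y]); b.append(u)
        A.append([0, 0, 0, x, y, 1, -v * x, -v * y]); b.append(v)
    h = np.linalg.solve(np.asarray(A, dtype=float), np.asarray(b, dtype=float))
    return np.append(h, 1.0).reshape(3, 3)  # fix H[2,2] = 1

def make_scoco_pair(image, patch_size=128, rho=32, rng=None):
    """Generate one S-COCO-style training sample (I_S, I_T, H_ST) from one image."""
    rng = rng or np.random.default_rng()
    h, w = image.shape[:2]
    # Choose the crop position p so that the perturbed corners stay in-bounds.
    x = int(rng.integers(rho, w - patch_size - rho))
    y = int(rng.integers(rho, h - patch_size - rho))
    src = np.array([[x, y], [x + patch_size, y],
                    [x + patch_size, y + patch_size],
                    [x, y + patch_size]], dtype=float)
    # Perturb each corner by values in [-rho, rho]; the four
    # correspondences define the ground-truth homography H_ST.
    dst = src + rng.uniform(-rho, rho, size=(4, 2))
    H_st = homography_from_points(src, dst)
    # Apply H_TS = H_ST^{-1} to the target candidate via inverse warping:
    # the warped image at pixel q equals the source image sampled at H_ST @ q.
    I_s = image[y:y + patch_size, x:x + patch_size]
    ys, xs = np.mgrid[y:y + patch_size, x:x + patch_size]
    pts = np.stack([xs.ravel(), ys.ravel(), np.ones(xs.size)])
    u_, v_, s_ = H_st @ pts
    u_ = np.rint(u_ / s_).astype(int).clip(0, w - 1)
    v_ = np.rint(v_ / s_).astype(int).clip(0, h - 1)
    I_t = image[v_, u_].reshape(I_s.shape)  # crop at the same location p
    return I_s, I_t, H_st
```

Note that the crop bounds keep every perturbed corner (and hence the warped quad) inside the image, so the clipping above is only a safety net.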

VQA-CE (VQA Counterexamples)

This dataset provides a new split of VQA v2 (similar to VQA-CP v2), built from questions that are hard for biased models to answer.

7 papers · 1 benchmark · Images, Texts

Panoptic nuScenes

Panoptic nuScenes is a benchmark dataset that extends the popular nuScenes dataset with point-wise ground-truth annotations for semantic segmentation, panoptic segmentation, and panoptic tracking tasks.

7 papers · 0 benchmarks · Images

SLAKE-English

English subset of the SLAKE dataset, comprising 642 images and more than 7,000 question–answer pairs.

7 papers · 0 benchmarks · Images, Medical, Texts

BigDetection

BigDetection is a new large-scale benchmark for building more general and powerful object detection systems. It leverages training data from existing datasets (LVIS, OpenImages, and Objects365) under carefully designed principles to curate a larger dataset for improved detector pre-training. The BigDetection dataset has 600 object categories and contains 3.4M training images with 36M object bounding boxes.

7 papers · 0 benchmarks · Images

Bamboo

The Bamboo dataset is a mega-scale, information-dense dataset for both classification and detection pre-training. It is built by integrating 24 public datasets (e.g., ImageNet, Places365, Objects365, OpenImages) and adding new annotations through active learning. Bamboo has 69M image classification annotations and 32M object bounding boxes.

7 papers · 0 benchmarks · Images

SEN12MS-CR-TS

SEN12MS-CR-TS is a multi-modal, multi-temporal dataset for cloud removal. It contains time series of paired and co-registered Sentinel-1 data, together with cloudy as well as cloud-free Sentinel-2 data, from the European Space Agency's Copernicus mission. Each time series contains 30 cloudy and clear observations regularly sampled throughout the year 2018. The dataset is readily pre-processed and backward-compatible with SEN12MS-CR.

7 papers · 8 benchmarks · Hyperspectral images, Images, Time series

WALT (Watch and Learn TimeLapse Images)

We introduce a new dataset, Watch and Learn Time-lapse (WALT), consisting of footage from multiple (4K and 1080p) cameras capturing urban environments over a year.

7 papers · 3 benchmarks · Images, Time series, Tracking, Videos

ArtBench-10 (32x32)

We introduce ArtBench-10, the first class-balanced, high-quality, cleanly annotated, and standardized dataset for benchmarking artwork generation. It comprises 60,000 images of artwork from 10 distinctive artistic styles, with 5,000 training images and 1,000 testing images per style. ArtBench-10 has several advantages over previous artwork datasets. First, it is class-balanced, while most previous artwork datasets suffer from long-tail class distributions. Second, the images are of high quality with clean annotations. Third, ArtBench-10 is created with standardized data collection, annotation, filtering, and preprocessing procedures. We provide three versions of the dataset at different resolutions (32×32, 256×256, and original image size), formatted so that it is easy to incorporate into popular machine learning frameworks.

7 papers · 2 benchmarks · Images

NumtaDB (Assembled Bengali Handwritten Digits)

To benchmark Bengali digit recognition algorithms, a large publicly available dataset is required which is free from biases originating from geographical location, gender, and age. With this aim in mind, NumtaDB, a dataset consisting of more than 85,000 images of hand-written Bengali digits, has been assembled.

7 papers · 0 benchmarks · Images

DOLPHINS (Dataset for Collaborative Perception enabled Harmonious and Interconnected Self-driving)

Vehicle-to-Everything (V2X) networks have enabled collaborative perception in autonomous driving, a promising solution to the fundamental limitations of stand-alone intelligence, including blind zones and limited long-range perception. However, the lack of datasets has severely hindered the development of collaborative perception algorithms. In this work, we release DOLPHINS (Dataset for cOllaborative Perception enabled Harmonious and INterconnected Self-driving), a new simulated, large-scale, multi-scenario, multi-view, multi-modality autonomous driving dataset that provides a benchmark platform for interconnected autonomous driving. DOLPHINS surpasses current datasets in six dimensions: temporally-aligned images and point clouds from both vehicles and Road Side Units (RSUs) enable both Vehicle-to-Vehicle (V2V) and Vehicle-to-Infrastructure (V2I) collaborative perception, and 6 typical scenarios with dynamic weather conditions make it one of the most varied interconnected autonomous driving datasets.

7 papers · 0 benchmarks · Images, Point cloud

YCB-Slide (YCB-Slide: A tactile interaction dataset)

The YCB-Slide dataset comprises DIGIT sliding interactions on YCB objects. We envision it contributing to efforts in tactile localization, mapping, object understanding, and learning dynamics models. We provide DIGIT images, sensor poses, an RGB video feed, ground-truth mesh models, and ground-truth heightmaps plus contact masks (simulation only). This dataset is supplementary to the MidasTouch paper, a CoRL 2022 submission.

7 papers · 0 benchmarks · Images

Interiorverse

Interiorverse is a high-quality indoor scene dataset with rich details, including complex furniture and decorations. It is rendered with the GGX BRDF model, which has stronger material modeling capability than simpler BRDF models.

7 papers · 0 benchmarks · 3D, Images

KITTI360-EX

KITTI360-EX is a dataset for outer- and inner-FoV expansion. It contains 76k pinhole images as well as 76k spherical images, and is used for beyond-FoV estimation.

7 papers · 1 benchmark · Images

MMBody

The MMBody dataset provides human body data with motion capture, ground-truth meshes, Kinect RGB-D, and millimeter-wave sensor data. See the homepage for more details.

7 papers · 0 benchmarks · 3D, 3d meshes, Images, Point cloud, RGB-D

HBW (Human Bodies in the Wild)

Human Bodies in the Wild (HBW) is a validation and test set for body shape estimation. It consists of images taken in the wild together with ground-truth 3D body scans in SMPL-X topology. To create HBW, we collect body scans of 35 participants and register the SMPL-X model to the scans. Further, each participant is photographed in various outfits and poses in front of a white background and uploads full-body photos of themselves taken in the wild. The validation- and test-set images are released; the ground-truth shape is released only for the validation set.

7 papers · 0 benchmarks · 3D, Images, Texts

GeoDE

GeoDE is a geographically diverse dataset with 61,940 images from 40 classes and 6 world regions, and no personally identifiable information, collected through crowd-sourcing.

7 papers · 0 benchmarks · Images

Aachen Day-Night v1.1 Benchmark

The Aachen Day-Night v1.1 dataset is an extended version of the original Aachen Day-Night dataset. Besides the original query images, it contains an additional 93 nighttime queries. In addition, it uses a larger 3D model containing additional images, extracted from video sequences captured with different cameras. Please refer to "Reference Pose Generation for Long-term Visual Localization via Learned Features and View Synthesis" for more information.

7 papers · 3 benchmarks · Images

OMMO

OMMO is a new benchmark for several outdoor NeRF-based tasks, such as novel view synthesis, surface reconstruction, and multi-modal NeRF. It contains complex objects and scenes with calibrated images, point clouds and prompt annotations.

7 papers · 0 benchmarks · Images, Point cloud

BANDON

BANDON is a dataset for building change detection with off-nadir aerial images, composed of off-nadir image pairs of urban and rural areas. Overall, the BANDON dataset contains 2,283 pairs of images, 2,283 change labels, 1,891 BT-flow labels, 1,891 pairs of segmentation labels, and 1,891 pairs of ST-offset labels (test sets do not provide auxiliary annotations).

7 papers · 0 benchmarks · Images
Page 61 of 164