3,275 machine learning datasets
This dataset is used for predicting house prices from both images and textual information. It is composed of 535 sample houses from California, USA.
Image Caption Quality Dataset is a dataset of crowdsourced ratings for machine-generated image captions. It contains more than 600k ratings of image-caption pairs.
The iNaturalist Fine-Grained Geolocation dataset is an extension of the iNaturalist dataset with complementary geolocation information.
Involves data in which a robot interacts with 5.1 cm colored blocks to complete an order-fulfillment-style block stacking task. It contains dynamic scenes and real time-series data in a less constrained environment than comparable datasets. There are nearly 12,000 stacking attempts and over 2 million frames of real data.
A dataset explicitly created for Human-Computer Interaction (HCI) research.
A dataset containing 9,372 RGB images of weeds annotated with leaf counts. The images were collected in fields across Denmark using Nokia and Samsung cell phone cameras; Samsung, Nikon, Canon and Sony consumer cameras; and a Point Grey industrial camera.
The LEMMA dataset aims to explore the essence of complex human activities in a goal-directed, multi-agent, multi-task setting with ground-truth labels of compositional atomic actions and their associated tasks. By limiting the scenarios to at most two multi-step tasks with two agents, the authors address human multi-task and multi-agent interactions in four settings: single-agent single-task (1 x 1), single-agent multi-task (1 x 2), multi-agent single-task (2 x 1), and multi-agent multi-task (2 x 2). In the 2 x 1 setting, task instructions are given to only one agent to resemble a robot-helping scenario, with the hope that the learned perception models can be applied to robotic tasks (especially human-robot interaction) in the near future.
LISA Gaze is a dataset for driver gaze estimation comprising 11 long drives, driven by 10 subjects in two different cars.
Consists of a large number of unconstrained multi-view and partially occluded faces.
METU-ALET is an image dataset for detecting tools in the wild. The dataset has annotations for tools belonging to categories such as farming, gardening, office, stonemasonry, vehicle, woodworking and workshop. The images contain a total of 22,841 bounding boxes covering 49 different tool categories.
The Multiple Light Source dataset (MLS) is a collection of 24 multiple-object scenes, each recorded under 18 multiple-light-source illumination scenarios. The illuminants vary in dominant spectral colour, intensity and distance from the scene. The dataset can be used for evaluating computational colour constancy algorithms. Along with the images of the scenes, the spectral characteristics of the camera, light sources and objects are also provided. Each image includes pixel-by-pixel ground-truth annotation of uniformly coloured object surfaces, making the dataset useful for benchmarking colour-based image segmentation algorithms.
MNIST-MIX is a multi-language handwritten digit recognition dataset. It contains digits from 10 different languages.
NavigationNet is a computer vision dataset and benchmark to allow the utilization of deep reinforcement learning on scene-understanding-based indoor navigation.
The PSU Near-Regular Texture Database is a texture dataset. It covers the spectrum of textures from completely regular to near-regular to irregular. It also includes video of near-regular textures in motion. The database also contains, or will include, test image sets with ground truth for translation, rotation, and reflection/glide-reflection symmetry detection algorithms.
A dataset to encourage research in orchard environments. It consists of labeled stereo video of people in orange and apple orchards taken from two perception platforms (a tractor and a pickup truck), along with vehicle position data from RTK GPS.
Open MIC (Open Museum Identification Challenge) contains photos of exhibits captured in 10 distinct exhibition spaces of several museums, showcasing paintings, timepieces, sculptures, glassware, relics, science exhibits, natural history pieces, ceramics, pottery, tools and indigenous crafts. The goal of Open MIC is to stimulate research in domain adaptation, egocentric recognition and few-shot learning by providing a testbed complementary to the well-known Office-31 dataset.
The pic2kal benchmark for calorie prediction contains 308,000 images from over 70,000 recipes including photographs, ingredients and instructions, matched with nutritional information.
The Pinterest Complete the Look dataset consists of over 1 million outfits and 4 million objects. It can be used to predict style compatibility between fashion items in order to recommend complementary items that complete an outfit.
This dataset contains nine video sequences captured by a webcam for evaluating salient closed boundary tracking. Each sequence is about 30 seconds long at 30 fps, with a frame size of 640×480 (width×height); there are 9,598 frames in total. Each sequence exhibits different motion styles such as translation, rotation and viewpoint change.
Salient-KITTI is a saliency map prediction dataset based on KITTI.