Datasets

29 machine learning datasets

Daimler Monocular Pedestrian Detection

The Daimler Monocular Pedestrian Detection dataset targets pedestrian detection in urban environments. The training set contains 15560 pedestrian samples (image cut-outs at 48×96 resolution) and 6744 additional full images without pedestrians for extracting negative samples. The test set contains an independent sequence with more than 21790 images and 56492 pedestrian labels (fully visible or partially occluded), captured from a vehicle during a 27-minute drive through urban traffic.

1 paper · 0 benchmarks · Images, Interactive, Videos

VR Curve on Surface Drawing Dataset

The dataset includes curves drawn on 3D surfaces (triangle meshes) in Virtual Reality. A total of 2,880 curves were created using two different techniques by 20 users on 6 meshes. For each curve, three items are provided: the 3D curve executed by the user, the projected curve created on the mesh, and the ground truth target curve on the mesh. Two different task types were employed to collect the data; both are described in the paper.

1 paper · 0 benchmarks · 3D, Interactive

ChildCIdb (ChildCIdbv1)

A large-scale, first-of-its-kind database aimed at generating a better understanding of the way children interact with mobile devices during their development process. ChildCIdbv1 comprises data collected from 438 children, from 18 months to 8 years old, encompassing the first three development stages of Piaget's theory. Data collected spans interaction with screens using both finger and pen stylus, information regarding the previous experience of the child with mobile devices, the child’s grade level, and whether attention-deficit/hyperactivity disorder (ADHD) is present.

1 paper · 0 benchmarks · Interactive

The Mafia Dataset

The Mafia Dataset was created to model the behavior of deceptive actors in the context of the Mafia game, as described in the paper “Putting the Con in Context: Identifying Deceptive Actors in the Game of Mafia”. We hope that this dataset will be of use to others studying the effects of deception on language use.

1 paper · 0 benchmarks · Dialog, Interactive, Texts

Baxter-UR5_95-Objects

In this dataset, two robots, Baxter and UR5, perform 8 behaviors (look, grasp, pick, hold, shake, lower, drop, and push) on 95 objects that vary by 5 colors (blue, green, red, white, and yellow), 6 contents (wooden button, plastic dice, glass marbles, nuts & bolts, pasta, and rice), and 4 weights (empty, 50g, 100g, and 150g). There are 90 objects with contents (5 colors x 3 weights x 6 contents) and 5 objects without any content that vary only by color. Both robots perform 5 trials on each object, resulting in 7,600 interactions (2 robots x 8 behaviors x 95 objects x 5 trials).

1 paper · 0 benchmarks · Actions, Audio, Images, Interactive, RGB Video, RGB-D, Time series, Videos
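
The object and interaction counts quoted for Baxter-UR5_95-Objects can be checked by enumerating the combinations. This is a minimal sketch: the string labels below are illustrative placeholders taken from the description, not the dataset's actual file or folder naming, which is not stated here.

```python
from itertools import product

# Labels copied from the description; they are assumptions, not real dataset keys.
robots = ["baxter", "ur5"]
behaviors = ["look", "grasp", "pick", "hold", "shake", "lower", "drop", "push"]
colors = ["blue", "green", "red", "white", "yellow"]
contents = ["wooden button", "plastic dice", "glass marbles",
            "nuts & bolts", "pasta", "rice"]
filled_weights = ["50g", "100g", "150g"]
trials = range(1, 6)  # 5 trials per object

# 90 filled objects (5 colors x 3 weights x 6 contents) ...
objects = [(color, weight, content)
           for color, weight, content in product(colors, filled_weights, contents)]
# ... plus 5 empty objects, one per color.
objects += [(color, "empty", None) for color in colors]

# Every robot performs every behavior on every object, once per trial.
interactions = list(product(robots, behaviors, objects, trials))

print(len(objects), len(interactions))  # 95 7600
```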

UR5 Tool Dataset

In this dataset, a UR5 robot used 6 tools (metal-scissor, metal-whisk, plastic-knife, plastic-spoon, wooden-chopstick, and wooden-fork) to perform 6 behaviors (look, stirring-slow, stirring-fast, stirring-twist, whisk, and poke). The robot explored 15 objects kept in cylindrical containers: cane-sugar, chia-seed, chickpea, detergent, empty, glass-bead, kidney-bean, metal-nut-bolt, plastic-bead, salt, split-green-pea, styrofoam-bead, water, wheat, and wooden-button. The robot performed 10 trials on each object using each tool, resulting in 5,400 interactions (6 tools x 6 behaviors x 15 objects x 10 trials). The robot recorded multiple kinds of sensory data (audio, RGB images, depth images, haptic, and touch images) while interacting with the objects.

1 paper · 0 benchmarks · Actions, Audio, Images, Interactive, RGB Video, RGB-D, Time series, Videos
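
For the UR5 Tool Dataset, a similar sketch shows how the 5,400 interactions break down per tool and per object. The `Interaction` record type is hypothetical and exists only for this illustration; the dataset's real storage layout is not described above.

```python
from collections import Counter
from itertools import product
from typing import NamedTuple


class Interaction(NamedTuple):
    """Hypothetical record for one tool/behavior/object/trial combination."""
    tool: str
    behavior: str
    obj: str
    trial: int


tools = ["metal-scissor", "metal-whisk", "plastic-knife",
         "plastic-spoon", "wooden-chopstick", "wooden-fork"]
behaviors = ["look", "stirring-slow", "stirring-fast",
             "stirring-twist", "whisk", "poke"]
objects = ["cane-sugar", "chia-seed", "chickpea", "detergent", "empty",
           "glass-bead", "kidney-bean", "metal-nut-bolt", "plastic-bead",
           "salt", "split-green-pea", "styrofoam-bead", "water", "wheat",
           "wooden-button"]

# 10 trials for every (tool, behavior, object) combination.
records = [Interaction(t, b, o, k)
           for t, b, o, k in product(tools, behaviors, objects, range(1, 11))]

per_tool = Counter(r.tool for r in records)
per_object = Counter(r.obj for r in records)

print(len(records))              # 5400
print(per_tool["metal-whisk"])   # 900 = 6 behaviors x 15 objects x 10 trials
print(per_object["water"])       # 360 = 6 tools x 6 behaviors x 10 trials
```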

CY101 Dataset

In this dataset, an upper-torso humanoid robot with a 7-DOF arm explored 100 different objects belonging to 20 different categories using 10 behaviors: Look, Crush, Grasp, Hold, Lift, Drop, Poke, Push, Shake, and Tap.

1 paper · 0 benchmarks · Actions, Audio, Images, Interactive, RGB Video, Texts, Time series, Videos

SSv2-Spatio-Temporal (Something Something v2-Spatio-Temporal)

We use the Something-Something v2 dataset to obtain the generation prompts and ground truth masks from real action videos. We filter the prompts down to a set of 295. The details of this filtering are in the "Peekaboo: Interactive Video Generation via Masked-Diffusion" paper. We then use an off-the-shelf OWL-ViT-large open-vocabulary object detector to obtain the bounding box (bbox) annotations of the object in the videos. This set represents bbox and prompt pairs of real-world videos, serving as a test bed for both the quality and control of methods for generating realistic videos with spatio-temporal control.

1 paper · 0 benchmarks · Interactive, Texts, Tracking, Videos

RClicks

We conducted a large crowdsourcing study of click patterns in an interactive segmentation scenario and collected 475K real-user clicks. Drawing on ideas from saliency tasks, we develop a clickability model that enables sampling clicks that closely resemble actual user inputs. Using our model and dataset, we propose the RClicks benchmark for a comprehensive comparison of existing interactive segmentation methods on realistic clicks. Specifically, we evaluate not only the average quality of methods but also their robustness with respect to click patterns.

1 paper · 0 benchmarks · Actions, Images, Interactive, Tables, Tabular
