Papers With Code 2

A community resource for machine learning research: papers, code, benchmarks, and state-of-the-art results.


Data sourced from the PWC Archive (CC-BY-SA 4.0). Built by the community, for the community.

Datasets

68 machine learning datasets

Filter by Modality

  • Images (3,275)
  • Texts (3,148)
  • Videos (1,019)
  • Audio (486)
  • Medical (395)
  • 3D (383)
  • Time series (298)
  • Graphs (285)
  • Tabular (271)
  • Speech (199)
  • RGB-D (192)
  • Environment (148)
  • Point cloud (135)
  • Biomedical (123)
  • LiDAR (95)
  • RGB Video (87)
  • Tracking (78)
  • Biology (71)
  • Actions (68)
  • 3D meshes (65)
  • Tables (52)
  • Music (48)
  • EEG (45)
  • Hyperspectral images (45)
  • Stereo (44)
  • MRI (39)
  • Physics (32)
  • Interactive (29)
  • Dialog (25)
  • MIDI (22)
  • 6D (17)
  • Replay data (11)
  • Financial (10)
  • Ranking (10)
  • CAD (9)
  • fMRI (7)
  • Parallel (6)
  • Lyrics (2)
  • PSG (2)

68 dataset results

Two4Two (A Synthetic Dataset For Controlled Experiments)

Two4Two is a library for creating synthetic image data crafted for human evaluations of interpretable ML approaches, especially image classification. The synthetic images show two abstract animals: Peaky (arms inwards) and Stretchy (arms outwards). Both are similar-looking, abstract animals made of eight blocks. The core functionality of the library is that different parameters can be correlated with the animal type to create bias in the data.

1 paper · 0 benchmarks · Actions
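
The bias-injection idea behind Two4Two can be sketched without the library itself. The following is a minimal illustration, assuming a hypothetical binary nuisance parameter (here called background) that is made to follow the animal label with a chosen probability; it is not the Two4Two API.

```python
# Illustrative sketch only: NOT the Two4Two API, just the idea of correlating
# a nuisance parameter with the class label to inject bias into a dataset.
import numpy as np

rng = np.random.default_rng(0)

def sample_biased(n, bias=0.8):
    """Sample (label, background) pairs where the hypothetical 'background'
    parameter follows the animal type with probability `bias`."""
    labels = rng.integers(0, 2, size=n)            # 0 = Peaky, 1 = Stretchy
    follow = rng.random(n) < bias                  # does background follow the label?
    background = np.where(follow, labels, rng.integers(0, 2, size=n))
    return labels, background

labels, background = sample_biased(1000)
print("label/background agreement:", (labels == background).mean())  # ~0.9 for bias=0.8
```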

Capture-24

This dataset contains Axivity AX3 wrist-worn activity tracker data that were collected from 151 participants in 2014-2016 around the Oxfordshire area. Participants were asked to wear the device in daily living for a period of roughly 24 hours, amounting to a total of almost 4,000 hours. Vicon Autograph wearable cameras and Whitehall II sleep diaries were used to obtain the ground truth activities performed during the period (e.g. sitting watching TV, walking the dog, washing dishes, sleeping), resulting in more than 2,500 hours of labelled data. Accompanying code to analyse this data is available at https://github.com/activityMonitoring/capture24. The following papers describe the data collection protocol in full: i.) Gershuny J, Harms T, Doherty A, Thomas E, Milton K, Kelly P, Foster C (2020) Testing self-report time-use diaries against objective instruments in real time. Sociological Methodology doi: 10.1177/0081175019884591; ii.) Willetts M, Hollowell S, Aslett L, Holmes C, Doherty

1 paper · 0 benchmarks · Actions, Tracking
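
A common first step with recordings like Capture-24 is to cut the raw tri-axial accelerometer signal into fixed-length labelled windows. The sketch below assumes hypothetical column names (x, y, z, annotation) and a 100 Hz sample rate; consult the official repository at https://github.com/activityMonitoring/capture24 for the actual schema.

```python
# Hedged sketch (column names and sample rate are assumptions, not the
# official Capture-24 schema): window a participant recording into
# non-overlapping 30-second segments for activity recognition.
import numpy as np
import pandas as pd

def make_windows(df, window_s=30, hz=100):
    """Split a recording into (n_windows, window_samples, 3) plus one label per window."""
    step = window_s * hz
    xyz = df[["x", "y", "z"]].to_numpy()
    n = len(xyz) // step
    windows = xyz[: n * step].reshape(n, step, 3)
    # Majority label per window, assuming a per-sample 'annotation' column.
    labels = df["annotation"].to_numpy()[: n * step].reshape(n, step)
    majority = pd.DataFrame(labels).mode(axis=1)[0].to_numpy()
    return windows, majority

# windows, labels = make_windows(pd.read_csv("participant.csv.gz"))
```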

Baxter-UR5_95-Objects

In this dataset two robots, Baxter and UR5, perform 8 behaviors (look, grasp, pick, hold, shake, lower, drop, and push) on 95 objects that vary by 5 colors (blue, green, red, white, and yellow), 6 contents (wooden button, plastic dice, glass marbles, nuts & bolts, pasta, and rice), and 4 weights (empty, 50g, 100g, and 150g). There are 90 objects with contents (5 colors x 3 weights x 6 contents) and 5 objects without any content that vary only by color. Both robots perform 5 trials on each object, resulting in 7,600 interactions (2 robots x 8 behaviors x 95 objects x 5 trials).

1 paper · 0 benchmarks · Actions, Audio, Images, Interactive, RGB Video, RGB-D, Time series, Videos
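
A quick check of the interaction count stated above (2 robots x 8 behaviors x 95 objects x 5 trials). The object IDs below are placeholders, not the dataset's actual naming.

```python
# Sanity-check the combinatorics behind the 7,600 interactions.
from itertools import product

robots = ["Baxter", "UR5"]
behaviors = ["look", "grasp", "pick", "hold", "shake", "lower", "drop", "push"]
objects = [f"object_{i:02d}" for i in range(95)]   # placeholder object IDs
trials = range(5)

interactions = list(product(robots, behaviors, objects, trials))
print(len(interactions))  # 7600
```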

MotionID: IMU specific motion (User verification)

Dataset for the User Verification part of MotionID: Human Authentication Approach. Data type: bin (to be converted with the attached notebook). ~50 hours of IMU (inertial measurement unit) data for one specific motion pattern, provided by 101 users.

1 paper · 0 benchmarks · Actions, Time series

MotionID: IMU all motions part1 (Motion Patterns Identification)

Dataset (part 1/3) for the Motion Patterns Identification part of MotionID: Human Authentication Approach. Data type: bin (to be converted with the attached notebook).

1 paper · 0 benchmarks · Actions, Time series

MotionID: IMU all motions part2 (Motion Patterns Identification)

Dataset (part 2/3) for the Motion Patterns Identification part of MotionID: Human Authentication Approach. Data type: bin (to be converted with the attached notebook).

1 paper · 0 benchmarks · Actions, Time series

MotionID: IMU all motions part3 (Motion Patterns Identification)

Dataset (part 3/3) for the Motion Patterns Identification part of MotionID: Human Authentication Approach. Data type: bin (to be converted with the attached notebook).

1 paper · 0 benchmarks · Actions, Time series

UR5 Tool Dataset

In this dataset the UR5 robot used 6 tools (metal-scissor, metal-whisk, plastic-knife, plastic-spoon, wooden-chopstick, and wooden-fork) to perform 6 behaviors: look, stirring-slow, stirring-fast, stirring-twist, whisk, and poke. The robot explored 15 objects (cane-sugar, chia-seed, chickpea, detergent, empty, glass-bead, kidney-bean, metal-nut-bolt, plastic-bead, salt, split-green-pea, styrofoam-bead, water, wheat, and wooden-button) kept in cylindrical containers. The robot performed 10 trials for each tool-behavior-object combination, resulting in 5,400 interactions (6 tools x 6 behaviors x 15 objects x 10 trials). The robot recorded multiple sensory streams (audio, RGB images, depth images, haptics, and touch images) while interacting with the objects.

1 paper · 0 benchmarks · Actions, Audio, Images, Interactive, RGB Video, RGB-D, Time series, Videos

VFD-2000

VFD-2000 is a video fight detection dataset containing more than 2,000 videos collected from YouTube. Specific scenarios were searched using "fight" as the keyword, for example "street fight", "beach fight", and "violence in the restaurant". 200 videos across 20 different scenes were collected.

1 paper · 0 benchmarks · Actions, Videos

CY101 Dataset

In this dataset an upper-torso humanoid robot with a 7-DOF arm explored 100 different objects belonging to 20 different categories using 10 behaviors: Look, Crush, Grasp, Hold, Lift, Drop, Poke, Push, Shake, and Tap.

1 paper · 0 benchmarks · Actions, Audio, Images, Interactive, RGB Video, Texts, Time series, Videos

HA-ViD (HA-ViD: A Human Assembly Video Dataset)

Understanding comprehensive assembly knowledge from videos is critical for futuristic ultra-intelligent industry. To enable technological breakthrough, we present HA-ViD – an assembly video dataset that features representative industrial assembly scenarios, natural procedural knowledge acquisition process, and consistent human-robot shared annotations. Specifically, HA-ViD captures diverse collaboration patterns of real-world assembly, natural human behaviors and learning progression during assembly, and granulate action annotations to subject, action verb, manipulated object, target object, and tool. We provide 3222 multi-view and multi-modality videos, 1.5M frames, 96K temporal labels and 2M spatial labels. We benchmark four foundational video understanding tasks: action recognition, action segmentation, object detection and multi-object tracking. Importantly, we analyze their performance and the further reasoning steps for comprehending knowledge in assembly progress, process effici

1 paper · 0 benchmarks · Actions, Images, RGB Video, RGB-D, Tracking, Videos
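
The annotation granularity described above (subject, action verb, manipulated object, target object, tool) can be captured in a simple record type. The sketch below only mirrors the description; the field names, including the frame span, are illustrative and not the released HA-ViD format.

```python
# Minimal sketch of a granulated action annotation record; illustrative only,
# not the official HA-ViD annotation schema.
from dataclasses import dataclass
from typing import Optional

@dataclass
class ActionAnnotation:
    subject: str                  # e.g. "left hand" or "robot"
    verb: str                     # action verb, e.g. "insert"
    manipulated_object: str       # object being moved
    target_object: Optional[str]  # object acted upon, if any
    tool: Optional[str]           # tool used, if any
    start_frame: int              # temporal span (assumed field, for illustration)
    end_frame: int

ann = ActionAnnotation("left hand", "insert", "screw", "gear plate", "screwdriver", 120, 180)
print(ann)
```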

LEARNING STYLE IDENTIFICATION (Learning Style Identification Using Semi-Supervised Self-Taught Labeling)

The dataset was collected from two courses offered on the University of Jordan's E-learning Portal during the second semester of 2020, namely "Computer Skills for Humanities Students" (CSHS) and "Computer Skills for Medical Students" (CSMS). Over the sixteen-week duration of each course, students participated in various activities such as reading materials, video lectures, assignments, and quizzes. To preserve student privacy, the log activity of each student was anonymized. Data was aggregated from multiple sources, including the Moodle learning management system and the student information system, and consolidated into a single database. The dataset contains information on the number of learners and events for each course, as well as their launch and end dates. CSHS had 1,749 learners and 1,139,810 events from January 21, 2020 to May 20, 2020, while CSMS had 564 learners and 484,410 events during the same period. The dataset is based on the Felder and Silverman learning style model (FSLSM).

1 paper · 0 benchmarks · Actions, Texts, Tracking
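
A dataset like this is typically turned into per-learner features by counting events per activity type. The sketch below uses hypothetical column names (learner_id, event_type) for a Moodle-style event log; it is not the published schema.

```python
# Hedged sketch: aggregate per-learner event counts by activity type, the kind
# of feature table a learning-style classifier would start from.
import pandas as pd

log = pd.DataFrame({
    "learner_id": [1, 1, 2, 2, 2],
    "event_type": ["video", "quiz", "video", "reading", "quiz"],
})

features = (
    log.groupby(["learner_id", "event_type"])
       .size()
       .unstack(fill_value=0)   # one row per learner, one column per event type
)
print(features)
```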

MuSoHu (Toward human-like social robot navigation: A large-scale, multi-modal, social human navigation dataset)

A large-scale, egocentric, multimodal, and context-aware dataset of human demonstrations of social navigation.

1 paper · 0 benchmarks · 3D, Actions, LiDAR, Point cloud, RGB-D, Stereo, Videos

SBA (Sequential Brick Assembly Dataset)

The RAD (Randomly Assembled Object Construction) dataset is a synthetic 3D LEGO dataset designed for the task of Sequential Brick Assembly (SBA).

1 paper · 0 benchmarks · 3D, 3D meshes, Actions, Images

RClicks

We conducted a large crowdsourcing study of click patterns in an interactive segmentation scenario and collected 475K real-user clicks. Drawing on ideas from saliency tasks, we developed a clickability model that enables sampling clicks which closely resemble actual user inputs. Using our model and dataset, we propose the RClicks benchmark for a comprehensive comparison of existing interactive segmentation methods on realistic clicks. Specifically, we evaluate not only the average quality of methods, but also their robustness w.r.t. click patterns.

1 paper · 0 benchmarks · Actions, Images, Interactive, Tables, Tabular
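
One way to read "a clickability model that enables sampling clicks": treat a 2-D clickability map as a probability distribution over pixels and draw click positions from it. The toy sketch below illustrates that idea only; it is not the released RClicks code.

```python
# Illustrative sketch: sample a click position from a 2-D "clickability" map
# by normalizing it into a probability distribution over pixels.
import numpy as np

rng = np.random.default_rng(0)

def sample_click(clickability):
    """clickability: (H, W) non-negative array; returns a (row, col) click."""
    p = clickability.ravel().astype(float)
    p /= p.sum()
    idx = rng.choice(p.size, p=p)
    return np.unravel_index(idx, clickability.shape)

heatmap = rng.random((256, 256)) ** 4   # toy map, peaked at a few pixels
print(sample_click(heatmap))
```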

DeformPAM-Dataset (Dataset of DeformPAM)

Two versions of the dataset are offered: one is the full dataset used to train the models in DeformPAM, and the other is a mini dataset for easier examination. Both datasets include data for the supervised and finetuning stages of granular pile shaping, rope shaping, and T-shirt unfolding.

1 paper · 0 benchmarks · Actions, Images, Point cloud

OmniLab

In order to evaluate the effectiveness of NToP in real-world scenarios, we collected a new dataset, OmniLab, with a top-view omnidirectional camera mounted on the ceiling of two different rooms (bedroom, living room) at 2.5 m height. Five actors (3 males, 2 females) perform 15 actions from the CMU-MoCap database (brooming, cleaning windows, down and get up, drinking, fall-on-face, in chair and stand up, pull object, push object, rugpull, turn left, turn right, upbend from knees, upbend from waist, up from ground, walk, walk-old-man) in two rooms with varying clothes. The recorded action length is 2.5 s, which results in 60 images for each scene at a frame rate of 24 FPS. The position of the camera is fixed and the resolution of the images is 1200 by 1200 pixels. A total of 4800 frames are collected. All annotations of 17 keypoints conforming to COCO conventions are estimated through a keypoint detector and subsequently refined by four different humans in two loops to ensure high annotation quality.

1 paper · 0 benchmarks · Actions, Images

MS-HAB-Demonstrations (ManiSkill-HAB Demonstration Datasets)

Whole-body, low-level control/manipulation demonstration dataset for ManiSkill-HAB. Demonstrations are organized by task-subtask-object. All demos use RGBD (128x128) and state. JSON files store metadata (including event labels and success/failure modes), while HDF5 files store the demonstration data.

1 paper · 0 benchmarks · Actions, Images, RGB-D, Replay data
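
A minimal way to pair an HDF5 demonstration file with its JSON metadata is sketched below; the file names and the assumption that top-level HDF5 keys index trajectories are illustrative, not the official MS-HAB layout.

```python
# Hedged sketch: load JSON metadata alongside an HDF5 demo file. File names
# and key structure are assumptions, not the released MS-HAB format.
import json
import h5py

def load_demo(h5_path, json_path):
    with open(json_path) as f:
        meta = json.load(f)          # e.g. event labels, success/failure mode
    with h5py.File(h5_path, "r") as f:
        keys = list(f.keys())        # inspect stored trajectories without assuming exact names
    return meta, keys

# meta, keys = load_demo("demo_0.h5", "demo_0.json")
```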

MIKASA-Robo Dataset

1 paper · 0 benchmarks · Actions, Images, Replay data

LSDBench (Long-video Sampling Dilemma Benchmark)

A benchmark that focuses on the sampling dilemma in long-video tasks. The LSDBench dataset is designed to evaluate the sampling efficiency of long-video VLMs. It consists of multiple-choice question-answer pairs based on hour-long videos, focusing on dense and short-duration actions with high Necessary Sampling Density (NSD).

1 paper · 0 benchmarks · Actions, Images, Texts, Videos
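
The Necessary Sampling Density idea can be made concrete with a back-of-the-envelope calculation: with N uniformly spaced samples from an hour-long video, the expected number of samples landing inside a short action window is N x (action duration) / (video duration). A small sketch:

```python
# Back-of-the-envelope illustration of the sampling dilemma: how many uniform
# samples from an hour-long video are expected to fall inside a short action.
def expected_hits(video_s=3600, n_samples=64, action_s=5):
    """Expected number of uniform samples landing inside an `action_s` window."""
    return n_samples * action_s / video_s

for n in (64, 512, 4096):
    print(f"{n:5d} samples -> {expected_hits(n_samples=n):.2f} expected frames in a 5 s action")
```
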
Page 3 of 4