Papers With Code 2

A community resource for machine learning research: papers, code, benchmarks, and state-of-the-art results.


Data sourced from the PWC Archive (CC-BY-SA 4.0). Built by the community, for the community.

Datasets

19,997 machine learning datasets

Filter by Modality

  • Images (3,275)
  • Texts (3,148)
  • Videos (1,019)
  • Audio (486)
  • Medical (395)
  • 3D (383)
  • Time series (298)
  • Graphs (285)
  • Tabular (271)
  • Speech (199)
  • RGB-D (192)
  • Environment (148)
  • Point cloud (135)
  • Biomedical (123)
  • LiDAR (95)
  • RGB Video (87)
  • Tracking (78)
  • Biology (71)
  • Actions (68)
  • 3D meshes (65)
  • Tables (52)
  • Music (48)
  • EEG (45)
  • Hyperspectral images (45)
  • Stereo (44)
  • MRI (39)
  • Physics (32)
  • Interactive (29)
  • Dialog (25)
  • MIDI (22)
  • 6D (17)
  • Replay data (11)
  • Financial (10)
  • Ranking (10)
  • CAD (9)
  • fMRI (7)
  • Parallel (6)
  • Lyrics (2)
  • PSG (2)

19,997 dataset results

CMU Book Summary Dataset

This dataset contains plot summaries for 16,559 books extracted from Wikipedia, along with aligned metadata from Freebase, including book author, title, and genre.

3 papers · 0 benchmarks · Texts

RGB Arabic Alphabets Sign Language Dataset

This paper introduces the RGB Arabic Alphabet Sign Language (AASL) dataset. AASL comprises 7,856 raw and fully labeled RGB images of the Arabic sign language alphabet and is, to the best of our knowledge, the first publicly available RGB dataset of its kind. The dataset aims to help those interested in developing real-life Arabic sign language classification models. AASL was collected from more than 200 participants under different settings of lighting, background, image orientation, image size, and image resolution. Experts in the field supervised, validated, and filtered the collected images to ensure a high-quality dataset. AASL is available to the public on Kaggle.

3 papers · 0 benchmarks · Images

ACL-Fig

ACL-Fig is a large-scale automatically annotated corpus consisting of 112,052 scientific figures extracted from 56K research papers in the ACL Anthology. The ACL-Fig-pilot dataset contains 1,671 manually labeled scientific figures belonging to 19 categories.

3 papers · 0 benchmarks · Images, Texts

SkinCon (SKIN Concepts Dataset)

SkinCon is a skin disease dataset densely annotated by dermatologists. SkinCon includes 3,230 images from the Fitzpatrick 17k skin disease (Fitzpatrick Skin Tone) dataset, densely labelled with 48 clinical concepts, 22 of which have at least 50 images representing the concept. The concepts were chosen by two dermatologists considering the clinical descriptor terms used to describe skin lesions; examples include "plaque", "scale", and "erosion".

3 papers · 0 benchmarks · Images, Medical

SurgT

SurgT is a dataset for benchmarking 2D Trackers in Minimally Invasive Surgery (MIS). It contains a total of 157 stereo endoscopic videos from 20 clinical cases, along with stereo camera calibration parameters.

3 papers · 0 benchmarks · Medical, Videos

Apnea-ECG (PhysioNet Apnea-ECG Database)

The data consist of 70 records, divided into a learning set of 35 records (a01 through a20, b01 through b05, and c01 through c10) and a test set of 35 records (x01 through x35). Recordings vary in length from slightly less than 7 hours to nearly 10 hours each. Each recording includes a continuous digitized ECG signal, a set of apnea annotations (derived by human experts on the basis of simultaneously recorded respiration and related signals), and a set of machine-generated QRS annotations (in which all beats, regardless of type, have been labeled normal). In addition, eight recordings (a01 through a04, b01, and c01 through c03) are accompanied by four additional signals: Resp C and Resp A, chest and abdominal respiratory effort signals obtained using inductance plethysmography; Resp N, oronasal airflow measured using nasal thermistors; and SpO2, oxygen saturation.

3 papers · 9 benchmarks · Medical, Time series
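The record naming scheme described above (a01 through a20, b01 through b05, c01 through c10 for the learning set; x01 through x35 for the test set) can be sketched in plain Python. This only enumerates the names; fetching the actual PhysioNet files is not shown.

```python
# Enumerate the Apnea-ECG record names for the learning and test sets.
def record_names(prefix, count):
    """Generate zero-padded record names like 'a01' ... 'a20'."""
    return [f"{prefix}{i:02d}" for i in range(1, count + 1)]

learning_set = record_names("a", 20) + record_names("b", 5) + record_names("c", 10)
test_set = record_names("x", 35)

# Both splits contain 35 records, for 70 records total.
assert len(learning_set) == 35 and len(test_set) == 35
```
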

TTStroke-21 ME22 (TTStroke-21 for MediaEval 2022)

TTStroke-21 for MediaEval 2022. The task is of interest to researchers in the areas of machine learning (classification), visual content analysis, computer vision, and sport performance. We particularly encourage researchers working on computer-aided analysis of sport performance.

3 papers · 3 benchmarks

Tasksource

Hugging Face Datasets is a great library, but it lacks standardization, and datasets require preprocessing work before they can be used interchangeably. tasksource automates this preprocessing and facilitates reproducible multi-task learning at scale.

3 papers · 0 benchmarks · Texts
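The standardization problem described above can be illustrated with a minimal stdlib-only sketch. This is not the tasksource API; the field names and example rows below are hypothetical, and the point is only that heterogeneous column names must be mapped onto one shared schema before datasets can be mixed for multi-task training.

```python
# Hypothetical illustration of the preprocessing tasksource automates:
# different datasets expose the same kind of task under different column names.
def standardize(record, mapping):
    """Rename a dataset record's fields to a shared (sentence1, sentence2, labels) schema."""
    return {target: record[source] for target, source in mapping.items()}

# Two hypothetical NLI-style rows with incompatible column names.
rte_row = {"premise": "A cat sleeps.", "hypothesis": "An animal rests.", "label": 0}
qnli_row = {"question": "What rests?", "sentence": "An animal rests.", "gold": 1}

unified = [
    standardize(rte_row, {"sentence1": "premise", "sentence2": "hypothesis", "labels": "label"}),
    standardize(qnli_row, {"sentence1": "question", "sentence2": "sentence", "labels": "gold"}),
]

# All rows now share one schema, so they can be pooled for multi-task learning.
assert all(set(row) == {"sentence1", "sentence2", "labels"} for row in unified)
```
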

TTStroke-21 ME21 (TTStroke-21 for MediaEval 2021)

This task offers researchers an opportunity to test their fine-grained classification methods for detecting and recognizing strokes in table tennis videos. (The low inter-class variability makes the task more difficult than with general-purpose datasets such as UCF-101.) The task comprises two subtasks.

3 papers · 3 benchmarks · RGB Video

FS-Mol

A Few-Shot Learning Dataset of Molecules.

3 papers · 0 benchmarks

DIVOTrack

DIVOTrack is a cross-view multi-object tracking dataset for DIVerse Open scenes, featuring densely tracked pedestrians in realistic, non-experimental environments. DIVOTrack has ten distinct scenarios and 550 cross-view tracks.

3 papers · 0 benchmarks · Videos

TT100K (Tsinghua-Tencent 100K (official training and testing set))

Training and testing data: the original training set includes 6,105 images, and the original testing set includes 3,071 images.

3 papers · 1 benchmark

S-VED (Sacrobosco Visual Element Dataset)

The Sacrobosco Visual Elements Dataset (S-VED) is derived from 359 Sphaera editions, centered on the Tractatus de sphaera by Johannes de Sacrobosco (d. 1256) and printed between 1472 and 1650. The Sphaera editions were primarily used to teach geocentric astronomy to university students across Europe. Their visual elements therefore played an essential role in visualizing the ideas, messages, and concepts that the texts transmitted. As a precondition for studying the relation between text and visual elements, a time-consuming image labelling process was conducted as part of "The Sphere" project (https://sphaera.mpiwg-berlin.mpg.de) in order to extract and label the visual elements from the 76,000 pages of the corpus. This process resulted in the Extended Sacrobosco Visual Elements Dataset (S-VED-X), of which S-VED is a subset. Due to copyright reasons, only S-VED is made publicly available. S-VED consists of 4,000 pages, of which 2,040 contain a total of 2,927 visual elements.

3 papers · 0 benchmarks · Images

Video Localized Narratives

Video Localized Narratives is a new form of multimodal video annotations connecting vision and language. The annotations are created from videos with Localized Narratives, capturing even complex events involving multiple actors interacting with each other and with several passive objects. It contains annotations of 20k videos of the OVIS, UVO, and Oops datasets, totalling 1.7M words.

3 papers · 0 benchmarks · Texts

MultiQ

MultiQ is a multi-hop QA dataset for Russian, suitable for general open-domain question answering, information retrieval, and reading comprehension tasks.

3 papers · 1 benchmark · Texts

UESTC-MMEA-CL (A multi-modal egocentric activity dataset for continual learning)

UESTC-MMEA-CL is a multi-modal activity dataset for continual egocentric activity recognition, proposed to promote future studies on continual learning for first-person activity recognition in wearable applications. The dataset provides vision data with auxiliary inertial sensor data, as well as comprehensive and complex daily activity categories for continual learning research. UESTC-MMEA-CL comprises 30.4 hours of fully synchronized first-person video clips, acceleration streams, and gyroscope data in total. There are 32 activity classes in the dataset, and each class contains approximately 200 samples. We divide the samples of each class into training, validation, and test sets according to a 7:2:1 ratio. For continual learning evaluation, we present three settings of incremental steps: the 32 classes are divided into {16, 8, 4} incremental steps, where each step contains {2, 4, 8} activity classes, respectively.

3 papers · 0 benchmarks · Actions, RGB Video
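The incremental-step protocol described above (32 classes divided into {16, 8, 4} steps of {2, 4, 8} classes each) can be sketched as a simple partition of class indices. This is an illustrative stdlib-only sketch of the splitting scheme, not the dataset's official tooling.

```python
# Partition the 32 activity classes into equal consecutive incremental steps,
# as in the {16, 8, 4}-step continual learning settings described above.
def incremental_steps(num_classes, num_steps):
    """Split class indices 0..num_classes-1 into num_steps equal consecutive groups."""
    per_step = num_classes // num_steps
    return [list(range(i, i + per_step)) for i in range(0, num_classes, per_step)]

# Each setting yields num_steps groups of 32 // num_steps classes each.
for steps in (16, 8, 4):
    split = incremental_steps(32, steps)
    assert len(split) == steps and all(len(g) == 32 // steps for g in split)
```
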

FMD (materials) (Flickr Material Dataset)

Sharan, Lavanya, Ruth Rosenholtz, and Edward Adelson. "Material perception: What can you see in a brief glance?" Journal of Vision 9.8 (2009): 784. http://people.csail.mit.edu/celiu/CVPR2010/FMD/FMD.zip

3 papers · 1 benchmark

OVRseen

https://athinagroup.eng.uci.edu/projects/ovrseen/

3 papers · 0 benchmarks

ATM’22

ATM'22 is a multi-site, multi-domain dataset for pulmonary airway segmentation. It contains large-scale CT scans with detailed pulmonary airways annotation, including 500 CT scans (300 for training, 50 for validation, and 150 for testing). The dataset was collected from different sites and it further included a portion of noisy COVID19 CTs with ground-glass opacity and consolidation.

3 papers · 0 benchmarks · Medical

BUAA-MIHR dataset (Large-scale-Multi-illumination-HR-Database)

The BUAA-MIHR dataset is a remote photoplethysmography (rPPG) dataset for evaluating rPPG pipelines under multi-illumination conditions. We recruited 15 healthy subjects (12 male, 3 female, aged 18 to 30) for this experiment, and a total of 165 video sequences were recorded under various illuminations. The experiments were conducted in a darkroom to isolate the recordings from ambient light.

3 papers · 0 benchmarks · Videos
Page 282 of 1000