This dataset contains plot summaries for 16,559 books extracted from Wikipedia, along with aligned metadata from Freebase, including book author, title, and genre.
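A minimal loading sketch, assuming the tab-separated booksummaries.txt layout from the dataset's release (column order as documented there: Wikipedia ID, Freebase ID, title, author, publication date, genres as a JSON map, plot summary):

```python
import csv
import json
import pandas as pd

# Column layout documented in the CMU Book Summary dataset release.
cols = ["wiki_id", "freebase_id", "title", "author",
        "pub_date", "genres", "summary"]
books = pd.read_csv("booksummaries.txt", sep="\t", names=cols,
                    quoting=csv.QUOTE_NONE)  # summaries contain quote chars

# The Freebase genre field is a JSON map of genre IDs to genre names.
books["genres"] = books["genres"].apply(
    lambda g: list(json.loads(g).values()) if pd.notna(g) else [])
print(books[["title", "author", "genres"]].head())
```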
This paper introduces the RGB Arabic Alphabet Sign Language (AASL) dataset. AASL comprises 7,856 raw, fully labeled RGB images of the Arabic sign language alphabet and is, to the best of our knowledge, the first publicly available RGB dataset of its kind. The dataset is intended to help those interested in developing real-life Arabic sign language classification models. AASL was collected from more than 200 participants under varied conditions of lighting, background, image orientation, image size, and image resolution. Experts in the field supervised, validated, and filtered the collected images to ensure a high-quality dataset. AASL is made available to the public on Kaggle.
ACL-Fig is a large-scale automatically annotated corpus consisting of 112,052 scientific figures extracted from 56K research papers in the ACL Anthology. The ACL-Fig-pilot dataset contains 1,671 manually labeled scientific figures belonging to 19 categories.
SkinCon is a skin disease dataset densely annotated by dermatologists. SkinCon includes 3,230 images from the Fitzpatrick 17k skin disease dataset, densely labelled with 48 clinical concepts, 22 of which have at least 50 images representing the concept. The concepts were chosen by two dermatologists based on the clinical descriptor terms used to describe skin lesions. Examples include "plaque", "scale", and "erosion".
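For illustration, a hedged sketch of recovering that 22-concept subset, assuming a hypothetical one-hot annotation table (the file and column names below are not from the source):

```python
import pandas as pd

# Hypothetical annotation table: one row per image, one binary column
# per clinical concept (e.g. "plaque", "scale", "erosion").
ann = pd.read_csv("skincon_annotations.csv")
concept_cols = [c for c in ann.columns if c != "image_id"]

# Recover the subset of concepts with at least 50 positive images,
# mirroring the 22-concept subset described above.
counts = ann[concept_cols].sum()
frequent = counts[counts >= 50].sort_values(ascending=False)
print(f"{len(frequent)} concepts with >= 50 images")
print(frequent.head())
```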
SurgT is a dataset for benchmarking 2D Trackers in Minimally Invasive Surgery (MIS). It contains a total of 157 stereo endoscopic videos from 20 clinical cases, along with stereo camera calibration parameters.
The data consist of 70 records, divided into a learning set of 35 records (a01 through a20, b01 through b05, and c01 through c10), and a test set of 35 records (x01 through x35), all of which may be downloaded from this page. Recordings vary in length from slightly less than 7 hours to nearly 10 hours each. Each recording includes a continuous digitized ECG signal, a set of apnea annotations (derived by human experts on the basis of simultaneously recorded respiration and related signals), and a set of machine-generated QRS annotations (in which all beats regardless of type have been labeled normal). In addition, eight recordings (a01 through a04, b01, and c01 through c03) are accompanied by four additional signals (Resp C and Resp A, chest and abdominal respiratory effort signals obtained using inductance plethysmography; Resp N, oronasal airflow measured using nasal thermistors; and SpO2, oxygen saturation).
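The records and both annotation sets can be read directly from PhysioNet with the wfdb Python package; a minimal sketch, assuming the database's documented 'apnea-ecg' directory name and .apn/.qrs annotation extensions:

```python
import wfdb

# Read record a01 and its two annotation sets straight from PhysioNet.
record = wfdb.rdrecord("a01", pn_dir="apnea-ecg")     # continuous ECG
apnea = wfdb.rdann("a01", "apn", pn_dir="apnea-ecg")  # expert apnea labels
qrs = wfdb.rdann("a01", "qrs", pn_dir="apnea-ecg")    # machine QRS marks

print(record.fs, record.sig_name)  # sampling frequency and signal names
print(apnea.symbol[:10])           # per-minute 'A' (apnea) / 'N' (normal)
```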
TTStroke-21 for MediaEval 2022. The task is of interest to researchers in machine learning (classification), visual content analysis, computer vision, and sport performance. We particularly encourage researchers working on computer-aided analysis of sport performance.
Hugging Face Datasets is a great library, but it lacks standardization, and datasets require preprocessing work before they can be used interchangeably. tasksource automates this preprocessing and facilitates scalable, reproducible multi-task learning.
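As a sketch of the intended workflow, the snippet below follows the list_tasks/load_task helpers shown in the tasksource README; exact names and columns may differ across versions:

```python
from tasksource import list_tasks, load_task

# Enumerate the preprocessed tasks and load a few classification ones.
df = list_tasks()
for task_id in df[df.task_type == "Classification"].id[:3]:
    dataset = load_task(task_id)  # a ready-to-use datasets.Dataset
    print(task_id, dataset)
```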
This task offers researchers an opportunity to test their fine-grained classification methods for detecting and recognizing strokes in table tennis videos. The low inter-class variability makes the task more difficult than on general-purpose datasets such as UCF-101. The task offers two subtasks: stroke detection and stroke classification.
A Few-Shot Learning Dataset of Molecules.
DIVOTrack is a cross-view multi-object tracking dataset for DIVerse Open scenes, featuring densely tracked pedestrians in realistic, non-experimental environments. It comprises ten distinct scenarios and 550 cross-view tracks.
Training and testing data: the original training set includes 6,105 images, and the original testing set includes 3,071 images.
The Sacrobosco Visual Elements Dataset (S-VED) is derived from 359 Sphaera editions, centered on the Tractatus de sphaera by Johannes de Sacrobosco (d. 1256) and printed between 1472 and 1650. The Sphaera editions were primarily used to teach geocentric astronomy to university students across Europe. Their visual elements therefore played an essential role in conveying the ideas, messages, and concepts that the texts transmitted. As a precondition for studying the relation between text and visual elements, a time-consuming image labelling process was conducted as part of “The Sphere” project (https://sphaera.mpiwg-berlin.mpg.de) to extract and label the visual elements from the 76,000 pages of the corpus. This process resulted in the Extended Sacrobosco Visual Elements Dataset (S-VED_X), of which S-VED is a subset. For copyright reasons, only S-VED is made publicly available. S-VED consists of 4,000 pages, of which 2,040 contain a total of 2,927 visual elements.
Video Localized Narratives is a new form of multimodal video annotations connecting vision and language. The annotations are created from videos with Localized Narratives, capturing even complex events involving multiple actors interacting with each other and with several passive objects. It contains annotations of 20k videos of the OVIS, UVO, and Oops datasets, totalling 1.7M words.
MultiQ is a multi-hop QA dataset for Russian, suitable for general open-domain question answering, information retrieval, and reading comprehension tasks.
UESTC-MMEA-CL is a new multi-modal activity dataset for continual egocentric activity recognition, proposed to promote future studies on continual learning for first-person activity recognition in wearable applications. Our dataset provides not only vision data with auxiliary inertial sensor data but also comprehensive and complex daily activity categories for continual learning research. UESTC-MMEA-CL comprises 30.4 hours of fully synchronized first-person video clips, acceleration streams, and gyroscope data. The dataset has 32 activity classes, each containing approximately 200 samples, which we divide into training, validation, and test sets in a 7:2:1 ratio. For continual learning evaluation, we present three incremental settings: the 32 classes are divided into {16, 8, 4} incremental steps, with each step containing {2, 4, 8} activity classes, respectively.
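A minimal sketch of how the three incremental settings partition the 32 classes (the contiguous class ordering here is an assumption; the dataset may prescribe its own):

```python
NUM_CLASSES = 32

def incremental_schedule(num_steps, num_classes=NUM_CLASSES):
    """Partition class indices into equally sized incremental steps."""
    per_step = num_classes // num_steps
    return [list(range(i * per_step, (i + 1) * per_step))
            for i in range(num_steps)]

# The three settings described above: {16, 8, 4} steps of {2, 4, 8} classes.
for steps in (16, 8, 4):
    schedule = incremental_schedule(steps)
    print(f"{steps} steps x {len(schedule[0])} classes per step; "
          f"first step = {schedule[0]}")
```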
Sharan, Lavanya, Ruth Rosenholtz, and Edward Adelson. "Material perception: What can you see in a brief glance?" Journal of Vision 9.8 (2009): 784. http://people.csail.mit.edu/celiu/CVPR2010/FMD/FMD.zip
https://athinagroup.eng.uci.edu/projects/ovrseen/
ATM'22 is a multi-site, multi-domain dataset for pulmonary airway segmentation. It contains large-scale CT scans with detailed pulmonary airway annotations, comprising 500 scans (300 for training, 50 for validation, and 150 for testing). The dataset was collected from multiple sites and further includes a portion of noisy COVID-19 CT scans exhibiting ground-glass opacity and consolidation.
BUAA-MIHR is a remote photoplethysmography (rPPG) dataset for evaluating rPPG pipelines under multi-illumination conditions. We recruited 15 healthy subjects (12 male, 3 female, aged 18 to 30) and recorded a total of 165 video sequences under various illuminations. The experiments were conducted in a darkroom to isolate the recordings from ambient light.