19,997 machine learning datasets
FractureAtlas is a musculoskeletal bone fracture dataset with annotations for deep learning tasks such as classification, localization, and segmentation. It contains 4,083 X-ray images with annotations in COCO, VGG, YOLO, and Pascal VOC formats. The dataset is freely available for any purpose under a CC-BY 4.0 license: the data may be copied, shared, or redistributed in any medium or format, and may be adapted, remixed, transformed, and built upon. Using the dataset correctly requires knowledge of medicine and radiology to interpret results and draw conclusions. Users should also consider the possibility of labeling errors.
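The relationship between two of the annotation formats named above can be sketched in a few lines. This is a minimal illustration, not the dataset's own tooling (FractureAtlas already ships both formats). It assumes the standard conventions: COCO boxes are `[x_min, y_min, width, height]` in pixels, and YOLO boxes are `(x_center, y_center, width, height)` normalized by image size. The function name `coco_to_yolo` is hypothetical.

```python
from typing import Sequence, Tuple


def coco_to_yolo(
    bbox: Sequence[float], img_w: float, img_h: float
) -> Tuple[float, float, float, float]:
    """Convert a COCO-style box to a YOLO-style box (hypothetical helper).

    bbox is [x_min, y_min, width, height] in pixels.
    The result is (x_center, y_center, width, height), each divided by
    the image width or height so that values fall in [0, 1].
    """
    x_min, y_min, w, h = bbox
    x_center = (x_min + w / 2) / img_w
    y_center = (y_min + h / 2) / img_h
    return (x_center, y_center, w / img_w, h / img_h)


# Example: a 200x100 px box centered in a 400x200 px image.
box = coco_to_yolo([100, 50, 200, 100], img_w=400, img_h=200)
```

A YOLO label file would additionally prefix each box with a class index, which is omitted here.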
Text2KGBench is a benchmark to evaluate the capabilities of language models to generate knowledge graphs (KGs) from natural language text guided by an ontology. Given an input ontology and a set of sentences, the task is to extract facts from the text while complying with the given ontology (concepts, relations, domain/range constraints) and being faithful to the input sentences.
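To make the domain/range constraint concrete, the following sketch checks whether an extracted triple conforms to a toy ontology. It is an illustration under stated assumptions, not the benchmark's evaluation code: the data layout (a dict mapping each relation to a `(domain, range)` pair, and a dict mapping entities to concept types), the function name `complies`, and the example ontology are all hypothetical.

```python
from typing import Dict, Tuple

Triple = Tuple[str, str, str]  # (subject, relation, object)


def complies(
    triple: Triple,
    relations: Dict[str, Tuple[str, str]],
    entity_types: Dict[str, str],
) -> bool:
    """Return True if the triple uses a known relation and its subject and
    object have the concept types required by that relation's domain/range."""
    subj, rel, obj = triple
    if rel not in relations:
        return False  # relation is not defined in the ontology
    domain, range_ = relations[rel]
    return entity_types.get(subj) == domain and entity_types.get(obj) == range_


# Hypothetical toy ontology: "director" links a Film to a Person.
relations = {"director": ("Film", "Person")}
entity_types = {"Inception": "Film", "Christopher Nolan": "Person"}

ok = complies(("Inception", "director", "Christopher Nolan"), relations, entity_types)
bad = complies(("Christopher Nolan", "director", "Inception"), relations, entity_types)
```

This covers ontology compliance only. Faithfulness to the input sentences is a separate check that this sketch does not attempt.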
Real-CE is a real-world Chinese-English benchmark dataset for scene text image super-resolution (STISR), with an emphasis on restoring structurally complex Chinese characters. The benchmark provides 1,935/783 real LR-HR text image pairs (33,789 text lines in total) for training/testing in 2× and 4× zooming modes, complemented by detailed annotations, including detection boxes and text transcripts.
LPR4M is a large-scale live commerce dataset, offering a significantly broader coverage of categories and diverse modalities such as video, image, and text. It contains 4M exactly matched〈clip, image〉pairs of 4M live clips, and 332k shop images. Each image has 12 clips with different product variations, e.g., viewpoint, scale, and occlusion.
PIPPA (Personal Interaction Pairs between People and AI) is a partially-synthetic dataset. The dataset comprises over 1 million utterances that are distributed across 26,000 conversation sessions and provides a rich resource for researchers and AI developers to explore and refine conversational AI systems in the context of role-play scenarios.
PUMaVOS is a dataset of challenging and practical use cases inspired by the movie production industry.
This repository contains the Zurich Transit Bus (ZTBus) dataset, which consists of data recorded during driving missions of electric city buses in Zurich, Switzerland. The data was collected over several years on two trolley buses as part of multiple research projects. It involves more than a thousand missions spanning across all seasons, each mission usually covering a full day of real operation. The ZTBus dataset contains detailed information on the vehicle’s power demand, propulsion system, odometry, global position, ambient temperature, door openings, number of passengers, dispatch patterns within the public transportation network, etc. All signals are synchronized in time and include an absolute timestamp in tabular form. The dataset can be used as a foundation for a variety of studies and analyses. For example, the data can serve as a basis for simulations to estimate the performance of different public transit vehicle types, or to evaluate and optimize control strategies of hybr
Sparrow-V0: A Reinforcement Learning Friendly Simulator for Mobile Robot
Spatial LibriSpeech is a spatial audio dataset with over 650 hours of 19-channel audio, first-order ambisonics, and optional distractor noise. It is designed for machine learning model training and includes labels for source position, speaking direction, room acoustics, and geometry.
A dataset for position-constrained robot grasp planning.
StoryBench is a multi-task benchmark to reliably evaluate the ability of text-to-video models to generate stories from a sequence of captions and their duration. It includes three datasets (DiDeMo, Oops, UVO) and three video generation tasks of increasing difficulty: action execution, where the next action must be generated starting from a conditioning video; story continuation, where a sequence of actions must be executed starting from a conditioning video; and story generation, where a video must be generated from only text prompts.
We collected 32 videos that record bee colony activity from different periods on several sunny days. The total size of the dataset is 3,562 frames and 43,169 annotations.
Temporal Video Inpainting Localization Dataset.
The DGraphFin dataset, pre-processed in TGN style.
dacl10k stands for damage classification 10k images and is a multi-label semantic segmentation dataset for 19 classes (13 damages and 6 objects) present on bridges.
Video sequences recorded in a field on Campus Kleinaltendorf (CKA), University of Bonn, by BonBot-I, an autonomous weeding robot. The data was captured with an Intel RealSense D435i sensor mounted with a nadir view of the ground.
This dataset contains 40,000 URLs of US federal environmental agency websites, along with links to captures in the Internet Archive Wayback Machine for 2016 and 2020 when present. It also contains the prevalence of 56 environmental terms and phrases and how the presence of those terms on the webpages changed from 2016 to 2020.
Sign languages are the primary means of communication for a large number of people worldwide. Recently, the availability of sign language translation datasets has facilitated the incorporation of sign language research in the NLP community. Though a wide variety of research focuses on improving translation systems for sign language, the lack of ample annotated resources hinders research in the data-driven natural language processing community. In this resource paper, we introduce ISLTranslate, a translation dataset for continuous Indian Sign Language (ISL), consisting of 30k ISL-English sentence pairs. To the best of our knowledge, it is the first and largest translation dataset for continuous Indian Sign Language with corresponding English transcripts. We provide a detailed analysis of the dataset and examine the distribution of words and phrases covered in the proposed dataset. To validate the performance of existing end-to-end sign language to spoken language translation systems, w
This is the dataset for "A Time Series Is Worth 64 Words: Long-Term Forecasting with Transformers". We evaluate the performance of our proposed PatchTST on 8 popular datasets: Weather, Traffic, Electricity, ILI, and 4 ETT datasets (ETTh1, ETTh2, ETTm1, ETTm2). These datasets have been used extensively for benchmarking and are publicly available (Wu et al., 2021). The statistics of those datasets are summarized in Table 2. We would like to highlight several large datasets: Weather, Traffic, and Electricity. They contain many more time series, so results on them are more stable and less susceptible to overfitting than results on the smaller datasets.
The Mpox Close Skin Images dataset (MCSI) is a collection of skin images obtained from diverse public sources. We carefully pre-processed the images (i.e., cropped and zoomed) to focus on the skin lesion (if present), so that they can be used to evaluate machine learning models aimed at detecting different pathologies from skin lesion pictures taken with smartphone cameras. It includes a total of 400 pictures evenly divided into 4 classes: mpox, which contains samples of mpox (formerly monkeypox) skin lesions; chickenpox, with samples of chickenpox cases; acne, containing samples of acne at different severity levels; and healthy, which contains samples of skin without any evident symptoms. This repository is part of the supplementary material accompanying the paper "A Transfer Learning and Explainable Solution to Detect mpox from Smartphones images".