19,997 machine learning datasets
We introduce a new dataset, Watch and Learn Time-lapse (WALT), consisting of multiple (4K and 1080p) cameras capturing urban environments over a year.
RefMatte is the first large-scale, challenging dataset for the task of referring image matting. It is generated by a comprehensive image composition and expression generation engine built on top of current public high-quality matting foregrounds, with flexible logic and re-labelled diverse attributes. RefMatte consists of 230 object categories, 47,500 images, 118,749 expression-region entities, and 474,996 expressions, and can easily be extended in the future.
The WikiTables-TURL dataset was constructed by the authors of TURL and is based on the WikiTable corpus, which is a large collection of Wikipedia tables. The dataset consists of 580,171 tables divided into fixed training, validation and testing splits. Additionally, the dataset contains metadata about each table, such as the table name, table caption and column headers.
This dataset is a combination of the following three datasets: figshare, the SARTAJ dataset, and Br35H.
We introduce ArtBench-10, the first class-balanced, high-quality, cleanly annotated, and standardized dataset for benchmarking artwork generation. It comprises 60,000 images of artwork from 10 distinctive artistic styles, with 5,000 training images and 1,000 testing images per style. ArtBench-10 has several advantages over previous artwork datasets. Firstly, it is class-balanced, while most previous artwork datasets suffer from long-tailed class distributions. Secondly, the images are of high quality with clean annotations. Thirdly, ArtBench-10 is created with standardized data collection, annotation, filtering, and preprocessing procedures. We provide three versions of the dataset with different resolutions (32×32, 256×256, and original image size), formatted so that they are easy to incorporate into popular machine learning frameworks.
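The ArtBench-10 split sizes quoted above can be checked with a few lines of arithmetic. This is only a sketch: the function name `split_totals` and the constant names are hypothetical and not part of any ArtBench-10 tooling; the numbers come from the description.

```python
# Figures taken from the ArtBench-10 description; names are illustrative only.
NUM_STYLES = 10
TRAIN_PER_STYLE = 5_000
TEST_PER_STYLE = 1_000


def split_totals(num_classes, train_per_class, test_per_class):
    """Return train, test, and overall image counts for a class-balanced dataset."""
    train = num_classes * train_per_class
    test = num_classes * test_per_class
    return {"train": train, "test": test, "total": train + test}


totals = split_totals(NUM_STYLES, TRAIN_PER_STYLE, TEST_PER_STYLE)
print(totals)  # {'train': 50000, 'test': 10000, 'total': 60000}
```

The total of 60,000 agrees with the image count stated in the description.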
To benchmark Bengali digit recognition algorithms, a large publicly available dataset is required which is free from biases originating from geographical location, gender, and age. With this aim in mind, NumtaDB, a dataset consisting of more than 85,000 images of hand-written Bengali digits, has been assembled.
SMAC+ offensive near scenario with 20 parallel episodic buffers.
SMAC+ offensive distant scenario.
SMAC+ offensive complicated scenario with 20 parallel episodic buffers.
Vehicle-to-Everything (V2X) networks have enabled collaborative perception in autonomous driving, which is a promising solution to the fundamental defects of stand-alone intelligence, including blind zones and limited long-range perception. However, the lack of datasets has severely hindered the development of collaborative perception algorithms. In this work, we release DOLPHINS: Dataset for cOllaborative Perception enabled Harmonious and INterconnected Self-driving, a new simulated, large-scale, various-scenario, multi-view, multi-modality autonomous driving dataset, which provides a ground-breaking benchmark platform for interconnected autonomous driving. DOLPHINS outperforms current datasets in six dimensions: temporally-aligned images and point clouds from both vehicles and Road Side Units (RSUs) enabling both Vehicle-to-Vehicle (V2V) and Vehicle-to-Infrastructure (V2I) based collaborative perception; 6 typical scenarios with dynamic weather conditions make the most various interconnected auton
The dataset published here is the largest, most diverse and consistent crack segmentation dataset constructed so far. It contains 9,255 images combined from smaller open-source datasets. It consists of 10 sub-datasets, preprocessed and resized to 400×400, namely Crack500, Deepcrack, Sdnet, Cracktree, Gaps, Volker, Rissbilder, Noncrack, Masonry and Ceramic.
FathomNet is an open-source image database that can be used to train, test, and validate state-of-the-art artificial intelligence algorithms to help us understand our ocean and its inhabitants. Inspired by annotated image databases such as ImageNet and COCO, FathomNet aims to establish the same kind of reference data set for images of ocean life. The long-term goal of FathomNet is to aggregate >1k fully annotated and localized images per marine species of Animalia (>200k), with the ability to expand and include other underwater concepts (e.g., substrate type, equipment, debris, etc.) for training and validating machine learning models. We hope that contributions from the broader community will realize our goals for FathomNet.
Recent salient object detection (SOD) methods based on deep neural networks have achieved remarkable performance. However, most existing SOD models designed for low-resolution input perform poorly on high-resolution images due to the contradiction between the sampling depth and the receptive field size. To resolve this contradiction, we propose a novel one-stage framework called Pyramid Grafting Network (PGNet), which uses transformer and CNN backbones to extract features from images of different resolutions independently and then grafts the features from the transformer branch onto the CNN branch. An attention-based Cross-Model Grafting Module (CMGM) is proposed to enable the CNN branch to combine broken detailed information more holistically, guided by different source features during the decoding process. Moreover, we design an Attention Guided Loss (AGL) to explicitly supervise the attention matrix generated by CMGM to help the network better interact with the attention from different models. We cont
CocoChorales is a dataset consisting of over 1400 hours of audio mixtures containing four-part chorales performed by 13 instruments, all synthesized with realistic-sounding generative models. CocoChorales contains mixes, sources, and MIDI data, as well as annotations for note expression (e.g., per-note volume and vibrato) and synthesis parameters (e.g., multi-f0).
H-DIBCO 2016 is the International Handwritten Document Image Binarization Contest organized in the context of the ICFHR 2016 conference.
DIBCO 2011 is the International Document Image Binarization Contest organized in the context of the ICDAR 2011 conference. The general objective of the contest is to identify current advances in document image binarization for both machine-printed and handwritten document images using evaluation performance measures that conform to document image analysis and recognition.
Prediction of Finger Flexion (IV Brain-Computer Interface Data Competition). The goal of this dataset is to predict the flexion of individual fingers from signals recorded from the surface of the brain (electrocorticography (ECoG)). This data set contains brain signals from three subjects, as well as the time courses of the flexion of each of five fingers. The task in this competition is to use the provided flexion information in order to predict finger flexion for a provided test set. The performance of the classifier will be evaluated by calculating the average correlation coefficient r between actual and predicted finger flexion.
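The evaluation measure described above, the average correlation coefficient r between actual and predicted flexion, can be sketched as follows. This is a minimal illustration, not the competition's official scoring code: it assumes r is the Pearson correlation, computed per finger and then averaged with equal weight, and the function names and the dictionary layout (finger name mapped to a time course) are hypothetical.

```python
from math import sqrt


def pearson_r(x, y):
    """Pearson correlation coefficient between two equal-length sequences."""
    n = len(x)
    if n != len(y) or n < 2:
        raise ValueError("need two sequences of equal length >= 2")
    mean_x = sum(x) / n
    mean_y = sum(y) / n
    cov = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    var_x = sum((a - mean_x) ** 2 for a in x)
    var_y = sum((b - mean_y) ** 2 for b in y)
    if var_x == 0 or var_y == 0:
        raise ValueError("correlation is undefined for a constant sequence")
    return cov / sqrt(var_x * var_y)


def mean_finger_correlation(actual, predicted):
    """Average r over fingers (assumed equal weighting).

    Both arguments map a finger name to its flexion time course.
    """
    if actual.keys() != predicted.keys():
        raise ValueError("actual and predicted must cover the same fingers")
    scores = [pearson_r(actual[f], predicted[f]) for f in actual]
    return sum(scores) / len(scores)


# Toy data: the thumb prediction is perfectly correlated (r = 1),
# the index prediction is perfectly anti-correlated (r = -1).
actual = {"thumb": [0.0, 1.0, 2.0, 3.0], "index": [1.0, 0.0, 1.0, 0.0]}
predicted = {"thumb": [0.0, 2.0, 4.0, 6.0], "index": [0.0, 1.0, 0.0, 1.0]}
print(mean_finger_correlation(actual, predicted))  # 0.0
```

Because correlation is invariant to scale and offset, a prediction that tracks the shape of the flexion trace scores well even if its amplitude is off, as the thumb example shows.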
ACES is a dataset consisting of 68 phenomena ranging from simple perturbations at the word/character level to more complex errors based on discourse and real-world knowledge. It can be used to evaluate a wide range of Machine Translation metrics.
The YCB-Slide dataset comprises DIGIT sliding interactions on YCB objects. We envision that it can contribute towards efforts in tactile localization, mapping, object understanding, and learning dynamics models. We provide access to DIGIT images, sensor poses, RGB video feed, ground-truth mesh models, and ground-truth heightmaps + contact masks (simulation only). This dataset is supplementary to the MidasTouch paper, a CoRL 2022 submission.
Interiorverse is a high-quality indoor scene dataset with rich details, including complex furniture and decorations. It is rendered with the GGX BRDF model, which has stronger material modeling capability than other BRDF models.