Turath-150K is a database of images of the Arab world that reflect objects, activities, and scenarios commonly found there.
This benchmark is a collection of datasets for Monocular Height Estimation. It consists of two datasets: GTAH and AHN.
Brown Pedestrian Odometry Dataset (BPOD) is a dataset for benchmarking visual odometry algorithms in head-mounted pedestrian settings. This dataset was captured using synchronized global and rolling shutter stereo cameras in 12 diverse indoor and outdoor locations on Brown University's campus. Compared to existing datasets, BPOD contains more image blur and self-rotation, which are common in pedestrian odometry but rare elsewhere. Ground-truth trajectories are generated from stick-on markers placed along the pedestrian’s path, and the pedestrian's position is documented using a third-person video.
The results-A dataset consists of 22 infrared images commonly used for evaluating the performance of Infrared Image Super-Resolution models.
This is a dataset for generating Bengali captions from images.
The UFPR-ADMR-v2 dataset contains 5,000 dial meter images obtained on-site by employees of the Energy Company of Paraná (Copel), which serves more than 4M consuming units in the Brazilian state of Paraná. The images were acquired with many different cameras and are available in the JPG format with 320×640 or 640×320 pixels (depending on the camera orientation). More details are available in our paper.
This repository contains the complete results used for the simulated evaluation of an innovative STC-based multi-robot Coverage Path Planning (mCPP) algorithm optimized for real-life use. For this evaluation, 20 ROIs of different shapes and areas, which may include obstacles inside them, were introduced in "Apostolidis, S. D., Kapoutsis, P. C., Kapoutsis, A. C., & Kosmatopoulos, E. B. (2022). Cooperative multi-UAV coverage mission planning platform for remote sensing applications. Autonomous Robots, 1-28." These ROIs, along with some benchmark results, can be found here: https://github.com/savvas-ap/cpp-simulated-evaluations
ISBNet is a dataset of images of recyclables. It was hand collected by our group at the International School of Beijing. The trash in these images was gathered from trash bins around the school. ISBNet totals 889 images distributed across 5 classes: cans (74), landfill (410), paper (182), plastic (122), and tetra pak (101). The data acquisition process involved using a piece of black poster paper as a background; this created enough contrast for trash belonging to the paper category. These pictures were taken with an iPhone 8 and an iPhone XS. We recorded the trash bin from which each piece of trash originated and any trash-generating landmarks nearby. Please refer to the paper (ThanosNet: A Novel Trash Classification Method Using Metadata) for more about the format of the metadata.
CAT is a specialized dataset for co-saliency detection. This dataset is intended for both helping to assess the performance of vision algorithms and supporting research that aims to exploit large volumes of annotated data, e.g., for training deep neural networks.
Data for the paper entitled Quantifying yeast colony morphologies with feature engineering from time-lapse photography by A. Goldschmidt et al. (https://arxiv.org/abs/2201.05259)
BDD100K-weather is a dataset derived from BDD100K using its image attribute labels, intended for Out-of-Distribution object detection. All images in BDD100K are categorized into six weather domains: clear, overcast, foggy, partly cloudy, rainy, and snowy. Clear and overcast are used for training while the remaining domains are used for testing; at most 1.5k images are sampled per training domain and at most 0.5k images per testing domain. The result is BDD100K-weather (paper under review).
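The per-domain sampling caps described above can be sketched as follows. This is a minimal illustration, not the BDD100K-weather tooling; the `(image_id, domain)` input layout and the helper name are assumptions.

```python
import random
from collections import defaultdict

def sample_per_domain(items, cap, seed=0):
    """Sample at most `cap` image ids from each weather domain.

    `items` is a list of (image_id, domain) pairs. A fixed seed keeps
    the subset reproducible across runs.
    """
    rng = random.Random(seed)
    by_domain = defaultdict(list)
    for image_id, domain in items:
        by_domain[domain].append(image_id)
    sampled = {}
    for domain, ids in by_domain.items():
        rng.shuffle(ids)            # random subset, not the first files on disk
        sampled[domain] = ids[:cap]
    return sampled
```

Applied with `cap=1500` to the training domains and `cap=500` to the testing domains, this reproduces the sampling scheme described above.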
For the Drone-vs-Bird Detection Challenge 2021, 77 different video sequences have been made available as training data. These video sequences originate from the previous installment of the challenge and were collected using MPEG4-coded static cameras by the SafeShore project, by the Fraunhofer IOSB research institute, and by the ALADDIN2 project. On average, the video sequences consist of 1,384 frames, while each frame contains 1.12 annotated drones. The video sequences are recorded with both static and moving cameras, and the resolution varies between 720×576 and 3840×2160 pixels. In total, 8 different types of drones exist in the dataset, i.e., 3 fixed-wing and 5 rotary ones. For each video, a separate annotation file is provided, which contains the frame number and the bounding box (expressed as [topx topy width height]) for the frames in which drones enter the scenes.
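A line of such an annotation file can be parsed as sketched below. The whitespace-separated layout (frame number followed by one [topx topy width height] quadruple per drone) is an assumption inferred from the description; check the challenge documentation for the exact format.

```python
def parse_annotation_line(line):
    """Parse one annotation line into (frame_number, list_of_boxes).

    Assumed layout: "<frame> <topx> <topy> <width> <height> ..." with
    one four-value box per annotated drone in that frame.
    """
    fields = line.split()
    frame = int(fields[0])
    values = [int(v) for v in fields[1:]]
    # Group the remaining values into (topx, topy, width, height) boxes.
    boxes = [tuple(values[i:i + 4]) for i in range(0, len(values), 4)]
    return frame, boxes
```

For example, a line describing two drones in frame 12 would yield the frame number and two four-tuples.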
This csv consists of (x-position, y-position, area) tuples for three views (left, middle, right) of binary masks downscaled to 64×128 with the aspect ratio preserved, taken from the 2019 YouTube-VIS challenge (https://competitions.codalab.org/competitions/20127#participate-get-data). Extracting pairs from this csv yields 234,652 transitions in the given statistics. These statistics can be used to augment ground-truth factor distributions with natural transitions, which we demonstrate with spriteworld. For details, we refer to our paper at https://openreview.net/forum?id=EbIDjBynYJ8.
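The pair-extraction step can be sketched as below. Grouping consecutive rows by view alone is an assumption for illustration; the released csv may additionally key on track identity, so treat this as a sketch rather than the procedure that produces the 234,652 transitions.

```python
from collections import defaultdict

def extract_transitions(rows):
    """Pair consecutive (x, y, area) observations of the same view into
    (state, next_state) transitions.

    `rows` is an iterable of (view, x, y, area) tuples in temporal order.
    """
    by_view = defaultdict(list)
    for view, x, y, area in rows:
        by_view[view].append((x, y, area))
    transitions = []
    for states in by_view.values():
        # Each consecutive pair within a view becomes one transition.
        transitions.extend(zip(states, states[1:]))
    return transitions
```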
https://github.com/zzr-idam/Under-Display-Camera-UAV
Optical images of printed circuit boards as well as detailed annotations of any text, logos, and surface-mount devices (SMDs). There are several hundred samples spanning a wide variety of manufacturing locations, sizes, node technology, applications, and more.
At the Isfahan Fertility and Infertility Center, semen samples were collected from fifteen patients. The sperm samples were fixed and stained using the Diff-Quick method. Using an Olympus CX21 microscope with a ×100 objective lens, a ×10 eyepiece, and a Sony color camera (Model No. SSC-DC58AP), 725 images were taken. The resolution of each image was 576×720 pixels. From these images, the sperm heads were cropped and classified into five classes by three specialists: Normal, Pyriform, Tapered, Amorphous, and Others. After classification, only the samples for which there was collective consensus about the class were kept in the dataset. Four classes, Normal, Pyriform, Tapered, and Amorphous, are included in this dataset. The resulting dataset of sperm heads, denoted the Human Sperm Head Morphology dataset (HuSHeM), consists of four folders, each corresponding to a specific set of sperm shapes. The folder names reflect the shape of the contained images. There are 54
Dataset of sperm head images with expert-classification labels. The dataset contains 1854 sperm head images obtained from six semen smears and classified by three Chilean referent domain experts, according to World Health Organization (WHO) criteria, into one of the following classes: normal, tapered, pyriform, small, and amorphous. This gold standard is intended for evaluating and comparing not only known techniques, but also future improvements to present approaches for classifying human sperm heads for semen analysis.
Extension of the official KITTI'15 dataset, adding instance segmentation ground truth for independently moving objects to cover all moving objects, not just a selection of cars and vans.
This is the small version of the MuMiN dataset.
This is the medium version of the MuMiN dataset.