3,275 machine learning datasets
A public open dataset of synthetic chest X-ray images of COVID-19.
RaindropsOnWindshield is a dataset for training and assessing vision algorithms' performance for different tasks of image artifacts detection on either camera lens or windshield. The dataset contains 8190 images, of which 3390 contain raindrops. Images are annotated with the binary mask representing areas with raindrops.
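As a minimal sketch of how a binary raindrop mask might be summarized, the snippet below computes the fraction of pixels marked as raindrop. The function name `raindrop_coverage` and the toy mask are hypothetical and not part of the dataset; it assumes a mask given as rows of 0/1 values, which may differ from the dataset's actual file format.

```python
from typing import Sequence


def raindrop_coverage(mask: Sequence[Sequence[int]]) -> float:
    """Return the fraction of pixels marked as raindrop (non-zero) in a binary mask."""
    total = sum(len(row) for row in mask)
    if total == 0:
        raise ValueError("mask is empty")
    covered = sum(1 for row in mask for value in row if value)
    return covered / total


# Toy 3x4 mask, made up for illustration (not taken from the dataset).
toy_mask = [
    [0, 1, 1, 0],
    [0, 1, 0, 0],
    [0, 0, 0, 0],
]
coverage = raindrop_coverage(toy_mask)  # 3 of 12 pixels are marked

# Share of images that contain raindrops, using the counts quoted above.
share_with_raindrops = 3390 / 8190
```

With the counts quoted in the description, roughly 41% of the images contain raindrops.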
The Inpainting dataset consists of synchronized labeled images and LiDAR-scanned point clouds. It was captured with a HESAI Pandora All-in-One Sensing Kit under various lighting conditions and traffic densities in Beijing, China.
This is a video and image segmentation dataset for human head and shoulders, relevant for creating elegant media for videoconferencing and virtual reality applications. The source data includes ten online conference-style green screen videos. The authors extracted 3600 frames from the videos, generated the ground truth masks for each character in the video, and then applied virtual backgrounds to the frames to generate the training and testing sets.
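The step of applying a virtual background can be illustrated with plain mask-based compositing. This is only a sketch under assumptions: the authors' actual pipeline is not described here, and the `composite` function, the pixel representation (rows of RGB tuples), and the toy values are all hypothetical.

```python
from typing import List, Tuple

Pixel = Tuple[int, int, int]
Image = List[List[Pixel]]


def composite(frame: Image, mask: List[List[float]], background: Image) -> Image:
    """Blend frame over background: mask 1.0 keeps the frame pixel, 0.0 keeps the background."""
    out: Image = []
    for f_row, m_row, b_row in zip(frame, mask, background):
        out_row = []
        for f_px, alpha, b_px in zip(f_row, m_row, b_row):
            # Per-channel linear blend weighted by the mask value.
            blended = tuple(
                round(alpha * f + (1.0 - alpha) * b) for f, b in zip(f_px, b_px)
            )
            out_row.append(blended)
        out.append(out_row)
    return out


# Toy 1x2 frame: the first pixel belongs to the person, the second is green screen.
frame = [[(200, 100, 50), (0, 255, 0)]]
mask = [[1.0, 0.0]]
background = [[(10, 20, 30), (10, 20, 30)]]
result = composite(frame, mask, background)
```

In the toy example the person pixel is kept unchanged and the green-screen pixel is replaced by the background colour.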
Stickers is a dataset consisting of 577 high-quality sticker images with alpha channel.
CPM-Real is a dataset consisting of 3895 images representing real makeup styles.
MAI is a dataset for multi-scene recognition in single aerial images. It consists of 3,923 labelled large-scale images from Google Earth imagery covering the United States, Germany, and France. Each image is 512 × 512 pixels, and spatial resolutions vary from 0.3 m/pixel to 0.6 m/pixel. After the aerial images were captured, multiple scene-level labels were manually assigned to each image from a total of 24 scene categories: apron, baseball, beach, commercial, farmland, woodland, parking lot, port, residential, river, storage tanks, sea, bridge, lake, park, roundabout, soccer field, stadium, train station, works, golf course, runway, sparse shrub, and tennis court.
The dataset consists of images of 158 filled out bank checks containing various complex backgrounds, and handwritten text and signatures in the respective fields, along with both pixel-level and patch-level segmentation masks for the signatures on the checks. Please visit the dataset homepage for more details.
Risk-Aware Planning is a dataset that contains overhead images and their semantic segmentation, captured by a drone in the CityEnviron environment of the AirSim simulator.
PFN-VT is a dataset for the estimation of tactile properties from vision, such as slipperiness or roughness. The dataset is collected with a webcam and uSkin tactile sensor mounted on the end-effector of a Sawyer robot, which strokes the surfaces of 25 different materials.
The image collection of the IAPR TC-12 Benchmark consists of 20,000 still natural images taken from locations around the world, forming an assorted cross-section of subjects. This includes pictures of different sports and actions, photographs of people, animals, cities, landscapes, and many other aspects of contemporary life. Each image is associated with a text caption in up to three different languages (English, German and Spanish).
This dataset contains 12,500 meter images acquired in the field by the employees of the Energy Company of Paraná (Copel), which directly serves more than 4 million consuming units, across 395 cities and 1,113 locations (i.e., districts, villages and settlements), located in the Brazilian state of Paraná.
SyntheticFur is a dataset for neural rendering. Collecting and generating high-quality fur images is an expensive and difficult process that requires content specialists. Released with high-quality lighting simulation via ray tracing, the dataset can save time for researchers seeking to advance studies of fur rendering and simulation, since they do not have to recreate this laborious process.
The UAVVaste dataset consists to date of 772 images and 3716 annotations. The main motivation for creating the dataset was the lack of domain-specific data in the datasets that are widely used for benchmarking object detection. The dataset is made publicly available and is intended to be expanded.
This dataset covers 4 classes of drinking waste: aluminium cans, glass bottles, PET (plastic) bottles, and HDPE (plastic) milk bottles. Contents: rawimgs (images of the 4 classes of waste); YOLO_imgs (images of the 4 classes of waste, each with a corresponding txt file of annotations for the YOLO framework); labels.txt (labels of the classes).
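For readers unfamiliar with YOLO annotation files, the sketch below parses one into pixel-space boxes. It assumes the common Darknet-style layout, where each line is `class_id x_center y_center width height` with coordinates normalized to the range 0 to 1; whether this dataset's txt files follow exactly that layout is an assumption. The `Box` type, the `parse_yolo_labels` function, the sample text, and the 640 by 480 image size are hypothetical.

```python
from typing import List, NamedTuple


class Box(NamedTuple):
    class_id: int
    x_min: float
    y_min: float
    x_max: float
    y_max: float


def parse_yolo_labels(text: str, img_w: int, img_h: int) -> List[Box]:
    """Convert normalized YOLO lines into pixel-space corner boxes."""
    boxes: List[Box] = []
    for line in text.splitlines():
        parts = line.split()
        if not parts:
            continue  # skip blank lines
        if len(parts) != 5:
            raise ValueError(f"expected 5 fields, got {len(parts)}: {line!r}")
        class_id = int(parts[0])
        xc, yc, w, h = (float(p) for p in parts[1:])
        boxes.append(
            Box(
                class_id,
                (xc - w / 2) * img_w,
                (yc - h / 2) * img_h,
                (xc + w / 2) * img_w,
                (yc + h / 2) * img_h,
            )
        )
    return boxes


# Made-up annotation text for illustration; class ids are arbitrary here.
sample = "0 0.5 0.5 0.25 0.5\n2 0.25 0.75 0.5 0.5\n"
boxes = parse_yolo_labels(sample, img_w=640, img_h=480)
```

The mapping from `class_id` to a class name would come from labels.txt; its ordering is not stated in the description, so it is left out of the sketch.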
Boombox is a multi-modal dataset for visual reconstruction from acoustic vibrations. It was collected by dropping objects into a box and capturing the resulting images and vibrations, and it is used for training ML systems that predict images from vibration.
The ARC-100 dataset was collected as part of a prototype retail checkout system titled ARC (Automatic Retail Checkout). It consists of 31,000 RGB images of size $640 \times 480$, showing 100 commonly found retail items in Lahore, Pakistan. Each retail item has 310 images captured at various logical orientations (on a black, matte finish conveyor belt) by a Logitech C310 webcam, under a wooden hood frame illuminated by LED strips (luminance set to approximately $70$ lx). In the proposed setup, images were pre-processed and standardized before being fed into a Convolutional Neural Network for identification.
DanbooRegion is a dataset consisting of 5377 pairs of in-the-wild illustrations, downloaded from Danbooru2018, and region segmentation map annotations.
Clarkson Fingerprint Generator consists of a dataset of 50K synthetically generated fingerprints.
In ICDAR-17, a Page-Object Detection (POD) competition was organized in which the task was to identify page objects in documents, including tables, figures and equations. The dataset was composed of 2417 images in total: 1600 images were used for training and the remaining 817 images for testing. We are introducing a new table structure recognition dataset, TabStructDB, in which we labeled each tabular region present in the ICDAR-17 POD dataset with table structure information comprising the row and column information.