3,275 machine learning datasets
SKU110K-R is a relabeled version of SKU110K that uses oriented bounding boxes. It is focused on evaluating oriented and densely packed object detection.
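As a rough illustration of what an oriented bounding box encodes, the sketch below converts a box into its four corner points. The `(cx, cy, w, h, theta)` parametrization and the function name `obb_corners` are assumptions made for this example; they are not taken from the SKU110K-R annotation format.

```python
import math

def obb_corners(cx, cy, w, h, theta):
    """Return the four corners of an oriented box.

    Assumed (hypothetical) parametrization: center (cx, cy), width w,
    height h, and rotation theta in radians, counter-clockwise.
    """
    c, s = math.cos(theta), math.sin(theta)
    # Corners of the axis-aligned box centered at the origin.
    half = [(-w / 2, -h / 2), (w / 2, -h / 2), (w / 2, h / 2), (-w / 2, h / 2)]
    # Rotate each corner by theta, then translate to the box center.
    return [(cx + x * c - y * s, cy + x * s + y * c) for x, y in half]
```

With `theta = 0` the result is the ordinary axis-aligned box, which is the special case an oriented annotation generalizes.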
VRAI is a large-scale vehicle ReID dataset for UAV-based intelligent applications. The dataset consists of 137,613 images of 13,022 vehicle instances. The images of each vehicle instance are captured by cameras of two DJI consumer UAVs at different locations, with a variety of view angles and flight altitudes (15 m to 80 m).
Histopathological characterization of colorectal polyps makes it possible to tailor patient management and follow-up, with the ultimate aim of avoiding or promptly detecting an invasive carcinoma. Colorectal polyp characterization relies on the histological analysis of tissue samples to determine polyp malignancy and dysplasia grade. Deep neural networks achieve outstanding accuracy in medical pattern recognition; however, they require large sets of annotated training images. We introduce UniToPatho, an annotated dataset of 9536 hematoxylin and eosin stained patches extracted from 292 whole-slide images, meant for training deep neural networks for colorectal polyp classification and adenoma grading. The slides are acquired through a Hamamatsu Nanozoomer S210 scanner at 20× magnification (0.4415 μm/px).
LReID is a benchmark for lifelong person re-identification. It has been built using existing datasets, and it consists of two subsets: LReID-Seen and LReID-Unseen.
SenseReID is a person re-identification dataset for evaluating ReID models. It is captured from real surveillance cameras, and the person bounding boxes are obtained from a state-of-the-art detection algorithm. The dataset contains 1,717 identities in total.
The RIMES database (Reconnaissance et Indexation de données Manuscrites et de fac similÉS / Recognition and Indexing of handwritten documents and faxes) was created to evaluate automatic systems of recognition and indexing of handwritten letters. Of particular interest are cases such as those sent by postal mail or fax by individuals to companies or administrations.
Scribble is a new outline dataset consisting of 200 images (150 train, 50 test) for each of 10 classes: basketball, chicken, cookie, cupcake, moon, orange, soccer, strawberry, watermelon and pineapple. All the images have a white background and were collected using search keywords on popular search engines. For each image, we obtain a rough outline. We threshold the image into black and white and find the largest blob. We then fill the interior holes of the largest blob and obtain a smooth outline using the Savitzky-Golay filter.
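The outline procedure described for Scribble (threshold, keep the largest blob, fill its holes, smooth) can be sketched in plain Python as follows. This is only an illustrative sketch under stated assumptions, not the authors' implementation: the cutoff of 128, 4-connectivity, the 5-point quadratic Savitzky-Golay window, and all function names are choices made for this example, and the step that traces the blob boundary into an ordered list of coordinates is left out.

```python
from collections import deque

def threshold(gray, cutoff=128):
    """Binarize a grayscale grid: dark pixels (below cutoff) become 1."""
    return [[1 if v < cutoff else 0 for v in row] for row in gray]

def components(mask, target):
    """Return the 4-connected components of cells equal to target."""
    h, w = len(mask), len(mask[0])
    seen, found = set(), []
    for r in range(h):
        for c in range(w):
            if mask[r][c] != target or (r, c) in seen:
                continue
            comp, queue = set(), deque([(r, c)])
            seen.add((r, c))
            while queue:
                y, x = queue.popleft()
                comp.add((y, x))
                for ny, nx in ((y + 1, x), (y - 1, x), (y, x + 1), (y, x - 1)):
                    if (0 <= ny < h and 0 <= nx < w
                            and mask[ny][nx] == target and (ny, nx) not in seen):
                        seen.add((ny, nx))
                        queue.append((ny, nx))
            found.append(comp)
    return found

def largest_blob_filled(mask):
    """Keep only the largest foreground blob and fill its interior holes."""
    h, w = len(mask), len(mask[0])
    blob = max(components(mask, 1), key=len, default=set())
    out = [[1 if (r, c) in blob else 0 for c in range(w)] for r in range(h)]
    # A hole is a background component that never touches the image border.
    for comp in components(out, 0):
        if not any(r in (0, h - 1) or c in (0, w - 1) for r, c in comp):
            for r, c in comp:
                out[r][c] = 1
    return out

# 5-point quadratic Savitzky-Golay smoothing weights (divide by 35).
SG5 = (-3, 12, 17, 12, -3)

def savgol_closed(values):
    """Smooth a closed (circular) sequence, e.g. the x or y outline coordinates."""
    n = len(values)
    return [sum(k * values[(i + j - 2) % n] for j, k in enumerate(SG5)) / 35
            for i in range(n)]
```

In this sketch, `savgol_closed` would be applied separately to the x and y coordinate sequences of the traced boundary, wrapping around at the ends because an outline is a closed curve.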
This dataset is intended for evaluating License Plate Character Segmentation (LPCS). The experimental results of the paper "Benchmark for License Plate Character Segmentation" were obtained using a dataset providing 101 on-track vehicles captured during the day. The video was recorded using a static camera in early 2015.
CAMO++ is a dataset for camouflaged object segmentation. This dataset increases the number of images with hierarchical pixel-wise ground-truths. The authors also provide a benchmark suite for the task of camouflaged instance segmentation.
DUO is a dataset for underwater object detection for robot picking. The dataset contains a collection of diverse underwater images with more rational annotations.
The standard digital image database with and without chest lung nodules (JSRT database) was created by the Japanese Society of Radiological Technology (JSRT) in cooperation with the Japanese Radiological Society (JRS) in 1998. Since then, the JSRT database has been used by researchers worldwide for various research purposes such as image processing, image compression, evaluation of image display, computer-aided diagnosis (CAD), picture archiving and communication systems (PACS), and for training and testing.
DISC21 is a benchmark for large-scale image similarity detection. This benchmark is used for the Image Similarity Challenge at NeurIPS'21 (ISC2021). The goal is to determine whether a query image is a modified copy of any image in a reference corpus of size 1 million. The benchmark features a variety of image transformations such as automated transformations, hand-crafted image edits and machine-learning based manipulations. This mimics real-life cases appearing in social media, for example for integrity-related problems dealing with misinformation and objectionable content. The strength of the image manipulations, and therefore the difficulty of the benchmark, is calibrated according to the performance of a set of baseline approaches. Both the query and reference set contain a majority of "distractor" images that do not match, which corresponds to a real-life needle-in-haystack setting, and the evaluation metric reflects that.
PAD (Purpose-driven Affordance Dataset) is a dataset for affordance detection, the task of identifying the potential action possibilities of objects in an image. This is an important ability for robot perception and manipulation. The dataset consists of 4K images from 31 affordance and 72 object categories.
OpenForensics is a large-scale, highly challenging dataset designed with rich face-wise annotations explicitly for face forgery detection and segmentation. With its rich annotations, the OpenForensics dataset has great potential for research in both deepfake prevention and general human face detection.
SoundingEarth consists of co-located aerial imagery and audio samples from all around the world.
Who's Waldo is a dataset of 270K image–caption pairs depicting interactions of people, automatically mined from Wikimedia Commons. It is a benchmark dataset for person-centric visual grounding, the problem of linking people named in a caption to people pictured in an image.
Over 1.5K images selected from the public Kaggle DR Detection dataset; Five DR grades (DR0 / DR1 / DR2 / DR3 / DR4), re-labeled by a panel of 45 experienced ophthalmologists; Eight retinal lesion classes, including microaneurysm, intraretinal hemorrhage, hard exudate, cotton-wool spot, vitreous hemorrhage, preretinal hemorrhage, neovascularization and fibrous proliferation; Over 34K expert-labeled pixel-level lesion segments; Multi-task, i.e., lesion segmentation, lesion classification, and DR grading.
WildReceipt is a collection of receipts. For each photo, it contains a list of OCR annotations, each with a bounding box, text, and class.
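The per-photo structure described for WildReceipt (a list of OCR entries, each with a bounding box, text, and class) could be modeled roughly as below. The class names, field names, and the flat-coordinate box layout are hypothetical and chosen only for illustration; they do not reflect the dataset's actual file format.

```python
from dataclasses import dataclass, field
from typing import List, Tuple

@dataclass
class OcrEntry:
    box: Tuple[float, ...]  # hypothetical layout: flat (x1, y1, x2, y2, ...) coordinates
    text: str               # transcribed text inside the box
    label: str              # class assigned to this text region

@dataclass
class Receipt:
    image_path: str                                   # hypothetical path to the photo
    entries: List[OcrEntry] = field(default_factory=list)

def texts_by_class(receipt: Receipt, label: str) -> List[str]:
    """Collect the text of every OCR entry carrying the given class."""
    return [e.text for e in receipt.entries if e.label == label]
```

A helper such as `texts_by_class` shows the kind of lookup this structure supports, for example pulling all text regions tagged with a given class from one receipt.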
This is a new dataset of news headlines and their frames related to the issue of gun violence in the United States. This Gun Violence Frame Corpus (GVFC) was curated and annotated by journalism and communication experts. The articles in this dataset are drawn from a sample of news articles from a list of 30 top U.S. news websites, ranked by website traffic, and were collected from four time periods over the course of 2018 in order to capture a diversity of articles.
The RodoSol-ALPR dataset contains 20,000 images captured by static cameras located at pay tolls owned by the Rodovia do Sol (RodoSol) concessionaire, which operates 67.5 kilometers of a highway (ES-060) in the Brazilian state of Espírito Santo.