3,275 machine learning datasets
The Mouse Embryo Tracking Database is a dataset for tracking mouse embryos. The dataset contains, for each of the 100 examples: (1) the uncompressed frames, up to the 10th frame after the appearance of the 8th cell; (2) a text file with the trajectories of all the cells, from appearance to division (for cells of generations 1 to 3), where a trajectory is a sequence of pairs (center, radius); (3) a movie file showing the trajectories of the cells.
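A trajectory as described above is a sequence of (center, radius) pairs per cell. The exact layout of the text file is not specified here, so the following is only a minimal reading sketch under an assumed line format of `cell_id frame cx cy r`; the field names and ordering are illustrative assumptions, not the dataset's documented format.

```python
from collections import defaultdict

def read_trajectories(lines):
    """Parse assumed 'cell_id frame cx cy r' lines into per-cell
    trajectories: cell_id -> [((cx, cy), radius), ...]."""
    trajectories = defaultdict(list)
    for line in lines:
        cell_id, _frame, cx, cy, r = line.split()
        trajectories[cell_id].append(((float(cx), float(cy)), float(r)))
    return dict(trajectories)

# Tiny made-up sample in the assumed format: cell 1 over two frames,
# cell 2 over one frame.
sample = [
    "1 0 120.5 98.0 14.2",
    "1 1 121.0 97.6 14.4",
    "2 0 210.3 150.8 13.9",
]
trajs = read_trajectories(sample)
```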
A real-world vehicle detection database for vision tasks.
OpenSurfaces is a large database of annotated surfaces created from real-world consumer photographs. The framework used for the annotation process draws on crowdsourcing to segment surfaces from photos, and then annotate them with rich surface properties, including material, texture and contextual information.
The Spherical-Navi image dataset is a novel 360° fisheye panorama dataset, collected with a unique labeling strategy that enables the automatic generation of an arbitrary number of negative samples (wrong heading directions).
Real-world Affective Faces Multi Label (RAF-ML) is a multi-label facial expression dataset of around 5K highly diverse facial images downloaded from the Internet, featuring blended emotions and variability in subject identity, head pose, lighting conditions and occlusion. During annotation, 315 well-trained annotators were employed so that each image was annotated a sufficient number of independent times; images with a multi-peak label distribution were then selected to constitute RAF-ML.
The Robot-at-Home dataset (Robot@Home) is a collection of raw and processed data from five domestic settings compiled by a mobile robot equipped with 4 RGB-D cameras and a 2D laser scanner. Its main purpose is to serve as a testbed for semantic mapping algorithms through the categorization of objects and/or rooms.
Pedestrian Color Naming (PCN) is a dataset for pedestrian color naming, which contains 14,213 images, each hand-labeled with a color label for every pixel. All images in the PCN dataset are obtained from the Market-1501 dataset.
The MICCAI 2020 EMIDEC dataset is a dataset for, first, classifying normal and pathological cases from clinical information with or without DE-MRI, and second, automatically detecting the relevant areas (the myocardial contours, the infarcted area and the permanent microvascular obstruction area (no-reflow area)) from a series of short-axis DE-MRI slices covering the left ventricle. The segmentation allows quantification of the myocardial infarction (MI), either in absolute value (mm³) or as a percentage of the myocardium.
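The quantification described above reduces to counting labeled voxels in the segmentation. The sketch below illustrates this under assumed, illustrative label values and voxel spacing; these are not the official EMIDEC label conventions.

```python
import numpy as np

# Illustrative label values (assumptions, not the official EMIDEC labels).
MYOCARDIUM, INFARCT, NO_REFLOW = 1, 2, 3

def quantify_infarct(seg, voxel_volume_mm3):
    """Return (MI volume in mm^3, MI as a percentage of the myocardium).

    seg: integer label volume; voxel_volume_mm3: volume of one voxel.
    Infarct and no-reflow voxels are counted as part of the myocardium."""
    myo_voxels = np.isin(seg, [MYOCARDIUM, INFARCT, NO_REFLOW]).sum()
    mi_voxels = np.isin(seg, [INFARCT, NO_REFLOW]).sum()
    mi_volume = mi_voxels * voxel_volume_mm3
    mi_percent = 100.0 * mi_voxels / myo_voxels if myo_voxels else 0.0
    return mi_volume, mi_percent

# Toy 4x4x2 volume: 16 healthy myocardium voxels, 8 infarcted voxels.
seg = np.zeros((4, 4, 2), dtype=int)
seg[:2, :, :] = MYOCARDIUM
seg[2, :, :] = INFARCT
vol, pct = quantify_infarct(seg, voxel_volume_mm3=1.5)
```

With the toy volume, the infarct occupies 8 of 24 myocardial voxels, i.e. one third of the myocardium.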
This is a dataset for Arabic/English text detection and optical character recognition. All image data are text slides extracted from PowerPoint files downloaded from the Internet through the Google API. All annotations are automatically generated, mainly through the WinCom32 Python API. Post-processing is also applied to place more accurate text bounding boxes and to suppress false alarms, e.g. a text box containing only spaces. Finally, all annotation results are briefly reviewed by humans to reject extremely bad samples, e.g. a slide with a large portion of a table copied as an image. In summary, this dataset contains 10,692 images and roughly 100K line samples.
The Daimler Urban Segmentation Dataset is a dataset for semantic segmentation. It consists of video sequences recorded in urban traffic. The dataset comprises 5,000 rectified stereo image pairs with a resolution of 1024×440. 500 frames (every 10th frame of the sequence) come with pixel-level semantic class annotations into 5 classes: ground, building, vehicle, pedestrian, sky. Dense disparity maps are provided as a reference; however, these are not manually annotated but computed using semi-global matching (SGM).
The Interestingness dataset contains movie excerpts and key-frames, together with corresponding ground-truth files based on their classification into interesting and non-interesting samples. It is used for multimedia content interestingness classification. The dataset is composed of:
LASIESTA (Labeled and Annotated Sequences for Integral Evaluation of SegmenTation Algorithms) is a segmentation and detection dataset composed of many real indoor and outdoor sequences organized into categories, each of which covers a specific challenge in moving-object detection strategies.
The Edge Milling Heads data set comprises 144 images of an edge profile cutting head of a milling machine. The head tool contains a total of 30 cutting inserts. The cutting head is formed by 6 diagonals of inserts in the radial direction along the tool perimeter, encompassing 5 inserts per diagonal in the axial direction. The positions of the last and first inserts of consecutive diagonals are aligned on the same vertical; therefore, even though there are 30 inserts in total, there are 24 equally spaced insert positions along the tool perimeter. Additionally, the inserts are square-shaped with four 90° indexable cutting edges. Inserts are fastened with a screw. The rake angle is 0°.
The VxC TSG is based on samples taken from the ceramic tile industry and comprises 14 ceramic tile models, 42 surface grades and 960 pieces. It was built in the VxC laboratory at the Polytechnic University of Valencia, in collaboration with Keraben S.A., a large ceramic tile company located in Nules, in the province of Castellón (Spain).
The Panoramic Image Database is a panoramic image dataset. It was collected by Andrew Vardy while visiting the Computer Engineering group in February and March of 2004. Images were captured by a robot-mounted camera pointed upwards at a hyperbolic mirror. The camera was an ImagingSource DFK 4303, the robot an ActivMedia Pioneer 3-DX, and the mirror a large wide-view hyperbolic mirror from Accowle Ltd. The hyperbolic mirror expands the camera's field of view to allow the capture of panoramic images.
OTCBVS is a benchmark dataset for testing and evaluating novel and state-of-the-art computer vision algorithms. The benchmark contains videos and images recorded in and beyond the visible spectrum and is freely available to all researchers in the international computer vision community.
The CUHK occlusion dataset includes 1,063 images with occluded pedestrians. It is used for human detection with occlusion handling in crowded scenes.
H3D (Humans in 3D) is a dataset of annotated people. The annotations include:
This dataset contains video shots for two different classes: tigers and cars. The shots were collected from 188 car ads (~1–2 min each) and 14 nature documentaries about tigers (~40 min each), amounting to roughly 14 h of video. The videos were partitioned into shorter shots, and only those showing at least one instance of the class were kept. This produced 806 shots for the car class and 1,880 for the tiger class, typically 1–100 sec in length.
POET (Pascal Objects Eye Tracking) is a dataset consisting of eye-tracking data for the complete trainval set of ten object classes (cat, dog, bicycle, motorbike, boat, aeroplane, horse, cow, sofa, dining table) from Pascal VOC 2012 (6,270 images in total). Each image is annotated with the eye-movement records of five participants, whose task was to identify which object class was present in the image.