WHOI-Plankton is a collection of annotated plankton images. It contains more than 3.5 million images of microscopic marine plankton, organized according to category labels provided by researchers at the Woods Hole Oceanographic Institution (WHOI). Each image is currently assigned to one of 103 categories.
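Datasets organized by category label are often distributed with one folder per category. A minimal stdlib sketch of indexing such a layout, assuming a folder-per-category directory structure (the category names below are illustrative, not taken from WHOI-Plankton):

```python
import os
import tempfile

def index_by_category(root):
    """Walk a directory whose subfolders are category names and
    return a list of (image_path, category) pairs."""
    pairs = []
    for category in sorted(os.listdir(root)):
        cat_dir = os.path.join(root, category)
        if not os.path.isdir(cat_dir):
            continue
        for fname in sorted(os.listdir(cat_dir)):
            if fname.lower().endswith((".png", ".jpg", ".jpeg")):
                pairs.append((os.path.join(cat_dir, fname), category))
    return pairs

# Demo on a throwaway directory mimicking the assumed layout
# (two hypothetical categories, three empty placeholder images).
root = tempfile.mkdtemp()
for category, files in {"Ciliate": ["a.png"], "Diatom": ["b.png", "c.png"]}.items():
    os.makedirs(os.path.join(root, category))
    for f in files:
        open(os.path.join(root, category, f), "wb").close()

pairs = index_by_category(root)
print(len(pairs))                      # 3
print(sorted({c for _, c in pairs}))   # ['Ciliate', 'Diatom']
```

The same index can then feed any training pipeline; libraries such as torchvision provide an equivalent loader (`ImageFolder`) for this layout.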
The WordNet Language Model Probing (WNLaMPro) dataset consists of relations between keywords and words. It contains four kinds of relations: Antonym, Hypernym, Cohyponym, and Corruption.
A new dataset with significant occlusions related to object manipulation.
Dataset for large-scale yoga pose recognition with 82 classes.
MIT Traffic is a dataset for research on activity analysis and crowded scenes. It includes a 90-minute traffic video sequence recorded by a stationary camera. The resolution is 720 by 480, and the video is divided into 20 clips.
MovieShots is a dataset to facilitate shot type analysis in videos. It is a large-scale shot type annotation set containing 46K shots from 7,858 movies that cover a wide variety of genres, so that all shot scale and movement types are included. Each shot is annotated with two attributes: shot scale and shot movement.
ARID is a dataset for action recognition in dark videos. It consists of over 3,780 video clips with 11 action categories.
A dataset for building models that detect people Looking At Each Other (LAEO) in video sequences.
UIT-ViIC contains manually written Vietnamese captions for images from the Microsoft COCO dataset that depict ball sports. It consists of 19,250 captions for 3,850 images.
JParaCrawl is an English-Japanese parallel corpus, a language pair for which publicly available parallel data is still limited. The corpus was constructed by broadly crawling the web and automatically aligning parallel sentences, amassing over 8.7 million sentence pairs.
300 Videos in the Wild (300-VW) is a dataset for evaluating facial landmark tracking algorithms in the wild. It includes 114 long facial videos recorded in the wild, each roughly one minute in duration (at 25-30 fps). All frames are annotated with the same mark-up used in the 300-W competition, a set of 68 facial landmarks.
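Landmark annotations in the 300-W family are commonly distributed as plain-text `.pts` files: a header with the point count, then one `x y` pair per line between braces. A minimal parser sketch, assuming that format (the toy sample below uses 3 points for brevity; real 300-VW frames carry 68):

```python
def parse_pts(text):
    """Parse a 300-W style .pts annotation: a 'n_points' header
    followed by one 'x y' pair per line between braces."""
    lines = [l.strip() for l in text.strip().splitlines()]
    n = int(next(l.split(":")[1] for l in lines if l.startswith("n_points")))
    start = lines.index("{") + 1
    points = [tuple(map(float, l.split())) for l in lines[start:start + n]]
    assert len(points) == n, "header/point count mismatch"
    return points

# Toy annotation with 3 points (illustrative coordinates).
sample = """version: 1
n_points: 3
{
30.5 40.2
31.0 41.7
29.8 43.1
}"""
pts = parse_pts(sample)
print(len(pts), pts[0])  # 3 (30.5, 40.2)
```

Per-frame files parsed this way can be stacked into an array of shape (num_frames, 68, 2) for tracking evaluation.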
MLe2 is a dataset for the evaluation of scene text end-to-end reading systems and all intermediate stages such as text detection, script identification and text recognition. The dataset contains a total of 711 scene images covering four different scripts (Latin, Chinese, Kannada, and Hangul).
2D HeLa is a dataset of fluorescence microscopy images of HeLa cells stained with various organelle-specific fluorescent dyes. The images cover 10 organelles: DNA (nuclei), ER (endoplasmic reticulum), Giantin (cis/medial Golgi), GPP130 (cis Golgi), Lamp2 (lysosomes), mitochondria, Nucleolin (nucleoli), actin, TfR (endosomes), and tubulin. The purpose of the dataset is to train a computer program to automatically identify sub-cellular organelles.
The unsupervised Labeled Lane MArkerS dataset (LLAMAS) is a dataset for lane detection and segmentation. It contains over 100,000 annotated images at a resolution of 1276 x 717 pixels, with lane marker annotations extending over 100 meters. The annotations were generated automatically from Lidar-based high-definition maps for automated driving.
VIPER is a benchmark suite for visual perception. The benchmark is based on more than 250K high-resolution video frames, all annotated with ground-truth data for both low-level and high-level vision tasks, including optical flow, semantic instance segmentation, object detection and tracking, object-level 3D scene layout, and visual odometry. Ground-truth data for all tasks is available for every frame. The data was collected while driving, riding, and walking a total of 184 kilometers in diverse ambient conditions in a realistic virtual world.
This is a synthetic dataset for defect detection on textured surfaces. It was originally created for a competition at the 2007 symposium of the DAGM (Deutsche Arbeitsgemeinschaft für Mustererkennung e.V., the German chapter of the International Association for Pattern Recognition). The competition was hosted together with the GNSS (German Chapter of the European Neural Network Society).
French Wikipedia is a dataset used for pretraining the CamemBERT French language model. It is built from the official 2019 French Wikipedia dumps.
The PieAPP dataset is a large-scale dataset for training and testing perceptually consistent image-error prediction algorithms. It can be downloaded either from the project server as a single zip archive with all data (2.2 GB) or from Google Drive (ideal for quick browsing).
IBM-Rank-30k is a dataset for the task of argument quality ranking. It is a corpus of 30,497 arguments, carefully annotated for point-wise quality.
A question type classification dataset with 6 classes covering questions about a person, location, numeric information, etc. The training split has 5,452 questions and the test split has 500.