19,997 machine learning datasets
An evaluation protocol for face verification focusing on large intra-pair image-quality differences.
COLDataset is a dataset to facilitate Chinese offensive language detection and model evaluation. It includes 37k annotated sentences.
We now introduce IndicGLUE, the Indic General Language Understanding Evaluation Benchmark, which is a collection of various NLP tasks as described below. The goal is to provide an evaluation benchmark for natural language understanding capabilities of NLP models on diverse tasks and multiple Indian languages.
Wukong is a large-scale Chinese cross-modal dataset for benchmarking multi-modal pre-training methods to facilitate vision-language pre-training (VLP). The dataset contains 100 million Chinese image-text pairs collected from the web. The base query list is taken from prior work and filtered according to the frequency of Chinese words and phrases.
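As a rough illustration of frequency-based query filtering, here is a minimal sketch; the filter_queries helper, threshold, and toy corpus are assumptions for illustration, not Wukong's actual collection pipeline.

```python
from collections import Counter

def filter_queries(tokenized_queries, corpus_tokens, min_count=100):
    """Keep queries whose every token clears a corpus-frequency threshold.

    Illustrative sketch only: Wukong's actual filtering criteria and
    thresholds are those described in its paper, not these.
    """
    freq = Counter(corpus_tokens)
    return [q for q in tokenized_queries
            if all(freq[tok] >= min_count for tok in q)]

# Usage with pre-tokenized queries (tokenization itself is out of scope here).
corpus = ["北京", "天气", "北京", "美食"] * 100   # toy corpus token stream
queries = [["北京", "美食"], ["冷门", "词"]]
print(filter_queries(queries, corpus))  # -> [['北京', '美食']]
```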
The SKM-TEA dataset pairs raw quantitative knee MRI (qMRI) data, image data, and dense labels of tissues and pathology for end-to-end exploration and evaluation of the MR imaging pipeline. This 1.6TB dataset consists of raw-data measurements of ~25,000 slices (155 patients) of anonymized patient knee MRI scans, the corresponding scanner-generated DICOM images, manual segmentations of four tissues, and bounding box annotations for sixteen clinically relevant pathologies.
This is the Multi-Axis Temporal RElations for Start-points (MATRES) dataset.
A raw sensor dataset in which each sequence contains 7 images (a few contain 6), in both RAW and JPG formats, taken at different focal lengths.
CholecT45 is a subset of CholecT50 consisting of 45 videos from the Cholec80 dataset. It is the first public release of part of the CholecT50 dataset. CholecT50 is a dataset of 50 endoscopic videos of laparoscopic cholecystectomy surgery introduced to enable research on fine-grained action recognition in laparoscopic surgery. It is annotated with 100 triplet classes of the form <instrument, verb, target>.
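To illustrate the <instrument, verb, target> annotation structure, here is a minimal sketch; the ActionTriplet class and example labels below are hypothetical, not the dataset's official API or label files.

```python
from dataclasses import dataclass

# Hypothetical representation of one action-triplet label. The actual
# dataset ships its own annotation files and label maps; this sketch
# only illustrates the <instrument, verb, target> structure.
@dataclass(frozen=True)
class ActionTriplet:
    instrument: str  # e.g. "grasper"
    verb: str        # e.g. "retract"
    target: str      # e.g. "gallbladder"

# One frame may carry several concurrent triplets.
frame_annotations = [
    ActionTriplet("grasper", "retract", "gallbladder"),
    ActionTriplet("hook", "dissect", "cystic_duct"),
]

for t in frame_annotations:
    print(f"<{t.instrument}, {t.verb}, {t.target}>")
```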
The General Robust Image Task (GRIT) Benchmark is an evaluation-only benchmark for assessing the performance and robustness of vision systems across multiple image prediction tasks, concepts, and data sources. GRIT aims to encourage the research community to pursue several new research directions.
During the COVID-19 pandemic, near-universal mask-wearing posed a major challenge to face recognition: traditional systems may fail to recognize masked faces, yet removing the mask for authentication increases the risk of infection. The widespread requirement that people wear protective face masks in public places has driven a need to understand how face recognition technology deals with occluded faces, often with just the periocular area and above visible.
VideoLQ consists of videos downloaded from various video hosting sites such as Flickr and YouTube, all under a Creative Commons license.
Node classification on Cornell with 60%/20%/20% random splits for training/validation/test.
TripClick is a large-scale dataset of click logs in the health domain, obtained from user interactions of the Trip Database health web search engine.
Node classification on Texas with 60%/20%/20% random splits for training/validation/test.
Node classification on Cornell with the fixed 48%/32%/20% splits provided by Geom-GCN.
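As a concrete illustration of the 60%/20%/20% random split protocol used above (the Geom-GCN splits are instead loaded from fixed files), here is a minimal sketch; the function name and seed handling are assumptions, not the benchmarks' reference code.

```python
import numpy as np

def random_node_split(num_nodes, train_frac=0.6, val_frac=0.2, seed=0):
    """Return boolean train/val/test masks for a random node split.

    Generic sketch of a 60/20/20 protocol; benchmark results usually
    average over several seeded splits, and fixed splits (e.g.
    Geom-GCN's) are loaded from files instead.
    """
    rng = np.random.default_rng(seed)
    perm = rng.permutation(num_nodes)
    n_train = int(train_frac * num_nodes)
    n_val = int(val_frac * num_nodes)

    train_mask = np.zeros(num_nodes, dtype=bool)
    val_mask = np.zeros(num_nodes, dtype=bool)
    test_mask = np.zeros(num_nodes, dtype=bool)
    train_mask[perm[:n_train]] = True
    val_mask[perm[n_train:n_train + n_val]] = True
    test_mask[perm[n_train + n_val:]] = True
    return train_mask, val_mask, test_mask

# Cornell has 183 nodes; this call is just a usage illustration.
train, val, test = random_node_split(183)
print(train.sum(), val.sum(), test.sum())  # -> 109 36 38
```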
FaithDial is a new benchmark for hallucination-free dialogues, created by editing hallucinated responses in the Wizard of Wikipedia (WoW) benchmark.
The task for this dataset is to forecast spatio-temporal traffic volume based on historical traffic volume and other features at neighboring locations.
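To make the forecasting setup concrete, below is a minimal sketch of a naive historical-average baseline on a (time × location) volume array; the array layout, Poisson toy data, and baseline are illustrative assumptions, not the dataset's schema or reference model.

```python
import numpy as np

# Hypothetical layout: traffic volume as a (num_timesteps, num_locations)
# array. The real dataset defines its own schema and extra features.
rng = np.random.default_rng(0)
volume = rng.poisson(lam=20.0, size=(168, 36)).astype(float)  # one hourly week, 36 sites

def historical_average_forecast(history, period=24):
    """Forecast the next step as the mean volume observed at the same
    time-of-day across all previous days (a common naive baseline)."""
    t = history.shape[0]                       # index of the step to predict
    same_phase = history[t % period::period]   # rows sharing that time-of-day
    return same_phase.mean(axis=0)

pred = historical_average_forecast(volume[:-1])  # predict the final hour
mae = np.abs(pred - volume[-1]).mean()
print(f"historical-average MAE: {mae:.2f}")
```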
The Jester Gesture Recognition dataset includes 148,092 labeled video clips of humans performing basic, pre-defined hand gestures in front of a laptop camera or webcam. It is designed for training machine learning models to recognize human hand gestures such as sliding two fingers down, swiping left or right, and drumming fingers.
GSV-Cities is a large-scale dataset for training deep neural networks for the task of visual place recognition.