19,997 machine learning datasets
OASum is a large-scale open-domain aspect-based summarization dataset which contains more than 3.7 million instances with around 1 million different aspects on 2 million Wikipedia pages.
Consists of 37,723 person images and 14,360 clothes images at a resolution of 256x192, with each person appearing in different poses. The data is split into 52,236 training and 10,544 test three-tuples, respectively. You can download the dataset at MPV (Google Drive).
This is a classification problem to distinguish between a signal process which produces supersymmetric particles and a background process which does not.
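Framing this as a binary signal-vs-background task, a minimal sketch of such a classifier on toy stand-in data (the real dataset has many kinematic features; the two-feature clusters below are purely illustrative) might look like:

```python
import math
import random

def logistic(z):
    return 1.0 / (1.0 + math.exp(-z))

def train_logreg(X, y, lr=0.1, epochs=200):
    """Plain SGD logistic regression: signal = 1, background = 0."""
    w = [0.0] * len(X[0])
    b = 0.0
    for _ in range(epochs):
        for xi, yi in zip(X, y):
            p = logistic(sum(wj * xj for wj, xj in zip(w, xi)) + b)
            g = p - yi  # gradient of the log-loss w.r.t. the logit
            w = [wj - lr * g * xj for wj, xj in zip(w, xi)]
            b -= lr * g
    return w, b

def predict(w, b, xi):
    return 1 if logistic(sum(wj * xj for wj, xj in zip(w, xi)) + b) >= 0.5 else 0

random.seed(0)
# Toy stand-in: "signal" events centered at +1, "background" at -1.
X = [[random.gauss(1, 0.5), random.gauss(1, 0.5)] for _ in range(200)] + \
    [[random.gauss(-1, 0.5), random.gauss(-1, 0.5)] for _ in range(200)]
y = [1] * 200 + [0] * 200
w, b = train_logreg(X, y)
acc = sum(predict(w, b, xi) == yi for xi, yi in zip(X, y)) / len(y)
```

On the real data one would of course use a train/test split and the full feature set.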
MultiSpider is a large multilingual text-to-SQL dataset which covers seven languages (English, German, French, Spanish, Japanese, Chinese, and Vietnamese).
The Argoverse 2 Lidar Dataset is a collection of 20,000 scenarios with lidar sensor data, HD maps, and ego-vehicle pose. It does not include imagery or 3D annotations. The dataset is designed to support research into self-supervised learning in the lidar domain, as well as point cloud forecasting.
Multilabeled News Dataset (MN-DS) is a dataset for news classification. It consists of 10,917 articles in 17 first-level and 109 second-level categories from 215 media sources.
ChemDisGene is a new dataset for training and evaluating multi-class, multi-label biomedical relation extraction models.
The data we use include 366 monthly series, 427 quarterly series and 518 yearly series. They were supplied by both tourism bodies (such as Tourism Australia, the Hong Kong Tourism Board and Tourism New Zealand) and various academics who had used them in previous tourism forecasting studies (please refer to the acknowledgements for details of the data sources and availability).
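A common baseline for series like these is the seasonal naive forecast, which repeats the last full seasonal cycle. A minimal sketch (the monthly values below are made up for illustration):

```python
def seasonal_naive(series, horizon, period=12):
    """Forecast by repeating the last full seasonal cycle of the series."""
    if len(series) < period:
        raise ValueError("need at least one full seasonal cycle")
    last_cycle = series[-period:]
    return [last_cycle[h % period] for h in range(horizon)]

# Three years of toy monthly tourism counts (period = 12 months).
monthly = [100, 90, 120, 130, 110, 95, 105, 140, 150, 125, 115, 160] * 3
fc = seasonal_naive(monthly, horizon=6)
# fc repeats the first half of the last observed year:
# [100, 90, 120, 130, 110, 95]
```

For quarterly series one would pass `period=4`, and for yearly series a plain naive (last value) forecast is the analogue.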
FES is an indoor dataset that can be used for evaluation of deep learning approaches. It consists of 301 top-view fisheye images from an indoor scene. Annotations include bounding boxes and instance segmentation masks for 6 classes.
DIBCO 2009 is the first International Document Image Binarization Contest organized in the context of ICDAR 2009 conference. The general objective of the contest is to identify current advances in document image binarization using established evaluation performance measures.
H-DIBCO 2010 is the International Document Image Binarization Contest which is dedicated to handwritten document images organized in conjunction with ICFHR 2010 conference. The general objective of the contest is to identify current advances in handwritten document image binarization using meaningful evaluation performance measures.
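Among the evaluation measures commonly used in these binarization contests are the F-measure and PSNR between the binarized output and the ground truth. A minimal sketch over flattened binary images (1 = foreground ink, 0 = background; the toy arrays are illustrative):

```python
import math

def f_measure(pred, gt):
    """F-measure for binary images, treating 1 as the foreground class."""
    tp = sum(p == 1 and g == 1 for p, g in zip(pred, gt))
    fp = sum(p == 1 and g == 0 for p, g in zip(pred, gt))
    fn = sum(p == 0 and g == 1 for p, g in zip(pred, gt))
    precision = tp / (tp + fp) if tp + fp else 0.0
    recall = tp / (tp + fn) if tp + fn else 0.0
    return 2 * precision * recall / (precision + recall) if precision + recall else 0.0

def psnr(pred, gt):
    """PSNR for binary images: with pixel values 0/1, the MSE is the error rate."""
    mse = sum((p - g) ** 2 for p, g in zip(pred, gt)) / len(gt)
    return float("inf") if mse == 0 else 10 * math.log10(1.0 / mse)

gt   = [1, 1, 0, 0, 1, 0, 0, 0]
pred = [1, 1, 0, 0, 0, 0, 0, 1]
fm = f_measure(pred, gt)   # 2 TP, 1 FP, 1 FN -> precision = recall = 2/3
p  = psnr(pred, gt)        # 2 of 8 pixels wrong -> MSE 0.25
```

The contests also use further measures (e.g. distance-based ones); this sketch covers only the two simplest.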
Analyzing the surgical workflow is a prerequisite for many applications in computer assisted surgery (CAS), such as context-aware visualization of navigation information, specifying the most probable tool required next by the surgeon or determining the remaining duration of surgery. Since laparoscopic surgeries are performed using an endoscopic camera, a video stream is always available during surgery, making it the obvious choice as input sensor data for workflow analysis. Moreover, this offers the opportunity for structured assessment of surgical skill for safety, teaching and quality management.
Cholecystectomy is a very common abdominal surgical procedure, almost ubiquitously performed with a laparoscopic approach and hence guided by an endoscopic video. Deep learning models for LC video analysis have been developed with the aim of assisting surgeons during interventions, improving staff awareness and readiness, and facilitating postoperative documentation and research. However, datasets and models for video semantic segmentation of LC are lacking. Recognizing fine-grained hepatocystic anatomy through semantic segmentation could help surgeons better assess the critical view of safety (CVS), a universally recommended technique that consists of clearly exposing anatomical landmarks to prevent bile duct injuries. Additionally, segmentation masks of hepatocystic structures could be leveraged by deep learning models for automatic CVS assessment and surgical action recognition to improve their performance. We believe that generating a dataset for video semantic segmentation of hepatocystic anatomy would help address this gap.
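Semantic segmentation models on such data are typically evaluated with per-class intersection-over-union (IoU). A minimal sketch over flattened label maps (the 1-D arrays and the class meaning are purely illustrative):

```python
def iou(pred, gt, cls):
    """Intersection-over-union for one class over flattened label maps."""
    inter = sum(p == cls and g == cls for p, g in zip(pred, gt))
    union = sum(p == cls or g == cls for p, g in zip(pred, gt))
    return inter / union if union else float("nan")

# Toy label maps; class 1 could stand for one anatomical structure.
gt   = [0, 1, 1, 1, 0, 2, 2, 0]
pred = [0, 1, 1, 0, 0, 2, 0, 0]
score = iou(pred, gt, 1)  # intersection 2 pixels, union 3 pixels -> 2/3
```

Averaging this over all classes gives the usual mean IoU reported for segmentation benchmarks.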
jazznet is a dataset of piano patterns for music audio machine learning research. The dataset comprises chords, arpeggios, scales, and chord progressions in all keys of an 88-key piano and in all the inversions, for a total of 162,520 labeled piano patterns, resulting in 95GB of data and more than 26k hours of audio. The data is also accompanied by Python scripts to enable the easy generation of new piano patterns beyond those present in the dataset. The data is broken down into small, medium, and large subsets, comprising 21,516, 30,328, and 52,360 patterns, respectively (with all the chords, arpeggios, and scales being present in all subsets).
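The dataset's own generation scripts are not reproduced here, but the underlying idea, building a pattern from a root note and a set of semitone intervals and mapping notes to equal-temperament frequencies, can be sketched as follows (the `MAJOR_TRIAD` interval set and function names are illustrative, not the dataset's actual API):

```python
def midi_to_hz(note):
    """Equal-temperament pitch: A4 (MIDI note 69) = 440 Hz."""
    return 440.0 * 2 ** ((note - 69) / 12)

MAJOR_TRIAD = (0, 4, 7)  # semitone intervals from the root

def chord_notes(root, intervals=MAJOR_TRIAD):
    return [root + i for i in intervals]

def chord_freqs(root, intervals=MAJOR_TRIAD):
    return [round(midi_to_hz(n), 2) for n in chord_notes(root, intervals)]

# C major triad rooted at middle C (MIDI 60): C4, E4, G4.
freqs = chord_freqs(60)
# freqs == [261.63, 329.63, 392.0]
```

Inversions and transpositions to other keys then amount to simple shifts of the interval list or the root note.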
This is the set of instances used in the PACE 2018 competition on optimal Steiner tree computation. The instances are grouped into three tracks: the first two contain 200 instances each, while the third contains only 199. Each instance is an undirected graph.
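The instances follow a SteinLib-style plain-text format (a graph section with weighted `E` edge lines and a terminals section with `T` lines). Assuming that layout, which should be checked against the competition's format specification, a minimal parser sketch:

```python
def parse_steiner(text):
    """Parse a SteinLib/PACE-style Steiner tree instance into
    (node count, weighted edge list, terminal list)."""
    edges, terminals, nodes = [], [], 0
    for line in text.splitlines():
        parts = line.split()
        if not parts:
            continue
        if parts[0] == "Nodes":
            nodes = int(parts[1])
        elif parts[0] == "E":  # edge: endpoints u, v and weight w
            edges.append((int(parts[1]), int(parts[2]), int(parts[3])))
        elif parts[0] == "T":  # terminal vertex
            terminals.append(int(parts[1]))
    return nodes, edges, terminals

instance = """SECTION Graph
Nodes 4
Edges 3
E 1 2 1
E 2 3 2
E 3 4 1
END
SECTION Terminals
Terminals 2
T 1
T 4
END
EOF"""
nodes, edges, terminals = parse_steiner(instance)
```

The solver's task is then to find a minimum-weight subtree of the graph that spans all terminals.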
Casual Conversations v2 (CCv2) comprises 5,567 participants (26,467 videos) and is intended mainly for assessing the performance of already-trained models in computer vision and audio applications, for the purposes permitted in our data license agreement. The videos feature paid individuals who agreed to participate in the project and explicitly provided their own labels for age, gender, language/dialect, geo-location, disability, physical adornments, and physical attributes. The videos were recorded in Brazil, India, Indonesia, Mexico, the Philippines, the United States, and Vietnam with a diverse set of adults in various categories. A group of trained annotators labeled the participants' apparent skin tone using the Fitzpatrick scale and the Monk scale, in addition to annotating voice timbre, activity, and recording setups. Spoken words in all videos are either scripted (a sample paragraph from The Idiot by Fyodor Dostoevsky, provided with the dataset) or non-scripted (answering one of a set of provided questions).
Med-EASi (Medical dataset for Elaborative and Abstractive Simplification) is a uniquely crowdsourced and finely annotated dataset for supervised simplification of short medical texts. It contains 1,979 expert-simple text pairs in the medical domain, spanning a total of 4,478 UMLS concepts across all text pairs. The dataset is annotated with four textual transformations: replacement, elaboration, insertion, and deletion.
ECG200
Synthetic omnidirectional multi-view image dataset. Photo-realistic images rendered with the Cycles engine.
Real-world omnidirectional multi-view image dataset.