19,997 machine learning datasets
A table is a compact and efficient form for summarizing and presenting correlative information in handwritten and printed archival documents, scientific journals, reports, financial statements and so on. Table recognition is fundamental to extracting information from structured documents. The ICDAR 2019 cTDaR evaluates two aspects of table analysis: table detection and table recognition. The participating methods will be evaluated on a modern dataset and on archival documents containing printed and handwritten tables.
Wi-Fi dataset: the dataset may be downloaded from this link. If you use this dataset, please cite the following reference:
The dataset contains images of 16 artworks included in the cultural site “Galleria Regionale di Palazzo Bellomo”. The collection covers different types of artworks, including books, sculptures and paintings. The dataset comprises three domains: i) synthetic images generated from a 3D model of the cultural site and automatically labeled during the generation process; ii) real images collected by 10 visitors with a HoloLens device and manually labeled; iii) real images collected by the same visitors with a GoPro and manually labeled.
Dataset link: https://github.com/leggedrobotics/cerberus_darpa_subt_datasets
This is a transnational data set which contains all the transactions occurring between 01/12/2010 and 09/12/2011 for a UK-based and registered non-store online retailer. The company mainly sells unique all-occasion gifts. Many customers of the company are wholesalers. https://archive.ics.uci.edu/ml/datasets/online+retail
The Consented Activities of People (CAP) dataset is a fine-grained activity dataset for visual AI research curated using the Visym Collector platform. The CAP dataset contains annotated videos of fine-grained activity classes of consented people. Videos are recorded from mobile devices around the world from a third-person viewpoint looking down on the scene from above, containing subjects performing everyday activities. Videos are annotated with bounding box tracks around the primary actor along with temporal start/end frames for each activity instance, and distributed in vipy json format. An interactive visualization and video summary are available for review on the dataset distribution site.
We randomly selected three videos from the Internet that are longer than 1.5K frames and in which the main objects appear continuously. Each video has 20 uniformly sampled frames manually annotated for evaluation. Each video has been played back and forth to generate videos that are three times as long.
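The two frame-selection steps described above can be sketched as follows. This is only an illustration under stated assumptions: the description does not say whether the turning-point frames are repeated when the video is played back and forth, nor how the 20 uniformly sampled frames are rounded to integer indices. The function names `back_and_forth_indices` and `uniform_sample_indices` are hypothetical.

```python
def back_and_forth_indices(num_frames: int) -> list[int]:
    """Frame order for forward, backward, forward playback.

    Assumption: the end frames are repeated at each turn, so the result
    is exactly three times as long as the original video.
    """
    forward = list(range(num_frames))
    backward = forward[::-1]
    return forward + backward + forward


def uniform_sample_indices(num_frames: int, num_samples: int = 20) -> list[int]:
    """Evenly spaced frame indices that include the first and last frame.

    Assumption: spacing is computed over the full video and rounded to
    the nearest integer index.
    """
    if num_samples < 2 or num_frames < num_samples:
        raise ValueError("need at least 2 samples and num_frames >= num_samples")
    step = (num_frames - 1) / (num_samples - 1)
    return [round(i * step) for i in range(num_samples)]


if __name__ == "__main__":
    extended = back_and_forth_indices(1500)
    print(len(extended))  # length of the extended video in frames
    print(uniform_sample_indices(1500)[:3])  # first annotated frame indices
```

If the turning-point frames were not repeated, the extended video would instead have `3 * num_frames - 2` frames.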
Full-text chemical identification and indexing in PubMed articles.
The first ultra-high-definition image demoireing dataset, consisting of 4,500 4K resolution training pairs and 500 standard 4K resolution validation pairs.
The breast lesion detection in ultrasound videos dataset uses a clip-level and video-level feature aggregated network (CVA-Net) and consists of 188 ultrasound videos, of which 113 are labeled malignant and 75 benign. In total the videos comprise 25,272 ultrasound images, with the number of images per video varying from 28 to 413. 150 videos were used for training and 38 for testing. The primary intended use case is computer-aided breast cancer diagnosis, in systems that assist radiologists.
This 27-class American Sign Language-based dataset consists of photographs collected from 173 individuals asked to display gestures with their hands. The photographs were captured with a camera at a frame size of 3024 by 3024 pixels in RGB color space. 130 photos were taken from each person, 5 per class (minor variations in per-class sample sizes can be observed): 26 classes containing phrases, letters, and numbers, plus a 27th null class made up of 314 images for control purposes. The main motivation was to contribute to technology development use cases that can reduce the communication challenges faced by speech-impaired people, with new data that meets the diversity and sample size needed for intelligent computer vision studies and sign language applications.
AIH is created for hand deocclusion and removal.
A newly proposed dataset for point cloud salient object detection that has 2000 training samples and 872 testing samples.
We present a novel Animation CelebHeads dataset (AnimeCeleb) to address animation head reenactment. Different from previous animation head datasets, we utilize 3D animation models as the controllable image samplers, which can provide a large amount of head images with their corresponding detailed pose annotations. To facilitate the data creation process, we build a semi-automatic pipeline leveraging open 3D computer graphics software with a developed annotation system. After training with AnimeCeleb, recent head reenactment models produce high-quality animation head reenactment results, which are not achievable with existing datasets. Furthermore, motivated by metaverse applications, we propose a novel pose mapping method and architecture to tackle a cross-domain head reenactment task. During inference, a user can easily transfer their motion to an arbitrary animation head. Experiments demonstrate the usefulness of AnimeCeleb to train animation head reenactment models, and t
GBUSV is an unannotated dataset consisting of ultrasound videos of patients with either a malignant or a non-malignant gallbladder. The ultrasound videos were obtained from patients referred to the radiology department of PGIMER, Chandigarh (a high-input hospital in Northern India) for abdominal ultrasound examinations of suspected gallbladder pathologies. Patients had fasted for at least 6 hours. A 1-5 MHz curved array transducer (C-1-5D, Logiq S8, GE Healthcare) was used. The scanning was intended to include the entire gallbladder and the lesion or pathology. The length of the video sequences varies from 43 to 888 frames. The dataset consists of 32 malignant and 32 non-malignant videos containing a total of 12,251 and 3,549 frames, respectively. The video frames are cropped from the center to anonymize the patient information and annotations. The processed frames are 360x480 pixels.
According to the WHO World report on vision 2019, the number of visually impaired people worldwide is estimated to be 2.2 billion, of whom at least 1 billion have a vision impairment that could have been prevented or is yet to be addressed. The world faces considerable challenges in terms of eye care, including inequalities in the coverage and quality of prevention, treatment, and rehabilitation services. Early detection and diagnosis of ocular pathologies would help forestall visual impairment. One challenge that limits the adoption of computer-aided diagnosis tools by ophthalmologists is that sight-threatening rare pathologies, such as central retinal artery occlusion or anterior ischemic optic neuropathy, are usually ignored. In the past two decades, many publicly available datasets of color fundus images have been collected with a primary focus on diabetic retinopathy, glaucoma, and age-related macular degeneration, and a few other frequent pathologies. The challe
We introduce a new dataset of annotated surveillance videos of freely moving people taken from a distance in both indoor and outdoor scenes. The videos are captured with multiple cameras placed in eight different daily environments. People in the videos undergo large pose variations and are frequently occluded by various environmental factors. Most important, their eyes are mostly not clearly visible as is often the case in surveillance videos. We introduce the first rigorously annotated dataset of 3D gaze directions of freely moving people captured from afar.
A Stated Preference Survey on mode choice https://transp-or.epfl.ch/documents/technicalReports/CS_SwissmetroDescription.pdf
The 2048 game task involves training an agent to achieve high scores in the game 2048 (Wikipedia)
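To make the scoring concrete, here is a minimal sketch of the core move in the standard 2048 rules: sliding one row to the left, merging equal neighbours once, and adding each merged value to the score. It illustrates the game mechanic only and is not the task's environment or API; the function name `merge_row_left` is hypothetical, and empty cells are assumed to be encoded as 0.

```python
def merge_row_left(row: list[int]) -> tuple[list[int], int]:
    """Slide one 2048 row to the left and return (new_row, score_gained).

    Assumption: 0 marks an empty cell. Each tile merges at most once
    per move, as in the standard game.
    """
    tiles = [value for value in row if value != 0]  # drop empty cells
    merged: list[int] = []
    score = 0
    i = 0
    while i < len(tiles):
        if i + 1 < len(tiles) and tiles[i] == tiles[i + 1]:
            # Two equal neighbours combine into one tile of double value.
            merged.append(tiles[i] * 2)
            score += tiles[i] * 2
            i += 2
        else:
            merged.append(tiles[i])
            i += 1
    merged.extend([0] * (len(row) - len(merged)))  # pad back to row length
    return merged, score


if __name__ == "__main__":
    print(merge_row_left([2, 2, 4, 0]))
    print(merge_row_left([2, 2, 2, 2]))
```

The other three directions can be obtained by reversing or transposing the board before applying the same function.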
The raw data are obtained from an industrial plant for ultra-processed food production. Sampling was carried out every 5 minutes, while the total production cycle takes approximately 3 hours, from raw ingredients to final semi-finished products. The extracted data represent approximately 80 days of production. Variables 2 to 14 belong to 4 specific phases of the process and influence the qualitative variable 17. Variables 15 and 16 are external variables not controlled by the process which affect the final product. It should also be noted that some variation may be due to changes in raw materials, in production flow (variable 1) or to possible reconfiguration between weeks. However, while the magnitude of effects may change between weeks, the causal relationships are dictated by the plant and process dynamics and are consistent (barring potential unobserved confounders and faults) throughout production.
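The sampling figures above imply a rough size for the data, which the following sketch works out. The per-cycle count follows directly from the stated 5-minute interval and 3-hour cycle. The total row count is only an upper bound, since it assumes round-the-clock production with no gaps over the 80 days, which the description does not state. All names are illustrative.

```python
SAMPLING_INTERVAL_MIN = 5     # one sample every 5 minutes (from the description)
CYCLE_DURATION_MIN = 3 * 60   # production cycle of roughly 3 hours
PRODUCTION_DAYS = 80          # roughly 80 days of extracted data


def samples_per_cycle(cycle_min: int = CYCLE_DURATION_MIN,
                      interval_min: int = SAMPLING_INTERVAL_MIN) -> int:
    """Number of samples that span one full production cycle."""
    return cycle_min // interval_min


def max_total_samples(days: int = PRODUCTION_DAYS,
                      interval_min: int = SAMPLING_INTERVAL_MIN) -> int:
    """Upper bound on rows, assuming continuous 24-hour operation (an assumption)."""
    return days * 24 * 60 // interval_min


if __name__ == "__main__":
    print(samples_per_cycle())   # samples covering one raw-to-semi-finished cycle
    print(max_total_samples())   # upper bound on the number of rows
```

One practical consequence: an effect of an early-phase variable on the quality variable 17 may appear with a lag of up to one cycle, that is, up to `samples_per_cycle()` rows later.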