Datasets

19,997 machine learning datasets

ICDAR 2019 (cTDaR)

Tables are a compact and efficient way to summarize and present correlative information in handwritten and printed archival documents, scientific journals, reports, financial statements, and so on. Table recognition is fundamental to extracting information from structured documents. The ICDAR 2019 cTDaR competition evaluates two aspects of table analysis: table detection and table recognition. Participating methods are evaluated on a modern dataset and on archival documents containing printed and handwritten tables.

2 papers | 1 benchmark

wifi_data (WiFi Data for HMM Anomaly Detection)

Wi-Fi dataset: the dataset may be downloaded from this link. If you use this dataset, please cite the following reference:

2 papers | 0 benchmarks

OBJ-MDA

The dataset contains images of 16 artworks in the cultural site “Galleria Regionale di Palazzo Bellomo”. The collection covers different types of artworks, including books, sculptures, and paintings. The dataset covers three domains: i) synthetic images generated from a 3D model of the cultural site and automatically labeled during the generation process; ii) real images collected by 10 visitors with a HoloLens device and manually labeled; iii) real images collected by the same visitors with a GoPro and manually labeled.

2 papers | 1 benchmark

CERBERUS DARPA Subterranean Challenge Datasets

Dataset link: https://github.com/leggedrobotics/cerberus_darpa_subt_datasets

2 papers | 0 benchmarks

Online retail dataset

This is a transnational dataset containing all transactions that occurred between 01/12/2010 and 09/12/2011 for a UK-based, registered non-store online retailer. The company mainly sells unique all-occasion gifts. Many of its customers are wholesalers. https://archive.ics.uci.edu/ml/datasets/online+retail

2 papers | 0 benchmarks | Texts

CAP (Consented Activities of People)

The Consented Activities of People (CAP) dataset is a fine-grained activity dataset for visual AI research, curated using the Visym Collector platform. It contains annotated videos of fine-grained activity classes performed by consenting people. Videos are recorded on mobile devices around the world from a third-person viewpoint looking down on the scene from above, and show subjects performing everyday activities. Videos are annotated with bounding box tracks around the primary actor, along with temporal start/end frames for each activity instance, and are distributed in vipy json format. An interactive visualization and video summary is available for review on the dataset distribution site.

2 papers | 0 benchmarks | Videos

Long Video Dataset (3X)

We randomly selected three videos from the Internet that are longer than 1.5K frames and in which the main objects appear continuously. Each video has 20 uniformly sampled frames manually annotated for evaluation. Each video has been played back and forth to generate a video three times as long.
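The back-and-forth playback described above can be sketched as follows. This is a minimal illustration, not the authors' code: the function name `play_back_and_forth` is hypothetical, and it assumes the extended video is simply a forward pass, a reversed pass, and another forward pass concatenated, with boundary frames repeated at each turn.

```python
def play_back_and_forth(frames, passes=3):
    """Concatenate alternating forward and reversed passes over `frames`.

    Assumption: boundary frames are repeated at each turn-around, so the
    output is exactly `passes` times as long as the input.
    """
    extended = []
    for i in range(passes):
        # Even-numbered passes run forward, odd-numbered passes run backward.
        extended.extend(frames if i % 2 == 0 else frames[::-1])
    return extended


# Usage: frame indices stand in for actual video frames.
clip = list(range(5))
extended_clip = play_back_and_forth(clip)
print(len(clip), len(extended_clip))
```

With the default of three passes, the output is three times the input length, matching the "3X" in the dataset name.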

2 papers | 9 benchmarks

BC7 NLM-Chem (BioCreative VII NLM-Chem)

Full-text chemical identification and indexing in PubMed articles.

2 papers | 3 benchmarks | Biomedical, Texts

UHDM

The first ultra-high-definition image demoireing dataset, consisting of 4,500 4K resolution training pairs and 500 standard 4K resolution validation pairs.

2 papers | 2 benchmarks

Breast Lesion Detection in Ultrasound Videos (CVA-Net)

The breast lesion detection in ultrasound videos dataset is used with a clip-level and video-level feature aggregated network (CVA-Net). It consists of 188 ultrasound videos, of which 113 are labeled malignant and 75 benign. Together they contain 25,272 ultrasound images, with the number of images per video ranging from 28 to 413. 150 videos were used for training and 38 for testing. The primary intended use is computer-aided breast cancer diagnosis, supporting systems that assist radiologists.

2 papers | 0 benchmarks | Images, Medical, Videos

27 Class ASL Sign Language (27 Class American Sign Language-Based Dataset)

This 27-class American Sign Language-based dataset consists of photographs collected from 173 individuals who were asked to display gestures with their hands. The photographs were taken with a camera at a frame size of 3024 by 3024 pixels in RGB color space. 130 photos were taken of each person, 5 per class (sample sizes vary slightly between classes): 26 classes contain phrases, letters, and numbers, and a 27th null class of 314 images serves as a control. The main motivation was to contribute to technology that can reduce the communication challenges faced by speech-impaired people, providing new data with the diversity and sample size needed for intelligent computer vision studies and sign language applications.

2 papers | 0 benchmarks

AIH (Amodal InterHand)

AIH is created for hand deocclusion and removal.

2 papers | 0 benchmarks

PCSOD

A newly proposed dataset for point cloud salient object detection, with 2,000 training samples and 872 testing samples.

2 papers | 0 benchmarks

AnimeCeleb

We present a novel Animation CelebHeads dataset (AnimeCeleb) to address animation head reenactment. Unlike previous animation head datasets, we use 3D animation models as controllable image samplers, which can provide a large number of head images with corresponding detailed pose annotations. To facilitate the data creation process, we build a semi-automatic pipeline that combines open 3D computer graphics software with a purpose-built annotation system. After training on AnimeCeleb, recent head reenactment models produce high-quality animation head reenactment results that are not achievable with existing datasets. Furthermore, motivated by metaverse applications, we propose a novel pose mapping method and architecture to tackle a cross-domain head reenactment task. During inference, a user can easily transfer their motion to an arbitrary animation head. Experiments demonstrate the usefulness of the AnimeCeleb to train animation head reenactment models, and t

2 papers | 0 benchmarks | Images

GBUSV (Gallbladder Ultrasound Videos)

GBUSV is an unannotated dataset of ultrasound videos of patients with either a malignant or a non-malignant gallbladder. The ultrasound videos were obtained from patients referred to the radiology department of PGIMER, Chandigarh (a high-input hospital in Northern India) for abdominal ultrasound examinations of suspected gallbladder pathologies. Patients had fasted for at least 6 hours. A 1-5 MHz curved array transducer (C-1-5D, Logiq S8, GE Healthcare) was used. The scans were intended to include the entire gallbladder and the lesion or pathology. The length of the video sequences varies from 43 to 888 frames. The dataset consists of 32 malignant and 32 non-malignant videos containing a total of 12,251 and 3,549 frames, respectively. The video frames are cropped from the center to remove patient information and annotations. The processed frames are 360x480 pixels.

2 papers | 0 benchmarks

RFMiD (Retinal Fundus MultiDisease Image Dataset)

According to the WHO World report on vision 2019, an estimated 2.2 billion people worldwide are visually impaired, of whom at least 1 billion have a vision impairment that could have been prevented or is yet to be addressed. The world faces considerable challenges in eye care, including inequalities in the coverage and quality of prevention, treatment, and rehabilitation services. Early detection and diagnosis of ocular pathologies would help forestall visual impairment. One challenge that limits the adoption of computer-aided diagnosis tools by ophthalmologists is that rare sight-threatening pathologies, such as central retinal artery occlusion or anterior ischemic optic neuropathy, are usually ignored. In the past two decades, many publicly available datasets of color fundus images have been collected with a primary focus on diabetic retinopathy, glaucoma, age-related macular degeneration, and a few other frequent pathologies. The challe

2 papers | 0 benchmarks

GAFA (Gaze from afar dataset)

We introduce a new dataset of annotated surveillance videos of freely moving people taken from a distance in both indoor and outdoor scenes. The videos are captured with multiple cameras placed in eight different daily environments. People in the videos undergo large pose variations and are frequently occluded by various environmental factors. Most important, their eyes are mostly not clearly visible as is often the case in surveillance videos. We introduce the first rigorously annotated dataset of 3D gaze directions of freely moving people captured from afar.

2 papers | 0 benchmarks | 3D, Videos

Swissmetro

A stated preference survey on mode choice: https://transp-or.epfl.ch/documents/technicalReports/CS_SwissmetroDescription.pdf

2 papers | 0 benchmarks

The Game of 2048

The 2048 game task involves training an agent to achieve high scores in the game 2048 (Wikipedia)

2 papers | 1 benchmark | Environment

Ultra-processed Food Dataset

The raw data were obtained from an industrial plant for ultra-processed food production. Samples were taken every 5 minutes, while the total production cycle takes approximately 3 hours, from raw ingredients to final semi-finished products. The extracted data represent approximately 80 days of production. Variables 2-14 belong to 4 specific phases of the process and influence the qualitative variable 17. Variables 15 and 16 are external variables that are not controlled by the process and that affect the final product. Some variation may be due to changes in raw materials, in production flow (variable 1), or to possible reconfiguration between weeks. However, while the magnitude of the effects may change between weeks, the causal relationships are dictated by the plant and process dynamics and are consistent throughout production (barring potential unobserved confounders and faults).

2 papers | 0 benchmarks | Graphs, Time series
