3,275 machine learning datasets
MIMIC-CXR-LT. We construct a single-label, long-tailed version of MIMIC-CXR in a similar manner. MIMIC-CXR is a multi-label classification dataset with over 200,000 chest X-rays labeled with 13 pathologies and a “No Findings” class. The resulting MIMIC-CXR-LT dataset contains 19 classes, of which 10 are head classes, 6 are medium classes, and 3 are tail classes. MIMIC-CXR-LT contains 111,792 images labeled with one of 18 diseases, with 87,493 training images and 23,550 test set images. The validation and balanced test sets contain 15 and 30 images per class, respectively.
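As a rough illustration of how classes in a long-tailed dataset can be grouped into head, medium, and tail bins, the sketch below buckets classes by training-image count. The cutoffs (more than 100 images for head, fewer than 20 for tail) and the `split_long_tail` helper are assumptions made for illustration; the description above does not state the thresholds used for MIMIC-CXR-LT, and the class names and counts in the example are invented.

```python
from collections import Counter


def split_long_tail(labels, head_min=100, tail_max=20):
    """Group class names into head/medium/tail bins by training-image count.

    head_min and tail_max are assumed cutoffs, not values from the dataset
    description: head has more than head_min images, tail has fewer than
    tail_max, and everything in between is medium.
    """
    counts = Counter(labels)
    bins = {"head": [], "medium": [], "tail": []}
    # Visit classes from most to least frequent so each bin is sorted by size.
    for name, n in sorted(counts.items(), key=lambda kv: -kv[1]):
        if n > head_min:
            bins["head"].append(name)
        elif n < tail_max:
            bins["tail"].append(name)
        else:
            bins["medium"].append(name)
    return bins


# Toy single-label training set with invented counts (not real MIMIC-CXR-LT data).
toy_labels = ["No Findings"] * 500 + ["class_b"] * 60 + ["class_c"] * 5
toy_bins = split_long_tail(toy_labels)
print(toy_bins)
```

Applied to the real training labels, a grouping of this kind would yield the head, medium, and tail class lists, provided the cutoffs match those chosen by the dataset authors.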
ARMBench is a large-scale, object-centric benchmark dataset for robotic manipulation in the context of a warehouse. ARMBench contains images, videos, and metadata that correspond to 235K+ pick-and-place activities on 190K+ unique objects. The data is captured at different stages of manipulation, i.e., pre-pick, during transfer, and after placement.
Accurate 3D human pose estimation is essential for sports analytics, coaching, and injury prevention. However, existing datasets for monocular pose estimation do not adequately capture the challenging and dynamic nature of sports movements. In response, we introduce SportsPose, a large-scale 3D human pose dataset consisting of highly dynamic sports movements. With more than 176,000 3D poses from 24 different subjects performing 5 different sports activities, SportsPose provides a diverse and comprehensive set of 3D poses that reflect the complex and dynamic nature of sports movements. Contrary to other markerless datasets, we have quantitatively evaluated the precision of SportsPose by comparing our poses with a commercial marker-based system and achieved a mean error of 34.5 mm across all evaluation sequences. This is comparable to the error reported on the commonly used 3DPW dataset. We further introduce a new metric, local movement, which describes the movement of the wrist and ankle
CDS2K is a benchmark for concealed scene understanding (CSU), an active computer vision topic that aims to perceive objects with camouflaged properties. It is a concealed defect segmentation dataset built from five well-known defect segmentation databases: MVTecAD, NEU, CrackForest, KolektorSDD, and MagneticTile. The defective regions are highlighted with red rectangles.
The LIMUC dataset is the largest publicly available labeled ulcerative colitis dataset. It comprises 11,276 images from 564 patients and 1,043 colonoscopy procedures. Three experienced gastroenterologists were involved in the annotation process, and all images are labeled according to the Mayo endoscopic score (MES).
OVQA contains 19,020 medical visual question and answer pairs generated from 2,001 medical images collected from 2,212 EMRs in Orthopedics.
A Game Of Sorts is a collaborative image ranking task. Players are asked to rank a set of images based on a given sorting criterion. The game provides a framework for the evaluation of visually grounded language understanding and generation of referring expressions in multimodal dialogue settings.
Contactless fingerprint matching using smartphone cameras can alleviate major challenges of traditional fingerprint systems including hygienic acquisition, portability and presentation attacks. However, development of practical and robust contactless fingerprint matching techniques is constrained by the limited availability of large scale real-world datasets. To motivate further advances in contactless fingerprint matching across sensors, we introduce the RidgeBase benchmark dataset. RidgeBase consists of more than 15,000 contactless and contact-based fingerprint image pairs acquired from 88 individuals under different background and lighting conditions using two smartphone cameras and one flatbed contact sensor. Unlike existing datasets, RidgeBase is designed to promote research under different matching scenarios that include Single Finger Matching and Multi-Finger Matching for both contactless-to-contactless (CL2CL) and contact-to-contactless (C2CL) verification and identification. F
The Small Object Detection for Spotting Birds (SOD4SB) dataset consists of 39,070 images containing 137,121 bird instances. The SOD4SB dataset covers a wide variety of small bird types and scenes.
The ETHEC dataset includes 47,978 butterfly images with a 4-level label hierarchy: family, sub-family, genus, and species. The hierarchy spans 6 families -> 21 sub-families -> 135 genera -> 561 species.
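A 4-level hierarchy like this is often handled by storing one label path per image and counting the distinct labels at each level. The sketch below assumes a hypothetical `(family, sub-family, genus, species)` tuple layout and uses made-up placeholder names; it does not reflect the actual ETHEC file format.

```python
LEVELS = ("family", "subfamily", "genus", "species")


def labels_per_level(samples):
    """Count distinct labels at each hierarchy level.

    A label is identified by its full path from the root, so two species
    with the same name under different genera are counted separately.
    """
    seen = {level: set() for level in LEVELS}
    for path in samples:
        for depth, level in enumerate(LEVELS):
            # The prefix of the path up to this depth identifies the label.
            seen[level].add(path[: depth + 1])
    return {level: len(paths) for level, paths in seen.items()}


# Placeholder label paths (hypothetical names, not real ETHEC labels).
samples = [
    ("fam_a", "sub_a1", "gen_x", "sp_1"),
    ("fam_a", "sub_a1", "gen_x", "sp_2"),
    ("fam_a", "sub_a2", "gen_y", "sp_3"),
    ("fam_b", "sub_b1", "gen_z", "sp_4"),
]
level_counts = labels_per_level(samples)
print(level_counts)
```

Run over the full set of image labels, a count of this kind would be expected to give 6, 21, 135, and 561 for the four levels.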
The dataset consists of 92 groups of video clips with 113,918 high-resolution frames taken by two drones and 63 groups of video clips with 145,875 high-resolution frames taken by three drones.
For change detection tasks, current open-source datasets mainly focus on building extraction (e.g., WHU building dataset and LEVIR-CD dataset) (Chen and Shi, 2020; Ji et al., 2018) and urban development monitoring (e.g., SECOND dataset, Google dataset and CDD dataset) (Yang et al., 2022; Peng et al., 2021; Lebedev et al., 2018), whereas datasets for natural disaster monitoring have been seldom investigated.
The dataset consists of over 350,000 public domain patent drawings collected from the United States Patent and Trademark Office (USPTO). The drawings come from a total of 45,000 design patents published between January 2018 and June 2019.
OVDEval includes 9 sub-tasks and introduces evaluations on commonsense knowledge, attribute understanding, position understanding, object relation comprehension, and more. The dataset is meticulously created to provide hard negatives that challenge models' true understanding of visual and linguistic input.
Pretrain: 200k; Instruction: 100k
The classification and recognition of foliar diseases is a growing field of research in which machine learning and deep learning are used to support agricultural stakeholders. Datasets are the fuel for the development of these technologies. In this paper, we release and make publicly available a field dataset collected to diagnose and monitor plant symptoms, called DiaMOS Plant, consisting of 3,505 images of pear fruit and leaves affected by four diseases. In addition, we perform a comparative analysis of existing literature datasets designed for the classification and recognition of leaf diseases, highlighting the main features that maximize the value and information content of the collected data. This study provides guidelines that will be useful to the research community in the context of the selection and construction of datasets.
Psychological trait estimation from external factors such as movement and appearance is a challenging and long-standing problem in psychology, and is principally based on the psychological theory of embodiment. To date, attempts to tackle this problem have utilized private small-scale datasets with intrusive body-attached sensors. Potential applications of an automated system for psychological trait estimation include estimation of occupational fatigue and psychology, and marketing and advertisement. In this work, we propose PsyMo (Psychological traits from Motion), a novel, multi-purpose and multi-modal dataset for exploring psychological cues manifested in walking patterns. We gathered walking sequences from 312 subjects in 7 different walking variations and 6 camera angles. In conjunction with walking sequences, participants filled in 6 psychological questionnaires, totalling 17 psychometric attributes related to personality, self-esteem, fatigue, aggressiveness and mental health. W
A SAR version of the EuroSAT dataset. The images were collected from Sentinel-1 GRD products (two bands, VV and VH) based on the geocoordinates of the EuroSAT images.