19,997 machine learning datasets
The PAX-Ray++ dataset uses pseudo-labeled thorax CTs to enable the segmentation of anatomy in chest X-rays. By projecting the CTs to a 2D plane, we gather fine-grained annotated images resembling radiographs. It contains 7,377 frontal and lateral view images each with 157 anatomy classes and over 2 million annotated instances.
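The projection step described above can be illustrated with a minimal sketch. It is not the PAX-Ray++ pipeline: the mean-intensity projection, the "any voxel along the ray" rule for labels, the axis convention, and the function names `project_volume` and `project_mask` are all assumptions made for illustration, and the volume is random data.

```python
import numpy as np


def project_volume(ct, axis):
    """Collapse a 3D volume to 2D by averaging intensities along one axis.

    Assumption: a plain mean projection stands in for whatever projection
    the dataset authors actually used.
    """
    return ct.mean(axis=axis)


def project_mask(mask, axis):
    """Project a binary 3D label mask to 2D.

    A pixel is labeled if any voxel along the projection ray is labeled.
    """
    return mask.any(axis=axis)


# Toy volume standing in for a thorax CT (random values, not real data).
rng = np.random.default_rng(0)
ct = rng.normal(size=(8, 16, 16))

# Toy pseudo-label for a single anatomy class.
mask = np.zeros(ct.shape, dtype=bool)
mask[2:5, 4:9, 6:10] = True

# Hypothetical axis convention: axis 0 gives one view, axis 2 the other.
frontal = project_volume(ct, axis=0)
lateral = project_volume(ct, axis=2)
frontal_mask = project_mask(mask, axis=0)
lateral_mask = project_mask(mask, axis=2)

print(frontal.shape, lateral.shape)
print(int(frontal_mask.sum()), int(lateral_mask.sum()))
```

Projecting the image and its label mask along the same axis keeps the 2D annotation aligned with the 2D image, which is what makes 3D pseudo-labels usable as 2D segmentation targets.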
PCam200 is a public pathological H&E image dataset derived from Patch Camelyon, with 512 px patches spanning 200 microns, made in the same manner from the Camelyon2016 challenge dataset.
SegPANDA200 is a public pathological H&E image dataset for the segmentation task of the PANDA challenge, with 512 px patches spanning 200 microns, made in the same manner from the PANDA challenge dataset.
Intraoral 3D scan analysis is a fundamental aspect of Computer-Aided Dentistry (CAD) systems, playing a crucial role in various dental applications, including teeth segmentation, detection, labeling, and dental landmark identification. Accurate analysis of 3D dental scans is essential for orthodontic and prosthetic treatment planning, as it enables automated processing and reduces the need for manual adjustments by dental professionals. However, developing robust automated tools for these tasks remains a significant challenge due to the limited availability of high-quality public datasets and benchmarks. This article introduces Teeth3DS+, the first comprehensive public benchmark designed to advance the field of intraoral 3D scan analysis. Developed as part of the 3DTeethSeg 2022 and 3DTeethLand 2024 MICCAI challenges, Teeth3DS+ aims to drive research in teeth identification, segmentation, labeling, 3D modeling, and dental landmark identification. The dataset includes at least 1,800 i
This dataset presents a set of large-scale ridesharing Dial-a-Ride Problem (DARP) instances. The instances were created as a standardized set of ridesharing DARP problems for the purpose of benchmarking and comparing different solution methods.
FaMoS is a dynamic 3D head dataset from 95 subjects, each performing 28 motion sequences. The sequences comprise six prototypical expressions (i.e., Anger, Disgust, Fear, Happiness, Sadness, and Surprise), two head rotations (left/right and up/down), and diverse facial motions, including extreme and asymmetric expressions. Each sequence is recorded at 60 fps. In total, FaMoS contains around 600K 3D head meshes (i.e., ~225 frames per sequence). For each frame, registrations in FLAME meshes are publicly available.
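The "around 600K" figure follows from the other numbers in the description. A short check, using only the quoted values and treating the approximate 225 frames per sequence as exact:

```python
# Figures quoted in the FaMoS description.
subjects = 95
sequences_per_subject = 28
frames_per_sequence = 225  # approximate average ("~225" in the description)
fps = 60

total_sequences = subjects * sequences_per_subject
total_meshes = total_sequences * frames_per_sequence
seconds_per_sequence = frames_per_sequence / fps

print(total_sequences)       # 2660
print(total_meshes)          # 598500
print(seconds_per_sequence)  # 3.75
```

So the dataset holds 2,660 sequences of roughly 3.75 seconds each, and about 598,500 meshes, which rounds to the stated 600K.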
A corpus of GPT-generated and human-written academic abstracts with over 600k samples in Computer Science, Physics, and Humanities.
The YouTube8M-MusicTextClips dataset consists of over 4k high-quality human text descriptions of music found in video clips from the YouTube8M dataset.
A dataset of cartoon video clips. For each video clip, the presence or absence of each feature was marked by the annotators.
Our proposed Synthetic-to-Real benchmark for more practical visual DA (termed S2RDA) includes two challenging transfer tasks, S2RDA-49 and S2RDA-MS-39. In each task, source/synthetic domain samples are synthesized by rendering 3D models from ShapeNet. The 3D models used share the label space of the target/real domain, and each class has 12K rendered RGB images. The real domain of S2RDA-49 comprises 60,535 images of 49 classes, collected from the ImageNet validation set, ObjectNet, the VisDA-2017 validation set, and the web. For S2RDA-MS-39, the real domain collects 41,735 natural images for 39 classes exclusively from MetaShift, which contain complex and distinct contexts, e.g., object presence (co-occurrence of different objects), general contexts (indoor or outdoor), and object attributes (color or shape), leading to a much harder task. Compared to VisDA-2017, our S2RDA contains more categories, more realistically synthesized source domain data coming for free, and more complicated targ
DACCORD is a new dataset dedicated to the task of automatically detecting contradictions between sentences in French.
State-level data for the US economy through the lens of consumer spending (credit/debit spending). The dataset is enriched with state-level economic dynamics and policy responses. Specifically, state-level policies are included as an indication of extreme events (e.g., the state’s business closure order).
The development of Complex-Valued (CV) deep learning architectures has enabled us to exploit the amplitude and phase components of CV Synthetic Aperture Radar (SAR) data. However, most of the available annotated SAR datasets provide only the amplitude information (only detected SAR data) and disregard the phase information. The lack of high-quality and large-scale annotated CV-SAR datasets is a significant challenge for developing CV deep learning algorithms in remote sensing. To tackle this problem, a large-scale semantically annotated CV-SAR dataset is developed using the Single Look Complex (SLC) StripMap (SM) Sentinel-1 (S1) SAR data in two polarization channels (HH and HV) for Complex-Valued Deep Learning applications (S1SLC_CVDL). The S1SLC_CVDL dataset comprises 276,571 CV-SAR patches (100×100 pixels), derived from three scenes acquired over Chicago and Houston in the United States, and Sao Paulo in Brazil in May 2021. These three scenes are selected to cov
Estimating camera motion in deformable scenes poses a complex and open research challenge. Most existing non-rigid structure from motion techniques assume that static scene parts are observed alongside deforming ones in order to establish an anchoring reference. However, this assumption does not hold true in certain relevant application cases such as endoscopies. To tackle this issue with a common benchmark, we introduce the Drunkard’s Dataset, a challenging collection of synthetic data targeting visual navigation and reconstruction in deformable environments. This dataset is the first large set of exploratory camera trajectories with ground truth inside 3D scenes where every surface exhibits non-rigid deformations over time. Simulations in realistic 3D buildings let us obtain a vast amount of data and ground truth labels, including camera poses, RGB images and depth, optical flow and normal maps at high resolution and quality.
A large-scale benchmark with 1605 high-resolution, well-annotated images, featuring more complex scenes and a wider range of DOF settings.
Single cortical neurons as deep artificial neural networks. This dataset contains training and testing subsets of the input/output relationship of a single cortical layer 5 pyramidal cell (L5PC) neuron at 1 ms single spike temporal resolution. The data is obtained via a simulation that contains all of the currently (2021) known and well modeled "messy biological details" that relate to the operation of single neurons in the brain.
MiniWob++ is a suite of web-browser-based tasks introduced in Liu et al. (2018), an extension of the earlier MiniWob task suite (Shi et al., 2017). Tasks range from simple button clicking to complex form-filling, for example, booking a flight when given particular instructions (Fig. 1a). Programmatic rewards are available for each task, permitting standard reinforcement learning techniques.
Synthetic dataset of over 13,000 images of damaged and intact parcels with full 2D and 3D annotations in the COCO format. For details see our paper and for visual samples our project page.
Kang et al.'s Markovian model for treatment adherence in obstructive sleep apnea.