The Autonomous-driving StreAming Perception (ASAP) benchmark evaluates the online performance of vision-centric perception in autonomous driving. It extends the 2Hz annotated nuScenes dataset by generating high-frame-rate labels for the 12Hz raw images.
RegDB-C* is an evaluation set consisting of algorithmically generated corruptions applied to the RegDB test set, affecting both the visible and the thermal data. In contrast to the RegDB-C dataset proposed by Chen et al. in the paper "Benchmarks for Corruption Invariant Person Re-identification", our dataset is used in a multimodal manner and does not consider visible-data corruptions only. The corruptions used are largely the same. Noise: Gaussian, shot, impulse, and speckle; Blur: defocus, frosted glass, motion, zoom, and Gaussian; Weather: snow, frost, fog, brightness, spatter, and rain; Digital: contrast, elastic, pixel, JPEG compression, and saturate. However, the corruptions are adapted to respect the thermal modality encoding, and brightness is not used to corrupt the thermal data. Five severity levels are considered per corruption.
SYSU-MM01-C* is an evaluation set consisting of algorithmically generated corruptions applied to the SYSU-MM01 test set, affecting both the visible and the thermal data. In contrast to the SYSU-MM01-C dataset proposed by Chen et al. in the paper "Benchmarks for Corruption Invariant Person Re-identification", our dataset is used in a multimodal manner and does not consider visible-data corruptions only. The corruptions used are largely the same. Noise: Gaussian, shot, impulse, and speckle; Blur: defocus, frosted glass, motion, zoom, and Gaussian; Weather: snow, frost, fog, brightness, spatter, and rain; Digital: contrast, elastic, pixel, JPEG compression, and saturate. However, the corruptions are adapted to respect the thermal modality encoding, and brightness is not used to corrupt the thermal data. Five severity levels are considered per corruption.
ThermalWORLD-C* is an evaluation set consisting of algorithmically generated corruptions applied to the ThermalWORLD test set, affecting both the visible and the thermal data. In contrast to the corruption approach proposed by Chen et al. in the paper "Benchmarks for Corruption Invariant Person Re-identification", our dataset is used in a multimodal manner and does not consider visible-data corruptions only. The corruptions used are largely the same. Noise: Gaussian, shot, impulse, and speckle; Blur: defocus, frosted glass, motion, zoom, and Gaussian; Weather: snow, frost, fog, brightness, spatter, and rain; Digital: contrast, elastic, pixel, JPEG compression, and saturate. However, the corruptions are adapted to respect the thermal modality encoding, and brightness is not used to corrupt the thermal data. Five severity levels are considered per corruption.
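The corruption scheme shared by these evaluation sets can be sketched as follows. The noise scales, brightness offsets, and the `corrupt_pair` helper below are illustrative assumptions (the actual benchmarks follow the ImageNet-C-style parameterization used by Chen et al.); the sketch only illustrates the two rules specific to these sets: both modalities are corrupted, and brightness never touches the thermal data.

```python
import numpy as np

# Illustrative severity parameters (assumptions, not the benchmarks' exact values).
NOISE_SCALE = [0.04, 0.06, 0.08, 0.09, 0.10]  # Gaussian-noise std for severities 1..5
BRIGHT_OFFSET = [0.1, 0.2, 0.3, 0.4, 0.5]     # brightness shift for severities 1..5

def gaussian_noise(img, severity):
    """Additive Gaussian noise on a uint8 image, severity 1..5."""
    x = img.astype(np.float64) / 255.0
    x = x + np.random.normal(scale=NOISE_SCALE[severity - 1], size=x.shape)
    return (np.clip(x, 0.0, 1.0) * 255).astype(np.uint8)

def brightness(img, severity):
    """Constant brightness shift on a uint8 image, severity 1..5."""
    x = img.astype(np.float64) / 255.0 + BRIGHT_OFFSET[severity - 1]
    return (np.clip(x, 0.0, 1.0) * 255).astype(np.uint8)

CORRUPTIONS = {"gaussian_noise": gaussian_noise, "brightness": brightness}

def corrupt_pair(visible, thermal, name, severity):
    """Corrupt a visible/thermal image pair at a given severity.

    Brightness is applied to the visible image only, mirroring the
    rule that brightness is not used to corrupt the thermal data.
    """
    fn = CORRUPTIONS[name]
    vis_c = fn(visible, severity)
    th_c = thermal if name == "brightness" else fn(thermal, severity)
    return vis_c, th_c
```

In the full benchmarks, each of the 19 corruptions would register a function in `CORRUPTIONS`, with the thermal variants adapted to the single-channel thermal encoding.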
There was no predefined dataset of party symbols to be used as a benchmark. We curated a dataset from various national and regional websites owned by the ECI. The dataset consists of symbols (image files) of 49 National and State registered parties approved by the ECI. For each image of the original party symbol, 18 different distortions and transformations were created as variations to the training data. Each image is of dimension 180 x 180. The final labeled dataset consists of 931 images of party symbols with their corresponding party names as the labels.
We created this robust, custom light field dataset to assist light field researchers in applying SOTA machine learning algorithms to a variety of light field tasks such as depth estimation, synthetic aperture imaging, and more.
Fine-Grained Vehicle Detection (FGVD) is a dataset for fine-grained vehicle detection captured from a moving camera mounted on a car. The FGVD dataset is challenging as it has vehicles in complex traffic scenarios with intra-class and inter-class variations in types, scale, pose, occlusion, and lighting conditions.
The dataset of images is built upon a collection of 454 samples kindly provided by the urology department of the Hospital Universitari de Bellvitge (Barcelona, Spain) over a span of several years. They cover all 9 main classes except cystine, for which only 4 samples were available, so we discarded this class as mentioned above. As for the rest, we tried to keep all second-scheme classes balanced and, at the same time, to record as many examples as possible to account for intraclass variability.
A dataset for image editing containing >450k samples of:
RGB Arabic Alphabet Sign Language (AASL) dataset
UICaption is a dataset of 114k UI images paired with descriptions of their functionality. It is designed for the tasks of UI action entailment, instruction-based UI image retrieval, grounding referring expressions, and UI entity recognition.
ConsInv is a stereo RGB + IMU dataset designed for Dynamic SLAM testing and contains two subsets:
Analogical reasoning is fundamental to human cognition and holds an important place in various fields. However, previous studies mainly focus on single-modal analogical reasoning and do not take advantage of structured knowledge. We introduce the new task of multimodal analogical reasoning over knowledge graphs, which requires multimodal reasoning ability with the help of background knowledge. Our dataset MARS contains 10,685 training, 1,228 validation, and 1,415 test instances.
The 'Me 163' was a Second World War fighter aircraft that resulted from secret developments of the German air force. One of these airplanes is currently owned and displayed in the historic aircraft exhibition of the 'Deutsches Museum' in Munich, Germany. To gain insights into its history, design, and state of preservation, a complete CT scan was obtained using an industrial XXL computed tomography scanner at Fraunhofer EZRT.
The dataset is for research on label distribution shift between multiple domains in domain adaptation. We use Cl, Pr, and Rw to resample two reverse long-tailed distributions and one Gaussian distribution for each of them for BTDA with label shift.
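As a sketch of the resampling step above, the snippet below subsamples a labeled set to an exponential long-tailed profile (and its reverse). The exponential profile and the imbalance factor of 0.1 are assumptions borrowed from the common long-tailed-recognition setup, not necessarily this dataset's exact protocol; a Gaussian profile can be plugged in by swapping the size function.

```python
import numpy as np

def longtail_class_sizes(num_classes, max_per_class, imb_factor=0.1, reverse=False):
    """Per-class sample counts following an exponential long-tailed profile.

    imb_factor is the ratio of the rarest class size to the largest
    (0.1 is an assumed value, standard in long-tailed benchmarks).
    """
    sizes = [int(max_per_class * imb_factor ** (i / (num_classes - 1)))
             for i in range(num_classes)]
    return sizes[::-1] if reverse else sizes

def resample_longtail(labels, num_classes, max_per_class, reverse=False, rng=None):
    """Return indices that subsample `labels` to a (reverse) long-tailed profile."""
    if rng is None:
        rng = np.random.default_rng(0)
    sizes = longtail_class_sizes(num_classes, max_per_class, reverse=reverse)
    idx = []
    for c, n in enumerate(sizes):
        pool = np.flatnonzero(labels == c)  # all samples of class c
        idx.extend(rng.choice(pool, size=min(n, len(pool)), replace=False))
    return np.array(idx)
```

For example, with 5 classes of 100 samples each, the forward profile keeps [100, 56, 31, 17, 10] samples per class and the reverse profile keeps [10, 17, 31, 56, 100].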
This dataset contains a collection of 131 X-ray CT scans of pieces of modeling clay (Play-Doh) with various numbers of stones inserted, retrieved in the FleX-ray lab at CWI. The dataset consists of 5 parts. It is intended as raw supplementary material to reproduce the CT reconstructions and subsequent results in the paper titled "A tomographic workflow enabling deep learning for X-ray based foreign object detection". The dataset can be used to set up other CT-based experiments concerning similar objects with variations in shape and composition.
This dataset contains a collection of 235,800 X-ray projections of 131 pieces of modeling clay (Play-Doh) with various numbers of stones inserted. The dataset is intended as an extensive and easy-to-use training dataset for supervised machine-learning-driven object detection. The ground truth locations of the stones are included.
The Robot Tracking Benchmark (RTB) is a synthetic dataset that facilitates the quantitative evaluation of 3D tracking algorithms for multi-body objects. It was created using the procedural rendering pipeline BlenderProc. The dataset contains photo-realistic sequences with HDRi lighting and physically-based materials. Perfect ground truth annotations for camera and robot trajectories are provided in the BOP format. Many physical effects, such as motion blur, rolling shutter, and camera shaking, are accurately modeled to reflect real-world conditions. For each frame, four depth qualities exist to simulate sensors with different characteristics. While the first quality provides perfect ground truth, the second considers measurements with the distance-dependent noise characteristics of the Azure Kinect time-of-flight sensor. Finally, for the third and fourth quality, two stereo RGB images with and without a pattern from a simulated dot projector were rendered. Depth images were then reconstructed from these stereo images.
A simulated dataset built in Unreal Engine 4 with AirSim, designed for visual point cloud change detection. It includes ground-truth point clouds before and after the changes. In addition, four trajectories with stereo camera and IMU data are recorded for the change detection task.
This dataset was acquired in a retrospective study from a cohort of pediatric patients admitted with abdominal pain to Children’s Hospital St. Hedwig in Regensburg, Germany. Multiple abdominal B-mode ultrasound images were acquired for most patients, with the number of views varying from 1 to 15. The images depict various regions of interest, such as the abdomen’s right lower quadrant, appendix, intestines, lymph nodes, and reproductive organs. Alongside multiple US images for each subject, the dataset includes information encompassing laboratory tests, physical examination results, clinical scores, such as the Alvarado and pediatric appendicitis scores, and expert-produced ultrasonographic findings. Lastly, the subjects were labeled with respect to three target variables: diagnosis (appendicitis vs. no appendicitis), management (surgical vs. conservative), and severity (complicated vs. uncomplicated or no appendicitis). The study was approved by the Ethics Committee of the University of Regensburg (