3,275 machine learning datasets
The dataset contains aerial images covering three commonly occurring natural disasters (earthquake/collapsed buildings, flood, and wildfire/fire), plus a normal class that does not depict any disaster. It consists of 167,723 aerial images divided into 4 classes. The dataset is an extension of the AIDER dataset (Aerial Image Dataset for Emergency Response Applications).
Wood plate bark removal is critical for ensuring the quality of wood processing and its products. To address the lack of datasets for applying deep learning methods in this field, and to fill the research gap in deep-learning-based wood plate bark removal equipment, this study proposes a benchmark for wood plate segmentation in bark removal processing.
LSASRD is a well-labeled, challenging dataset built to facilitate research on style recognition in anime images. It collects images from 190 anime and cartoon works spanning 93 years and 13 countries and regions, taking both 2D and 3D works into consideration, with at most ten characters chosen per work. All images were obtained from the Internet, mainly from existing anime and cartoons; some come from comics or games of the same anime series. Unlike illustration or video datasets, LSASRD provides a moderate amount of contextual information in a wide variety of styles, and thus requires context-understanding ability from image models.
The TimberVision dataset consists of more than 2k annotated RGB images containing a total of 51k trunk components, including cut and lateral surfaces, surpassing any existing dataset in this domain in both quantity and detail by a large margin. The dataset can be used to train models for oriented object detection and instance segmentation, and to evaluate the influence of multiple scene parameters on model performance. Additionally, a generic framework is provided that fuses the components detected by the models for both tasks into unified trunk representations. Geometric properties are derived automatically, and multi-object tracking is applied to further enhance robustness.
Rethinking Abdominal Organ Segmentation (RAOS) in the clinical scenario: A robustness evaluation benchmark with challenging cases.
This is the OSN-transmitted CelebA sampling dataset from the paper “DF-RAP: A Robust Adversarial Perturbation for Defending against Deepfakes in Real-world Social Network Scenarios”, collected by manual upload and download. The dataset includes 30,000 facial images of size 256×256 transmitted through online social networks (OSNs), together with their corresponding original images. Facebook, Twitter, WeChat, and Weibo were selected as the transmission OSNs, with 7,500 images each.
A dataset for testing the ability of vision-language models (VLMs) to recognize and match 3D objects that share the exact same 3D shape but differ in orientation, materials, textures, environments, and lighting conditions.
M²ConceptBase is a concept-centric multimodal knowledge base designed to bridge the gap between visual and linguistic semantics. It features 951K images and 152K concepts, with each concept linked to an average of 6.27 images and a detailed textual description.
The Construction Industry Steel Ordering Lists (CISOL) dataset comprises table-centric, real-world documents from the construction industry, annotated to facilitate the testing and training of deep learning models for table detection (TD) and table structure recognition (TSR).
MP-IDB comprises four species of malaria parasites: Falciparum, Malariae, Ovale, and Vivax. For each species, there are four distinct life stages, which are encoded in the filenames.
In this project, we tried to make malaria detection possible at a low cost. We present the M5-malaria Dataset, the first-ever dataset that spans multiple microscopes and multiple magnifications. Malaria, a fatal but curable disease, claims hundreds of thousands of lives every year. Early and correct diagnosis is vital to avoid health complications; however, it depends on the availability of costly microscopes and trained experts to analyze blood-smear slides. Deep-learning-based methods have the potential not only to decrease the burden on experts but also to improve diagnostic accuracy on low-cost microscopes. However, this is hampered by the absence of a reasonably sized dataset; one of the most challenging aspects is the reluctance of experts to annotate images captured at low magnification on low-cost microscopes. We present a dataset to further research on malaria microscopy with low-cost microscopes at low magnification. Our large-scale dataset consists of images of blood smears.
Late third instar wing imaginal discs were cultured in Shields and Sang M3 media (Sigma) supplemented with 2% FBS (Sigma), 1% pen/strep (Gibco), 3 ng/ml ecdysone (Sigma), and 2 ng/ml insulin (Sigma). Wing discs were cultured in 35 mm fluorodishes (WPI) under 12 mm filters (Millicell), as described in https://doi.org/10.1038/s41567-019-0618-1
The ENSeg dataset is an enhanced subset of the ENS dataset. The ENS dataset comprises image samples extracted from the enteric nervous system (ENS) of male adult Wistar rats (Rattus norvegicus, albinus variety), specifically from the jejunum, the second segment of the small intestine.
The Liver-US dataset is a comprehensive collection of high-quality ultrasound images of the liver, including both normal and abnormal cases. The dataset is designed to facilitate research in medical image classification, with a focus on liver-related conditions. It includes a diverse range of ultrasound images acquired from multiple clinical settings, providing a robust foundation for developing and validating machine learning models in medical image analysis.
The SimBEV dataset is a collection of 320 scenes spread across all 11 CARLA maps and contains data from a variety of sensors, including five camera types (RGB, semantic segmentation, instance segmentation, depth, and optical flow), lidar, semantic lidar, radar, GNSS, and IMU, along with 3D object bounding boxes and accurate bird's-eye view (BEV) ground truth. With each scene lasting 16 seconds at a frame rate of 20 Hz, the SimBEV dataset contains 102,400 annotated frames, over 8 million 3D object bounding boxes, and more than 2.5 billion BEV ground truth labels.
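The reported annotated-frame count follows directly from the stated scene count, scene duration, and frame rate; a minimal sanity check (variable names are illustrative):

```python
# Verify SimBEV's reported frame count from the figures stated above.
scenes = 320          # scenes across the 11 CARLA maps
duration_s = 16       # seconds per scene
frame_rate_hz = 20    # frames per second

frames = scenes * duration_s * frame_rate_hz
print(frames)  # 102400, matching the reported number of annotated frames
```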
TUMTraffic-VideoQA is a novel dataset designed for spatiotemporal video understanding in complex roadside traffic scenarios. The dataset comprises 1,000 videos, featuring 85,000 multiple-choice QA pairs, 2,300 object captioning annotations, and 5,700 object grounding annotations, encompassing diverse real-world conditions such as adverse weather and traffic anomalies. By incorporating tuple-based spatiotemporal object expressions, TUMTraffic-VideoQA unifies three essential tasks (multiple-choice video question answering, referred object captioning, and spatiotemporal object grounding) within a cohesive evaluation framework.
We introduce TextAtlas5M, a dataset specifically designed for training and evaluating multimodal generation models on dense-text image generation.
Physical concept understanding benchmark.