The Drag100 dataset is introduced in the paper "GoodDrag: Towards Good Practices for Drag Editing with Diffusion Models"¹. This dataset is a new contribution to the benchmarking of drag editing¹.
The current industrial pipeline includes 315 dynamic industrial scenarios, which can be categorized into three types: QR codes, text, and products. To enhance diversity, we use films with varying material properties, coverage areas, thicknesses, and levels of wrinkling, so the film exhibits significant variability across scenarios. On the other hand, to ensure the stability of the industrial imaging pipeline, we maintained a consistent intensity for the industrial light source and fixed the distance between the camera and the object flow, minimizing the influence of errors external to the industrial system.
We introduce the IDRCell100K image dataset, a collection of biological images purposefully curated from the extensive and varied Image Data Resource platform. Guided by the metadata provided with these experiments, our selection covers a wide range of microscopy techniques, ensuring the dataset's breadth in representing biological imaging modalities. We made efforts to minimize experimental and imaging biases, striving for as balanced a representation as feasible so that the dataset does not depend heavily on any single imaging modality or experiment.
In this dataset, various objects are arranged on a white table. A UR5e robot picks and places a target object specified in the title of the video/image sequence. Videos under the auto- folder were collected with automatic operation of the robot; videos under the human- folders were collected with tele-operation of the robot. Ground-truth tracking bounding boxes are generated with STARK, and when the target exits the camera frame, the bounding box estimate is set to [-1, -1, -1, -1], indicating that the target is not visible.
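The [-1, -1, -1, -1] sentinel can be filtered out when iterating over a track; a minimal sketch, assuming a simple list-of-boxes layout (the layout and names here are illustrative, not documented by the dataset):

```python
# Minimal sketch for handling the [-1, -1, -1, -1] "target not shown"
# sentinel when iterating over per-frame bounding boxes.
# The list-of-lists track layout below is an assumption for illustration.

SENTINEL = [-1, -1, -1, -1]

def visible_boxes(track):
    """Yield (frame_index, box) pairs, skipping frames where the
    target left the camera frame (marked by the sentinel box)."""
    for i, box in enumerate(track):
        if box == SENTINEL:
            continue  # target not visible in this frame
        yield i, box

# Hypothetical 4-frame track: the target leaves the view in frame 2.
track = [[10, 20, 50, 60], [12, 22, 50, 60],
         [-1, -1, -1, -1], [15, 25, 50, 60]]
print(list(visible_boxes(track)))
```

Skipping (rather than interpolating) the sentinel frames keeps the evaluation honest about when the target was genuinely out of view.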
We present the SJTU Multispectral Object Detection (SMOD) dataset for detection. The dataset contains 8,676 infrared-visible image pairs, in which 8,042 pedestrians, 10,478 riders, 6,501 bicycles, and 6,422 cars are annotated. The degree of occlusion of every object is meticulously annotated. Owing to its low sampling rate, the dataset has dense rider and pedestrian objects, and its 3,298 pairs of night-scene images contain rich illumination variations.
An image sequence dataset of growing snowflakes in HDF5 format. Generated by the Gravner-Griffeath LCA model for snow crystal growth. Useful for modeling crystal growth with neural networks.
PatternCom is a composed image retrieval benchmark built on PatternNet, a large-scale high-resolution remote sensing image retrieval dataset with 38 classes and 800 images of size 256×256 pixels per class. In PatternCom, selected classes are depicted in query images, each combined with a query text that specifies an attribute relevant to that class. For instance, query images of “swimming pools” are combined with text queries defining “shape” as “rectangular”, “oval”, or “kidney-shaped”. In total, PatternCom includes six attributes, each covering up to four different classes, with two to five values per class. The number of positives per query ranges from 2 to 1,345, and there are more than 21k queries in total.
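A composed query of this kind pairs a query image's class with an attribute-value text; a minimal sketch of how positives might be determined for such a query (all class, attribute, and value names below are illustrative examples, not PatternCom's actual annotation files):

```python
# Illustrative composed-retrieval query: (query class, attribute, value).
# A gallery item is a positive for a query iff it matches BOTH the class
# depicted in the query image and the attribute value given in the text.
# All names and data here are hypothetical, not PatternCom files.

def positives(query, gallery):
    cls, attr, value = query
    return [name for name, item in gallery.items()
            if item["class"] == cls and item.get(attr) == value]

gallery = {
    "img_001": {"class": "swimming_pool", "shape": "rectangular"},
    "img_002": {"class": "swimming_pool", "shape": "kidney-shaped"},
    "img_003": {"class": "tennis_court", "shape": "rectangular"},
}

# Only img_001 matches both the class and the requested shape.
print(positives(("swimming_pool", "shape", "rectangular"), gallery))
```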
The FashionFail dataset comprises 2,495 high-resolution images (2400×2400 pixels) of products found on e-commerce websites. It is designed to address the limitations of existing state-of-the-art fashion parsing models. FashionFail consists of a diverse set of online shopping images with categories that are compatible with the established Fashionpedia dataset. The dataset is divided into training, validation, and test sets of 1,344, 150, and 1,001 images, respectively.
This dataset was gathered during the Vid2RealHRI study of humans' perception of robot intelligence in the context of an incidental human-robot encounter. It contains participants' questionnaire responses to four video study conditions: Baseline, Verbal, Body language, and Body language + Verbal. The videos depict a scenario where a pedestrian incidentally encounters a quadruped robot trying to enter a building; across conditions, the robot uses verbal commands or body language to ask the pedestrian for help. The differences between conditions were manipulated through the robot's verbal and expressive-movement functionalities.
fruit-SALAD is a synthetic image dataset of 10,000 generated fruit depictions. This combined semantic-category and style benchmark comprises 100 instances for each combination of 10 easily recognizable fruit categories and 10 easily distinguishable styles.
As a first step towards building models that can recognise immune cells in WSIs, we introduce Immunocto, a massive high-resolution (40× magnification) database of 2,310,257 immune cells distributed across four subtypes (CD4+ T cells, CD8+ T cells, B cells, and macrophages). To our knowledge, Immunocto is the largest available dataset of immune cells extracted from H&E WSIs, by an order of magnitude. All models trained on this database can be tried at www.octopath.ai
The Low Altitude Disaster Imagery (LADI) dataset was created to address the relative lack of annotated post-disaster aerial imagery in the computer vision community. Low-altitude post-disaster aerial imagery from small planes and UAVs provides high-resolution views that help emergency management agencies prioritize response efforts and perform damage assessments. To accelerate their workflow, computer vision can be used to automatically identify images that contain features of interest, including infrastructure such as buildings and roads, damage to that infrastructure, and hazards such as floods or debris.
The TCB-DS dataset is a specialized collection of microscopic images focusing on the automatic recognition of cyanobacteria genera. This dataset was meticulously compiled to address the challenges associated with the varying image qualities due to differences in contrast, resolution, size, lighting, and the presence of noise in the original images. It includes 2,591 images with varying dimensions, ranging from a minimum of 11 × 41 pixels to a maximum of 5184 × 3456 pixels.
We introduce a set of 425 panoramic X-rays with human-annotated bounding boxes and polygons; the 425 images are a subset of the UFBA-UESC Dental Dataset. This dataset can be used extensively for detection and segmentation tasks on dental panoramic X-rays. Refer to the Description for the organisation of the annotations and panoramic X-rays. The panoramic X-rays are distributed across ten different categories.
Abstract: Digitization strengthens the maintenance and promotion of cultural heritage. Against this background, this study presents a new Indonesian cultural events dataset and automatic image classification for cultural events. The dataset was built from the Flickr image platform, collecting images of five cultural events: the Baliem Festival, Jember Fashion Festival, Nyepi Festival, Pacu Jawi, and Pasola Festival. Convolutional Neural Networks (CNNs) were developed for classification, and CNN models (VGG16 and VGG19) were compared under several optimization configurations to find the best model. The results show that VGG16 with image augmentation and dropout regularization performed best, with 94.66% accuracy. We hope this study supports the digital documentation process and helps preserve Indonesia's cultural heritage.
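Dropout regularization, one of the techniques credited for the best result above, can be sketched in a few lines of NumPy (inverted dropout; the rate and inputs are illustrative, and this is not the study's actual training code):

```python
import numpy as np

def dropout(x, rate, training=True, rng=None):
    """Inverted dropout: during training, zero each activation with
    probability `rate` and rescale survivors by 1/(1-rate) so the
    expected activation is unchanged; at inference, pass through."""
    if not training or rate == 0.0:
        return x
    rng = rng or np.random.default_rng(0)  # fixed seed for reproducibility
    mask = rng.random(x.shape) >= rate
    return x * mask / (1.0 - rate)

x = np.ones((4, 4))
y = dropout(x, rate=0.5)
# With rate=0.5, surviving entries become 2.0 and dropped entries 0.0.
print(sorted(set(y.ravel())))
```

The rescaling by 1/(1-rate) is what lets the same network run at inference time without any dropout mask, which is how frameworks such as Keras implement the layer.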
The ULS23 training dataset contains 38,693 diverse lesions from chest-abdomen-pelvis CT examinations. For the challenge, we introduced two novel 3D annotated datasets targeting lesions in the pancreas and bones, which are traditionally challenging to segment. Additionally, we aggregate 10 publicly available datasets with a lesion segmentation component into a single, easily accessible data repository.
This Quantum Dots Stability Diagrams (QDSD) Dataset aggregates experimental stability diagrams of quantum dots from different research groups.
RoomSpace: a new benchmark designed to evaluate language models on spatial reasoning tasks demanding spatial relation knowledge and multi-hop reasoning. RoomSpace encompasses a comprehensive range of qualitative spatial relationships, including topological, directional, and distance relations. These relationships are presented from various viewpoints, with differing levels of granularity and density of relational constraints to simulate real-world complexities. This approach promotes a more accurate assessment of language models' capabilities in spatial reasoning tasks.