3,275 machine learning datasets
The Turkish Scene Text Recognition (TS-TR) dataset was primarily developed to fill the gap in non-English text recognition resources, specifically addressing the unique challenges presented by the Turkish language, such as special characters and diacritics. This dataset mirrors real-world conditions with texts displayed in various fonts, sizes, orientations, and complex backgrounds from multiple urban and rural environments. Such diversity ensures the training of models that can generalize across different scenarios, including varying lighting conditions and complex visual layouts.
A real-world dataset for multi-image super-resolution that matches low-resolution Sentinel-2 images with high-resolution WorldView-2 images. The dataset was introduced in: Kowaleczko, P., Tarasiewicz, T., Ziaja, M., Kostrzewa, D., Nalepa, J., Rokita, P., & Kawulok, M. (2023). A real-world benchmark for Sentinel-2 multi-image super-resolution. Scientific Data, 10(1), 644.
The dataset contains railway track images of two types: normal and defective.
Sakha-TB is a de-identified image dataset of frontal chest X-rays (CXR), collected through collaboration with several medical institutions in the Republic of Sakha (Yakutia, Russia). The set contains 400 normal X-rays and 400 X-rays with manifestations of pulmonary tuberculosis, balanced to some extent by age and gender, in 16-bit and 8-bit lossless PNG format, converted directly from DICOM files without any changes.
The dataset contains annotated photographs of a pear orchard for object detection with YOLO architectures. It contains a single class, "Pear fruits". The dataset is distributed under the CC BY 4.0 license.
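As a rough illustration of how single-class YOLO annotations like these are usually consumed, the sketch below converts one label line into a pixel-space box. It assumes the common plain-text YOLO convention (`class x_center y_center width height`, all coordinates normalized to 0..1); the dataset description does not confirm the exact file layout, and the function name `yolo_to_pixel_box` is hypothetical.

```python
def yolo_to_pixel_box(line: str, img_w: int, img_h: int):
    """Convert one YOLO label line to (class_id, (x_min, y_min, x_max, y_max)).

    Assumption: the line follows the common normalized convention
    "class x_center y_center width height"; this is not confirmed
    by the dataset description.
    """
    cls, xc, yc, w, h = line.split()
    # Scale normalized values back to pixels.
    xc, w = float(xc) * img_w, float(w) * img_w
    yc, h = float(yc) * img_h, float(h) * img_h
    # YOLO stores the box centre; shift by half the size to get corners.
    return int(cls), (xc - w / 2, yc - h / 2, xc + w / 2, yc + h / 2)


if __name__ == "__main__":
    # Class 0 would be the only class in a single-class dataset.
    print(yolo_to_pixel_box("0 0.5 0.5 0.25 0.5", 640, 640))
```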
The dataset contains annotated photographs of pear fruitlets for object detection with YOLO architectures. It has a single class, "Pear fruitlet". The digital images of pear fruitlets were collected at the experimental site of the Institute of Horticulture (LatHort) (Krimūnu parish, Dobeles district, Latvia: 56.610169, 23.305956) from the cultivars ‘Suvenirs’, ‘Vasarine Sviestine’ and ‘Mramornaya’ on seedling rootstocks ‘Kazraushu’, planted at 4×5 m spacing (500 trees per 1 ha). The fruitlet images of ‘Suvenirs’, ‘Vasarine Sviestine’ and ‘Mramornaya’ were collected at the beginning of August (79 days after full bloom) using the camera of a Huawei P 40 mobile device: 50 MP Ultra Vision Camera (Wide Angle, f/1.9 aperture) + 16 MP Ultra-Wide Angle Camera (f/2.2 aperture) + 8 MP Telephoto Camera (f/2.4 aperture, OIS); image size: 3000×4000 px; 5.0 MP. The collection of images was carried out in field conditions in 2022, in t
Photographs of cherry fruitlets were taken in the LatHort orchard in Dobele at the fruit development stage (BBCH stage 72). The BBCH scale describes the phenological development of grapes: 7 - development of fruit; 72 - fruit size up to 20 mm. Two images were taken of each tree: one perpendicular, tree-facing view and one oblique view. The images were annotated with the makesense.ai tool. The annotated 3008x2000 images were then automatically cropped into 640x640 tiles with 30% overlap and validated manually. The images were saved in YOLO format.
Photographs of cherry fruits were taken in the LatHort orchard in Dobele at the beginning of fruit coloration (BBCH stage 81). The BBCH scale for grapes describes the phenological development of grapes: 8 - ripening of berries; 81 - beginning of ripening: berries begin to develop variety-specific colour. Two images were taken of each tree: one perpendicular, tree-facing view and one oblique view. The images were annotated with the makesense.ai tool. The annotated 6016x4000 images were then automatically cropped into 640x640 tiles with 30% overlap and validated manually. The images were saved in YOLO format.
Photographs of apple fruitlets were taken in the LatHort orchard in Dobele at the fruit development stage (BBCH stage 76-78). The BBCH scale describes the phenological development of grapes: 7 - development of fruit; 76 - fruit about 60% final size; 78 - fruit about 80% final size. Two images were taken of each tree: one perpendicular, tree-facing view and one oblique view. The images were annotated with the makesense.ai tool. The annotated 3008x2000 images were then automatically cropped into 640x640 tiles with 30% overlap and validated manually. The images were saved in YOLO format.
Photographs of apple fruits were taken in the LatHort orchard in Dobele at the maturity of fruit and seed (BBCH stage 81-85). The BBCH scale describes the phenological development of grapes: 8 - maturity of fruit and seed; 81 - beginning of ripening; 85 - advanced ripening. Two images were taken of each tree: one perpendicular, tree-facing view and one oblique view. The images were annotated with the makesense.ai tool. The annotated 3008x2000 images were then automatically cropped into 640x640 tiles with 30% overlap and validated manually. The images were saved in YOLO format.
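The cropping step described for the LatHort cherry and apple datasets (640x640 tiles with 30% overlap) can be sketched as a sliding window. This is only an illustration under stated assumptions: the authors' actual cropping tool, stride rounding, and edge handling are not described, so the helper names `tile_origins` and `tile_boxes` and the choice to shift the last tile back to the image border are hypothetical.

```python
def tile_origins(length: int, tile: int = 640, overlap: float = 0.30) -> list[int]:
    """Start offsets of tiles along one image axis."""
    if length <= tile:
        return [0]
    # Neighbouring tiles share `overlap` of their extent: 448 px stride here.
    stride = round(tile * (1 - overlap))
    origins = list(range(0, length - tile + 1, stride))
    # Assumption: add a final tile flush with the border so no pixels are lost.
    if origins[-1] + tile < length:
        origins.append(length - tile)
    return origins


def tile_boxes(width: int, height: int, tile: int = 640, overlap: float = 0.30):
    """All (x_min, y_min, x_max, y_max) crop windows for one image."""
    return [
        (x, y, x + tile, y + tile)
        for y in tile_origins(height, tile, overlap)
        for x in tile_origins(width, tile, overlap)
    ]


if __name__ == "__main__":
    boxes = tile_boxes(3008, 2000)
    print(len(boxes), boxes[0], boxes[-1])
```

With this scheme, bounding boxes that straddle a tile boundary still have to be clipped or dropped per tile, which is presumably one reason the tiles were validated manually.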
test
Towards automated analysis of large environments, hyperspectral sensors must be adapted into a format where they can be operated from mobile robots. In this dataset, we highlight hyperspectral datacubes collected from the Hyper-Drive imaging system. Our system collects and registers datacubes spanning the visible to shortwave infrared (660-1700 nm) in 33 wavelength channels. The system also simultaneously captures the ambient solar spectrum reflected off a white reference tile. The dataset consists of 500 labeled datacubes from on-road and off-road terrain compliant with the ATLAS.
The Fields2Benchmark dataset is a collection of 350 agricultural fields in vector format, manually selected to test agricultural coverage path planning algorithms.
1. Candlestick Charts
Candlestick charts are a type of financial chart used to represent the price movement of an asset (e.g., stocks, cryptocurrencies) over time. Each "candlestick" consists of:
- Body: Represents the opening and closing prices.
- Wicks (or Shadows): Represent the highest and lowest prices during the time period.
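To make the body and wick definitions concrete, here is a minimal sketch that derives them from one period's open, high, low, and close prices. The `Candle` class and its field names are hypothetical and are not taken from the dataset itself.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Candle:
    """One time period of price data (hypothetical container)."""
    open_price: float
    high: float
    low: float
    close_price: float

    @property
    def body(self) -> tuple[float, float]:
        # The body spans the opening and closing prices.
        return (min(self.open_price, self.close_price),
                max(self.open_price, self.close_price))

    @property
    def upper_wick(self) -> float:
        # Distance from the top of the body to the period high.
        return self.high - self.body[1]

    @property
    def lower_wick(self) -> float:
        # Distance from the period low to the bottom of the body.
        return self.body[0] - self.low

    @property
    def is_bullish(self) -> bool:
        # Price closed above where it opened.
        return self.close_price > self.open_price


if __name__ == "__main__":
    c = Candle(open_price=10.0, high=12.0, low=9.0, close_price=11.0)
    print(c.body, c.upper_wick, c.lower_wick, c.is_bullish)
```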
IQ testing has served as a foundational methodology for evaluating human cognitive capabilities, deliberately decoupling assessment from linguistic background, language proficiency, or domain-specific knowledge to isolate core competencies in abstraction and reasoning. Yet, artificial intelligence research currently lacks systematic benchmarks to quantify these critical cognitive dimensions in multimodal systems. To address this critical gap, we propose MM-IQ, a comprehensive evaluation framework comprising 2,710 meticulously curated test items spanning 8 distinct reasoning paradigms.
One of the most important aspects of robot scene understanding is semantic segmentation of external environments. Urban environment semantic segmentation has been extensively investigated by researchers and many real-world and synthetic datasets have been utilised to develop highly accurate segmentation results. However, the number of off-road datasets available for robot navigation research remains limited. To address this, we introduce a novel framework [1] to generate varied photorealistic synthetic off-road datasets capable of supporting multiple sensor modalities.
As UAV technology matures, it can provide powerful support for smart agriculture and precise monitoring. Currently, there is no dataset of green walnuts in the field of agricultural computer vision. To promote algorithm design in this field, we used a UAV to collect remote sensing data from 8 walnut sample plots. Because green walnuts are affected by varying lighting conditions and are often occluded, we constructed WalnutData, a large-scale dataset with fine-grained target categories. The dataset contains a total of 30,240 images and 7,062,080 instances in 4 target categories: illuminated from the front and not occluded (A1), backlit and not occluded (A2), illuminated from the front and occluded (B1), and backlit and occluded (B2).
opendateset
RF100-VL is a multi-domain benchmark for object detection. The benchmark is designed to measure the extent to which model architectures can generalise to different domains, from medical imagery to defect detection to document feature identification. RF100-VL was introduced by researchers from Roboflow and Carnegie Mellon University in the paper "Roboflow100-VL: A Multi-Domain Object Detection Benchmark for Vision-Language Models".