19,997 machine learning datasets
This dataset comprises stellarator configurations used to train the model over multiple iterations. Within the 1_dataset folder, you'll find the initial dataset, which was constructed using the Near-Axis Expansion method, leveraging the pyQSC package. Subsequent files in this dataset were generated following the methodology outlined in the accompanying research paper.
The Kvasir-VQA dataset is an extended dataset derived from the HyperKvasir and Kvasir-Instrument datasets, augmented with question-and-answer annotations. This dataset is designed to facilitate advanced machine learning tasks in gastrointestinal (GI) diagnostics, including image captioning, Visual Question Answering (VQA) and text-based generation of synthetic medical images.
A benchmark designed to evaluate the proficiency of multimodal large language models (MLLMs) in understanding inter-object relationships and textual content.
This dataset includes reviews (ratings, text, helpfulness votes), product metadata (descriptions, category information, price, brand, and image features), and links (also viewed/also bought graphs).
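Review corpora of this kind are commonly distributed as JSON lines, one record per line. A minimal sketch of parsing such records with the stdlib, assuming field names (`asin`, `overall`, `reviewText`, `helpful`) that follow the common review layout but should be verified against the actual files:

```python
# Hedged sketch: parse JSON-lines review records and compute an average rating.
# Field names here are assumptions, not confirmed from the dataset's schema.
import json

raw = "\n".join([
    '{"asin": "B000001", "overall": 5.0, "reviewText": "Great product", "helpful": [3, 4]}',
    '{"asin": "B000002", "overall": 2.0, "reviewText": "Broke quickly", "helpful": [0, 1]}',
])

reviews = [json.loads(line) for line in raw.splitlines()]
avg_rating = sum(r["overall"] for r in reviews) / len(reviews)
print(avg_rating)  # 3.5
```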
FindingEmo is an image dataset containing annotations for 25k images, specifically tailored to Emotion Recognition. In contrast to existing datasets, it focuses on complex scenes depicting multiple people in various naturalistic, social settings, with images being annotated as a whole, thereby going beyond the traditional focus on faces or single individuals. Annotated dimensions include Valence, Arousal and Emotion label, with annotations gathered using Prolific. Together with the annotations, we release the list of URLs pointing to the original images, as well as all associated source code.
Multilingual automatic speech recognition (ASR) in the medical domain serves as a foundational task for various downstream applications such as speech translation, spoken language understanding, and voice-activated assistants. This technology enhances patient care by enabling efficient communication across language barriers, alleviating specialized workforce shortages, and facilitating improved diagnosis and treatment, particularly during pandemics. In this work, we introduce MultiMed, the first multilingual medical ASR dataset, along with the first collection of small-to-large end-to-end medical ASR models, spanning five languages: Vietnamese, English, German, French, and Mandarin Chinese. To the best of our knowledge, MultiMed stands as the world's largest medical ASR dataset across all major benchmarks: total duration, number of recording conditions, number of accents, and number of speaking roles. Furthermore, we present the first multilinguality study for medical ASR, which includes repr
Test-driven benchmark that challenges LLMs to write JavaScript React applications.
EC-FUNSD is introduced in [arXiv:2402.02379] as a benchmark of semantic entity recognition (SER) and entity linking (EL), designed for the entity-centric robustness evaluation of pre-trained text-and-layout models (PTLMs).
ROOR is a reading order prediction (ROP) benchmark that annotates layout reading order as ordering relations.
VoxSim is a perceptual voice similarity dataset created to develop perceptual speaker similarity evaluation systems. This dataset contains 69,409 scores for 41,578 pairs of utterances.
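Since VoxSim provides 69,409 scores for 41,578 pairs, some utterance pairs carry multiple raters' scores, and a common first step is averaging scores per pair. A minimal stdlib sketch, with a hypothetical triple layout (`utterance_a`, `utterance_b`, `score`) that is illustrative rather than the dataset's actual file format:

```python
# Illustrative sketch: aggregate multiple perceptual similarity scores per
# utterance pair. The record layout is an assumption, not VoxSim's schema.
from collections import defaultdict
from statistics import mean

# (utterance_a, utterance_b, score); several raters may score the same pair
raw_scores = [
    ("spk1_utt1", "spk2_utt4", 4.0),
    ("spk1_utt1", "spk2_utt4", 5.0),
    ("spk3_utt2", "spk4_utt1", 2.0),
]

per_pair = defaultdict(list)
for a, b, score in raw_scores:
    per_pair[(a, b)].append(score)

mean_scores = {pair: mean(vals) for pair, vals in per_pair.items()}
print(mean_scores[("spk1_utt1", "spk2_utt4")])  # 4.5
```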
Distributional MIPLIB is a dataset of Mixed Integer Linear Programming (MILP) instances designed to advance research on learning to optimize (https://www.arxiv.org/abs/2406.06954). It is a curated dataset of MILP distributions from 13 domains, classified into different hardness levels. The links for downloading the distributions are provided on the webpage of each domain.
A large-scale dataset of scientific records from PubMed for scientific keyphrase generation. The dataset comes in three sizes: 500k, 2 million, and 5.6 million documents.
OAM-TCD is a dataset of around 5k aerial images from around the world to support robust tree detection algorithms. Full details can be found on the linked HuggingFace repository.
The human brain receives nutrients and oxygen through an intricate network of blood vessels. Pathology affecting small vessels, at the mesoscopic scale, represents a critical vulnerability within the cerebral blood supply and can lead to severe conditions, such as Cerebral Small Vessel Diseases. The advent of 7 Tesla MRI systems has enabled the acquisition of higher spatial resolution images, making it possible to visualise such vessels in the brain. However, the lack of publicly available annotated datasets has impeded the development of robust, machine learning-driven segmentation algorithms. To address this, the SMILE-UHURA challenge was organised. This challenge, held in conjunction with ISBI 2023 in Cartagena de Indias, Colombia, aimed to provide a platform for researchers working on related topics. The SMILE-UHURA challenge addresses the gap in publicly available annotated datasets by providing an annotated dataset of Time-of-Flight angiography acquired with 7T MRI. This dat
Data variables and description:

| Parameters/Features | Abbreviation | Type | Measurement |
|---|---|---|---|
| Date | Date | year-month-day | – |
| Rented Bike count | Count | Continuous | 0, 1, 2, …, 3556 |
| Hour | Hour | Continuous | 0, 1, 2, …, 23 |
| Temperature | Temp | Continuous | °C |
| Humidity | Hum | Continuous | % |
| Windspeed | Wind | | |
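A hedged sketch of reading records with these variables using the stdlib `csv` module; the header names below are taken from the description above, but the actual file's exact headers and delimiter should be verified:

```python
# Hedged sketch: parse bike-rental records with csv.DictReader.
# Header spellings are assumptions based on the variable table above.
import csv
import io

sample = io.StringIO(
    "Date,Rented Bike Count,Hour,Temperature,Humidity,Wind speed\n"
    "2017-12-01,254,0,-5.2,37,2.2\n"
    "2017-12-01,204,1,-5.5,38,0.8\n"
)

rows = list(csv.DictReader(sample))
total_rentals = sum(int(r["Rented Bike Count"]) for r in rows)
print(total_rentals)  # 458
```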
CV-Cities comprises $223,736$ ground panoramic images and an equal number of satellite images, all accompanied by high-precision GPS coordinates. These images represent sixteen representative cities across five continents. The ground images are $360^{\circ}$ panoramas with a resolution of $4,096 \times 2,048$ pixels, while the satellite images have a resolution of $746 \times 746$ pixels and are captured at zoom level $20$. The spatial resolution is $0.298$ m, corresponding to a latitude and longitude range of $0.002 \times 0.002^{\circ}$ (about $222 \times 222$ m). The images of each city in the dataset can be used for training and testing purposes.
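The stated figures are mutually consistent, which a quick back-of-the-envelope check makes visible (using the rough constant of ~111,320 m per degree of latitude, an approximation not taken from the dataset):

```python
# Sanity check of the quoted tile geometry: 746 px at 0.298 m/px should give
# roughly a 222 m extent, i.e. about 0.002 degrees of latitude.
METERS_PER_DEGREE_LAT = 111_320  # approximate; varies slightly with latitude

tile_pixels = 746
resolution_m = 0.298                      # meters per pixel at zoom level 20
tile_extent_m = tile_pixels * resolution_m
tile_extent_deg = tile_extent_m / METERS_PER_DEGREE_LAT

print(round(tile_extent_m, 1), round(tile_extent_deg, 4))
```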
Cross-View Geo-Localisation within urban regions is challenging in part due to the lack of geo-spatial structuring within current datasets and techniques. We propose utilising graph representations to model sequences of local observations and the connectivity of the target location. Modelling as a graph enables generating previously unseen sequences by sampling with new parameter configurations. SpaGBOL contains 98,855 panoramic streetview images across different seasons, and 19,771 corresponding satellite images from 10 mostly densely populated international cities. This translates to five panoramic images and one satellite image per graph node. Downloading instructions are below.
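The per-node ratio above can be sketched as a simple graph structure, one satellite tile and five panoramas per node, with edges for road connectivity. All names and paths here are illustrative assumptions, not SpaGBOL's actual schema:

```python
# Hypothetical sketch of a SpaGBOL-style graph node: one satellite tile,
# five streetview panoramas, and edges to connected junctions.
from dataclasses import dataclass, field

@dataclass
class GraphNode:
    node_id: str
    satellite_image: str                               # path to satellite tile
    panoramas: list = field(default_factory=list)      # streetview panoramas
    neighbours: list = field(default_factory=list)     # connected node ids

node = GraphNode("london_0001", "sat/london_0001.png")
node.panoramas = [f"pano/london_0001_{i}.jpg" for i in range(5)]

# The dataset-level counts are consistent with 5 panoramas per satellite node:
assert 19_771 * 5 == 98_855
print(len(node.panoramas))  # 5
```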
Border Gateway Protocol (BGP) Network describes the Internet's inter-domain structure, where nodes represent autonomous systems and edges represent the business relationships between them. The node features contain basic properties, e.g., location and topology information (such as transit degree), and the labels indicate the types of autonomous systems.
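A minimal sketch of the structure just described: AS nodes carrying features and a label, with typed edges for the business relationships. The field names, AS numbers, and relationship strings are illustrative assumptions, not the dataset's actual format:

```python
# Illustrative AS-level graph (hypothetical schema): nodes with features and
# labels, edges carrying the business relationship type.
nodes = {
    "AS64500": {"features": {"country": "US", "transit_degree": 42}, "label": "transit"},
    "AS64501": {"features": {"country": "DE", "transit_degree": 0}, "label": "stub"},
}
edges = [("AS64500", "AS64501", "provider-to-customer")]

# Adjacency-list view of the business relationships
adjacency = {}
for src, dst, rel in edges:
    adjacency.setdefault(src, []).append((dst, rel))

print(adjacency["AS64500"])  # [('AS64501', 'provider-to-customer')]
```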