Papers With Code 2

A community resource for machine learning research: papers, code, benchmarks, and state-of-the-art results.


Data sourced from the PWC Archive (CC-BY-SA 4.0). Built by the community, for the community.

Datasets

19,997 machine learning datasets

Filter by Modality

  • Images (3,275)
  • Texts (3,148)
  • Videos (1,019)
  • Audio (486)
  • Medical (395)
  • 3D (383)
  • Time series (298)
  • Graphs (285)
  • Tabular (271)
  • Speech (199)
  • RGB-D (192)
  • Environment (148)
  • Point cloud (135)
  • Biomedical (123)
  • LiDAR (95)
  • RGB video (87)
  • Tracking (78)
  • Biology (71)
  • Actions (68)
  • 3D meshes (65)
  • Tables (52)
  • Music (48)
  • EEG (45)
  • Hyperspectral images (45)
  • Stereo (44)
  • MRI (39)
  • Physics (32)
  • Interactive (29)
  • Dialog (25)
  • MIDI (22)
  • 6D (17)
  • Replay data (11)
  • Financial (10)
  • Ranking (10)
  • CAD (9)
  • fMRI (7)
  • Parallel (6)
  • Lyrics (2)
  • PSG (2)

19,997 dataset results

Stellarators

This dataset comprises a collection of stellarator configurations used to train the model over multiple iterations. Within the 1_dataset folder, you’ll find the initial dataset, which was constructed using the Near-Axis Expansion method, leveraging the pyQSC package. Subsequent files in this dataset were generated following the methodology outlined in the accompanying research paper.

3 papers · 0 benchmarks

Kvasir-VQA (A Text-Image Pair GI Tract Dataset)

The Kvasir-VQA dataset is an extended dataset derived from the HyperKvasir and Kvasir-Instrument datasets, augmented with question-and-answer annotations. This dataset is designed to facilitate advanced machine learning tasks in gastrointestinal (GI) diagnostics, including image captioning, Visual Question Answering (VQA) and text-based generation of synthetic medical images.

3 papers · 0 benchmarks · Images, Medical, Tabular, Texts

P2GB

A benchmark designed to evaluate MLLMs’ proficiency in understanding inter-object relationships and textual content.

3 papers · 0 benchmarks · Images, Texts

Amazon Office Products (Amazon Office Products 5-core)

This dataset includes reviews (ratings, text, helpfulness votes), product metadata (descriptions, category information, price, brand, and image features), and links (also viewed/also bought graphs).

3 papers · 3 benchmarks

Amazon Digital Music (Amazon Digital Music 5-core)

This dataset includes reviews (ratings, text, helpfulness votes), product metadata (descriptions, category information, price, brand, and image features), and links (also viewed/also bought graphs).

3 papers · 3 benchmarks · Images, Texts

FindingEmo

FindingEmo is an image dataset containing annotations for 25k images, specifically tailored to Emotion Recognition. Contrary to existing datasets, it focuses on complex scenes depicting multiple people in various naturalistic, social settings, with images being annotated as a whole, thereby going beyond the traditional focus on faces or single individuals. Annotated dimensions include Valence, Arousal and Emotion label, with annotations gathered using Prolific. Together with the annotations, we release the list of URLs pointing to the original images, as well as all associated source code.

3 papers · 0 benchmarks · Images

MultiMed

Multilingual automatic speech recognition (ASR) in the medical domain serves as a foundational task for various downstream applications such as speech translation, spoken language understanding, and voice-activated assistants. This technology enhances patient care by enabling efficient communication across language barriers, alleviating specialized workforce shortages, and facilitating improved diagnosis and treatment, particularly during pandemics. In this work, we introduce MultiMed, the first multilingual medical ASR dataset, along with the first collection of small-to-large end-to-end medical ASR models, spanning five languages: Vietnamese, English, German, French, and Mandarin Chinese. To the best of our knowledge, MultiMed stands as the world's largest medical ASR dataset across all major benchmarks: total duration, number of recording conditions, number of accents, and number of speaking roles. Furthermore, we present the first multilinguality study for medical ASR, which includes […]

3 papers · 0 benchmarks

UltraSafety

No description available.

3 papers · 0 benchmarks

WebApp1K-React

A test-driven benchmark that challenges LLMs to write JavaScript React applications.

3 papers · 1 benchmark · Texts

EC-FUNSD

EC-FUNSD is introduced in [arXiv:2402.02379] as a benchmark for semantic entity recognition (SER) and entity linking (EL), designed for the entity-centric robustness evaluation of pre-trained text-and-layout models (PTLMs).

3 papers · 2 benchmarks · Images, Texts

ROOR

ROOR is a reading order prediction (ROP) benchmark that annotates layout reading order as ordering relations.

3 papers · 1 benchmark · Images, Texts

VoxSim

VoxSim is a perceptual voice similarity dataset created to develop perceptual speaker similarity evaluation systems. This dataset contains 69,409 scores for 41,578 pairs of utterances.

3 papers · 0 benchmarks

Distributional MIPLIB

Distributional MIPLIB is a dataset of Mixed Integer Linear Programming (MILP) instances designed to advance research on learning to optimize (https://www.arxiv.org/abs/2406.06954). It is a curated collection of MILP distributions from 13 domains, classified into different hardness levels. Links for downloading the distributions are provided on the webpage of each domain.

3 papers · 0 benchmarks

KPBiomed

A large-scale dataset of scientific records from PubMed for scientific keyphrase generation. The dataset comes in three sizes: 500k, 2 million, and 5.6 million documents.

3 papers · 0 benchmarks

OAM-TCD

OAM-TCD is a dataset of roughly 5,000 aerial images from around the world, intended to support robust tree-detection algorithms. Full details can be found in the linked HuggingFace repository.

3 papers · 0 benchmarks · Images

SMILE-UHURA (Small Vessel Segmentation at Mesoscopic Scale from Ultra-High Resolution 7T Magnetic Resonance Angiogram)

The human brain receives nutrients and oxygen through an intricate network of blood vessels. Pathology affecting small vessels, at the mesoscopic scale, represents a critical vulnerability within the cerebral blood supply and can lead to severe conditions, such as Cerebral Small Vessel Diseases. The advent of 7 Tesla MRI systems has enabled the acquisition of higher spatial resolution images, making it possible to visualise such vessels in the brain. However, the lack of publicly available annotated datasets has impeded the development of robust, machine learning-driven segmentation algorithms. To address this, the SMILE-UHURA challenge was organised. This challenge, held in conjunction with ISBI 2023 in Cartagena de Indias, Colombia, aimed to provide a platform for researchers working on related topics. The SMILE-UHURA challenge addresses the gap in publicly available annotated datasets by providing an annotated dataset of Time-of-Flight angiography acquired with 7T MRI. […]

3 papers · 0 benchmarks · Images, MRI, Medical

Seoul Bike Sharing Demand (Sathishkumar V E)

Data variables and description:

  Parameter/Feature   Abbreviation  Type        Measurement
  Date                Date          Date        year-month-day
  Rented Bike Count   Count         Continuous  0, 1, 2, …, 3556
  Hour                Hour          Continuous  0, 1, 2, …, 23
  Temperature         Temp          Continuous  °C
  Humidity            Hum           Continuous  %
  Windspeed           Wind          […]

3 papers · 0 benchmarks

CV-Cities

CV-Cities comprises 223,736 ground panoramic images and an equal number of satellite images, all accompanied by high-precision GPS coordinates. These images cover sixteen representative cities across five continents. The ground images are 360° panoramas with a resolution of 4,096 × 2,048 pixels, while the satellite images have a resolution of 746 × 746 pixels and are captured at zoom level 20. Their spatial resolution is 0.298 m, corresponding to a latitude and longitude range of 0.002° × 0.002° (about 222 × 222 m). The images of each city in the dataset can be used for training and testing purposes.

3 papers · 2 benchmarks · Images
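As a quick consistency check, the resolution figures quoted for CV-Cities agree with each other: a 746-pixel satellite tile at 0.298 m/pixel spans about 222 m, which is roughly 0.002° of latitude. A minimal back-of-the-envelope sketch (the 111,320 m-per-degree latitude constant is a standard approximation, not part of the dataset):

```python
# Back-of-the-envelope check of the CV-Cities satellite-tile figures.
METERS_PER_DEGREE_LAT = 111_320  # approximate length of one degree of latitude

tile_px = 746    # satellite tile width/height in pixels
gsd_m = 0.298    # quoted ground sampling distance, metres per pixel

tile_extent_m = tile_px * gsd_m                           # ~222 m on a side
tile_extent_deg = tile_extent_m / METERS_PER_DEGREE_LAT   # ~0.002 degrees

print(f"tile extent: {tile_extent_m:.1f} m (~{tile_extent_deg:.4f} deg)")
```

Running this prints a tile extent of about 222.3 m (≈0.0020°), matching the "about 222 × 222 m" and "0.002°" values quoted in the entry.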

SpaGBOL (Spatial-Graph-Based Orientated Cross-View Geo-Localisation)

Cross-View Geo-Localisation within urban regions is challenging, in part due to the lack of geo-spatial structuring within current datasets and techniques. We propose utilising graph representations to model sequences of local observations and the connectivity of the target location. Modelling as a graph enables generating previously unseen sequences by sampling with new parameter configurations. SpaGBOL contains 98,855 panoramic streetview images across different seasons, and 19,771 corresponding satellite images, from 10 international cities, most of them densely populated. This translates to five panoramic images and one satellite image per graph node. Downloading instructions are provided below.

3 papers · 4 benchmarks · Graphs, Images

BGP (Border Gateway Protocol (BGP) Network)

The Border Gateway Protocol (BGP) Network describes the Internet's inter-domain structure, where nodes represent autonomous systems and edges represent the business relationships between them. The features capture basic properties such as location and topology information (e.g., transit degree), and the labels indicate the type of each autonomous system.

3 papers · 1 benchmark · Graphs

Page 293 of 1000