Papers With Code 2

A community resource for machine learning research: papers, code, benchmarks, and state-of-the-art results.

Data sourced from the PWC Archive (CC-BY-SA 4.0). Built by the community, for the community.

Datasets

19,997 machine learning datasets

Filter by Modality

  • Images (3,275)
  • Texts (3,148)
  • Videos (1,019)
  • Audio (486)
  • Medical (395)
  • 3D (383)
  • Time series (298)
  • Graphs (285)
  • Tabular (271)
  • Speech (199)
  • RGB-D (192)
  • Environment (148)
  • Point cloud (135)
  • Biomedical (123)
  • LiDAR (95)
  • RGB Video (87)
  • Tracking (78)
  • Biology (71)
  • Actions (68)
  • 3D meshes (65)
  • Tables (52)
  • Music (48)
  • EEG (45)
  • Hyperspectral images (45)
  • Stereo (44)
  • MRI (39)
  • Physics (32)
  • Interactive (29)
  • Dialog (25)
  • MIDI (22)
  • 6D (17)
  • Replay data (11)
  • Financial (10)
  • Ranking (10)
  • CAD (9)
  • fMRI (7)
  • Parallel (6)
  • Lyrics (2)
  • PSG (2)

READ2016(line-level) (Line-level Handwritten Text Recognition on READ 2016)

This dataset arises from the READ project (Horizon 2020).

6 papers · 4 benchmarks · Images, Texts

GTSinger (GTSinger: A Global Multi-Technique Singing Corpus with Realistic Music Scores for All Singing Tasks)

The scarcity of high-quality and multi-task singing datasets significantly hinders the development of diverse controllable and personalized singing tasks, as existing singing datasets suffer from low quality, limited diversity of languages and singers, absence of multi-technique information and realistic music scores, and poor task suitability. To tackle these problems, we present GTSinger, a large Global, multi-Technique, free-to-use, high-quality singing corpus with realistic music scores, designed for all singing tasks, along with its benchmarks. Particularly, (1) we collect 80.59 hours of high-quality songs, forming the largest recorded singing dataset; (2) 20 professional singers across nine languages offer diverse timbres and styles; (3) we provide controlled comparison and phoneme-level annotations of six singing techniques, helping technique modeling and control; (4) GTSinger offers realistic music scores, assisting real-world musical composition; (5) singing voices are accompa

6 papers · 0 benchmarks · Audio, Music, Speech, Texts

GFF (Global Flood Forecasting)

Floods are among the most common and devastating natural hazards, imposing immense costs on our society and economy due to their disastrous consequences. Recent progress in weather prediction and spaceborne flood mapping demonstrated the feasibility of anticipating extreme events and reliably detecting their catastrophic effects afterwards. However, these efforts are rarely linked to one another and there is a critical lack of datasets and benchmarks to enable the direct forecasting of flood extent. To resolve this issue, we curate a novel dataset enabling a timely prediction of flood extent. Furthermore, we provide a representative evaluation of state-of-the-art methods, structured into two benchmark tracks for forecasting flood inundation maps i) in general and ii) focused on coastal regions. Altogether, our dataset and benchmark provide a comprehensive platform for evaluating flood forecasts, enabling future solutions for this critical challenge. Data, code & models are shared a

6 papers · 0 benchmarks · Images, Time series

Imagenet100

ImageNet-100 is a subset of the ImageNet-1k dataset from the ImageNet Large Scale Visual Recognition Challenge 2012. It contains 100 randomly selected classes, as specified in the Labels.json file. Link: https://www.kaggle.com/datasets/ambityga/imagenet100

6 papers · 0 benchmarks

ShipSG (Ship Segmentation and Georeferencing Dataset)

The ShipSG dataset is the first public dataset of its kind for ship segmentation and georeferencing. The dataset has been made available for the development and evaluation of instance segmentation and georeferencing methods using computer vision and deep learning, thus advancing the research field of ship recognition for maritime situational awareness. ShipSG consists of 3505 images of a port location using two different cameras with static and oblique view. In total, 11625 ship masks grouped in seven ship classes were manually annotated. Moreover, each image contains one geographic annotation of one of the ship masks (latitude and longitude) present in the image.

6 papers · 0 benchmarks · Images

California Housing Prices

Median house prices for California districts derived from the 1990 census.

6 papers · 6 benchmarks · Tabular

DiFF (Diffusion Facial Forgery Detection)

Detecting diffusion-generated images has recently grown into an emerging research area. Existing diffusion-based datasets predominantly focus on general image generation. However, facial forgeries, which pose a more severe social risk, have remained less explored thus far. To address this gap, this paper introduces DiFF, a comprehensive dataset dedicated to face-focused diffusion-generated images. DiFF comprises over 500,000 images that are synthesized using thirteen distinct generation methods under four conditions. In particular, this dataset leverages 30,000 carefully collected textual and visual prompts, ensuring the synthesis of images with both high fidelity and semantic consistency. We conduct extensive experiments on the DiFF dataset via a human test and several representative forgery detection methods. The results demonstrate that the binary detection accuracy of both human observers and automated detectors often falls below 30%, shedding light on the challenges in detecting d

6 papers · 0 benchmarks

EmbSpatial-Bench (EmbSpatial-Bench: Benchmarking Spatial Understanding for Embodied Tasks with Large Vision-Language Models)

The recent rapid development of Large Vision-Language Models (LVLMs) has indicated their potential for embodied tasks. However, the critical skill of spatial understanding in embodied environments has not been thoroughly evaluated, leaving the gap between current LVLMs and qualified embodied intelligence unknown. Therefore, we construct EmbSpatial-Bench, a benchmark for evaluating embodied spatial understanding of LVLMs. The benchmark is automatically derived from embodied scenes and covers 6 spatial relationships from an egocentric perspective.

6 papers · 2 benchmarks

SDWPF (A Dataset for Spatial Dynamic Wind Power Forecasting Challenge at KDD Cup 2022)

SDWPF is a Spatial Dynamic Wind Power Forecasting dataset that includes the spatial distribution of wind turbines as well as dynamic context factors. Most existing datasets cover only a small number of wind turbines and lack the turbines' locations and context information at a fine-grained time scale. By contrast, SDWPF provides the wind power data of 134 wind turbines from a wind farm over half a year, together with their relative positions and internal statuses.

6 papers · 0 benchmarks · Time series

MuChoMusic

MuChoMusic is a benchmark designed to evaluate music understanding in multimodal language models focused on audio. It includes 1,187 multiple-choice questions validated by human annotators, based on 644 music tracks from two publicly available music datasets. These questions cover a wide variety of genres and assess knowledge and reasoning across several musical concepts and their cultural and functional contexts. The benchmark provides a holistic evaluation of five open-source models, revealing challenges such as over-reliance on the language modality and highlighting the need for better multimodal integration.

6 papers · 0 benchmarks · Audio, Music, Texts

Ultra-High Resolution Image Reconstruction Benchmark

Ultra-high definition benchmark (UHDBench) includes 2293 images at 2k resolution sourced from the ground-truth test sets of HRSOD, LIU4k, UAVid, UHDM, and UHRSD. Images smaller than 2560 × 1440 are excluded.

6 papers · 4 benchmarks · Images
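
The exclusion rule in the UHDBench description (images smaller than 2560 × 1440 are dropped) can be sketched as a simple size filter. This is only an illustration, not the benchmark's actual tooling: the function names are hypothetical, and the reading that an image must meet both the width and the height threshold (in landscape orientation) is an assumption, since the description does not say how "smaller" is measured.

```python
# Hypothetical sketch of a minimum-resolution filter.
# Assumption: an image is kept only if width >= 2560 AND height >= 1440.
MIN_WIDTH, MIN_HEIGHT = 2560, 1440


def meets_min_resolution(width, height, min_width=MIN_WIDTH, min_height=MIN_HEIGHT):
    """Return True if the image is at least min_width x min_height pixels."""
    return width >= min_width and height >= min_height


def filter_by_resolution(sizes):
    """Keep only the (width, height) pairs that meet the minimum resolution."""
    return [(w, h) for (w, h) in sizes if meets_min_resolution(w, h)]


# Illustrative (made-up) image sizes, not taken from the benchmark.
candidates = [(3840, 2160), (1920, 1080), (2560, 1440), (2560, 1080)]
kept = filter_by_resolution(candidates)
print(kept)  # [(3840, 2160), (2560, 1440)]
```

A different reading, such as comparing total pixel counts or accepting portrait images, would only require changing `meets_min_resolution`.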

GEdit-Bench-EN

This dataset is a new benchmark, grounded in real-world usage, developed to support more authentic and comprehensive evaluation of image editing models.

6 papers · 3 benchmarks · Images, Texts

Wikipedia Person and Animal Dataset

This dataset gathers 428,748 person and 12,236 animal infoboxes with descriptions, based on a Wikipedia dump (2018/04/01) and Wikidata (2018/04/12).

5 papers · 7 benchmarks · Texts

MemexQA

A large, realistic multimodal dataset consisting of real personal photos and crowd-sourced questions/answers.

5 papers · 1 benchmark · Images, Texts

Species-800

Species-800 is a corpus for species entities, based on manually annotated abstracts. It comprises 800 PubMed abstracts that contain identified organism mentions. To increase the taxonomic diversity of mentions in the corpus, the 800 abstracts were collected by selecting 100 abstracts from each of the following 8 categories: bacteriology, botany, entomology, medicine, mycology, protistology, virology and zoology. Species-800 has been annotated with a focus on the species level; however, higher taxa mentions (such as genera, families and orders) have also been considered.

5 papers · 1 benchmark · Texts

LINNAEUS

LINNAEUS is a general-purpose dictionary matching software, capable of processing multiple types of document formats in the biomedical domain (MEDLINE, PMC, BMC, OTMI, text, etc.). It can produce multiple types of output (XML, HTML, tab-separated-value file, or save to a database). It also contains methods for acting as a server (including load balancing across several servers), allowing clients to request matching over a network. A package with files for recognizing and identifying species names is available for LINNAEUS, showing 94% recall and 97% precision compared to LINNAEUS-species-corpus.

5 papers · 1 benchmark · Texts
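
The LINNAEUS description reports recall (94%) and precision (97%) separately. As a small worked example, the two can be combined into an F1 score, the harmonic mean of precision and recall. The F1 value is derived here and is not a figure stated in the description; the function name is illustrative.

```python
def f1_score(precision, recall):
    """Harmonic mean of precision and recall (both given as fractions in [0, 1])."""
    if precision + recall == 0:
        return 0.0
    return 2 * precision * recall / (precision + recall)


# Precision and recall as quoted in the LINNAEUS description above.
f1 = f1_score(precision=0.97, recall=0.94)
print(f"{f1:.3f}")  # 0.955
```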

MSR Action3D

5 papers · 8 benchmarks

Jester (Jokes)

6.5 million anonymous ratings of jokes by users of the Jester Joke Recommender System.

5 papers · 0 benchmarks · Texts

WDBC (Breast Cancer Wisconsin (Diagnostic))

Features are computed from a digitized image of a fine needle aspirate (FNA) of a breast mass. They describe characteristics of the cell nuclei present in the image. 569 samples, 30 features.

5 papers · 0 benchmarks

Thyroid (Thyroid Disease)

Thyroid is a dataset for detecting thyroid diseases, in which patients diagnosed as hypothyroid or subnormal are treated as anomalies against normal patients. It contains 2800 training instances and 972 test instances, with about 29 attributes.

5 papers · 3 benchmarks · Images, Medical