Papers With Code 2

A community resource for machine learning research: papers, code, benchmarks, and state-of-the-art results.

Data sourced from the PWC Archive (CC-BY-SA 4.0). Built by the community, for the community.

Datasets

19,997 machine learning datasets

Filter by Modality

  • Images: 3,275
  • Texts: 3,148
  • Videos: 1,019
  • Audio: 486
  • Medical: 395
  • 3D: 383
  • Time series: 298
  • Graphs: 285
  • Tabular: 271
  • Speech: 199
  • RGB-D: 192
  • Environment: 148
  • Point cloud: 135
  • Biomedical: 123
  • LiDAR: 95
  • RGB Video: 87
  • Tracking: 78
  • Biology: 71
  • Actions: 68
  • 3d meshes: 65
  • Tables: 52
  • Music: 48
  • EEG: 45
  • Hyperspectral images: 45
  • Stereo: 44
  • MRI: 39
  • Physics: 32
  • Interactive: 29
  • Dialog: 25
  • Midi: 22
  • 6D: 17
  • Replay data: 11
  • Financial: 10
  • Ranking: 10
  • Cad: 9
  • fMRI: 7
  • Parallel: 6
  • Lyrics: 2
  • PSG: 2

LOFAR RFI Detection (Low-Frequency Array (LOFAR) Radio Frequency Interference Detection)

This dataset contains simulated and expert-labelled spectrograms from two radio telescopes: the Hydrogen Epoch of Reionization Array (HERA) in South Africa and the Low-Frequency Array (LOFAR) in the Netherlands. These datasets are intended to test radio-frequency interference (RFI) detection schemes. This entry pertains to the LOFAR dataset specifically.

2 papers · 6 benchmarks · Images

SR-Reg (SynthRAD Registration)

SR-Reg is a brain MR-CT registration dataset derived from SynthRAD 2023 (https://synthrad2023.grand-challenge.org/). It contains preprocessed images from 180 subjects; each subject comprises a brain MR image and a brain CT image with corresponding segmentation labels. SR-Reg was first introduced in MambaMorph (https://arxiv.org/abs/2401.13934).

2 papers · 1 benchmark · Images, MRI

address_parser_data

This is a set of datasets containing three versions of data:

2 papers · 0 benchmarks

RedEval

RedEval is a safety evaluation benchmark designed to assess the robustness of large language models (LLMs) against harmful prompts. It simulates and evaluates LLM applications across various scenarios, all while eliminating the need for human intervention. Here are the key aspects of RedEval:

2 papers · 0 benchmarks

BNCI2014-004 MOABB (BNCI 2014-004 Motor Imagery dataset)

2 papers · 12 benchmarks

BNCI2015-001 MOABB (BNCI 2015-001 Motor Imagery dataset)

2 papers · 12 benchmarks

nEMO

nEMO is a simulated dataset of emotional speech in the Polish language. The corpus contains over 3 hours of samples recorded with the participation of nine actors portraying six emotional states: anger, fear, happiness, sadness, surprise, and a neutral state. The text material was carefully selected to represent the phonetics of the Polish language. The corpus is available for free under a Creative Commons license (CC BY-NC-SA 4.0).

2 papers · 0 benchmarks · Audio, Speech

kickstarter (Funding Successful Projects on Kickstarter)

Kickstarter is a community of more than 10 million people, comprising creative and tech enthusiasts who help bring creative projects to life. To date, members have contributed more than $3 billion to fund creative projects. A project can be almost anything: a device, a game, an app, a film, etc.

2 papers · 1 benchmark · Tabular, Texts

DSEval-Kaggle

In this paper, we introduce a novel benchmarking framework designed specifically for evaluating data science agents. Our contributions are three-fold. First, we propose DSEval, an evaluation paradigm that enlarges the evaluation scope to the full lifecycle of LLM-based data science agents. We also cover aspects including but not limited to the quality of the derived analytical solutions or machine learning models, as well as potential side effects such as unintentional changes to the original data. Second, we incorporate a novel bootstrapped annotation process that lets LLMs themselves generate and annotate the benchmarks with a "human in the loop". A novel language (i.e., DSEAL) has been proposed, and the four derived benchmarks have significantly improved benchmark scalability and coverage, with largely reduced human labor. Third, based on DSEval and the four benchmarks, we conduct a comprehensive evaluation of various data science agents from different aspects. Our findings re

2 papers · 0 benchmarks

ArSen-20

Sentiment detection remains a pivotal task in natural language processing, yet its development in Arabic lags due to a scarcity of training materials compared to English. Addressing this gap, we present ArSen-20, a benchmark dataset tailored to propel Arabic sentiment detection forward. ArSen-20 comprises 20,000 professionally labeled tweets sourced from Twitter, focusing on the theme of COVID-19 and spanning the period from 2020 to 2023. Beyond tweet content, the dataset incorporates metadata associated with the user, enriching the contextual understanding. ArSen-20 offers a comprehensive resource to foster advancements in Arabic sentiment analysis and facilitate research in this critical domain.

2 papers · 0 benchmarks · Texts

Integrable Expressions

A curated dataset built with the methodology of the paper is available in the Dataset folder. The name of each JSON file corresponds to the elementary extension used to generate the (integrand, integral) pairs. Each JSON file contains 10,000 examples. These can be read directly as Maple expressions using the parse command in Maple, or converted to Python SymPy expressions using the sympify function.

2 papers · 0 benchmarks

SEPE 8K

The SEPE 8K dataset consists of 40 different 8K (8192 x 4320) video sequences and 40 varied 8K (8192 x 5464) images. The video sequences were captured at a frame rate of 29.97 frames per second (FPS) and were encoded using the AVC/H.264, HEVC/H.265, and AV1 codecs at resolutions from 8K down to 480p. The images, video sequences, encoded videos, and various other statistics related to the media that make up the dataset are published and maintained in a GitHub repository for non-commercial use. As far as we know, this dataset is the first to publish true 8K natural sequences; it is therefore important for next-generation multimedia applications such as video quality assessment, super-resolution, video coding, video compression, and many more.

2 papers · 1 benchmark · Images, Videos

MedSecId

The process by which sections in a document are demarcated and labeled is known as section identification. Such sections are helpful to the reader when searching for information and contextualizing specific topics. The goal of this work is to segment the sections of clinical documents. The primary contribution of this work is MedSecId, a publicly available set of 2,002 fully annotated medical notes from MIMIC-III. We include several baselines, source code, a pretrained model, and an analysis of the data showing a relationship between medical concepts across sections using principal component analysis.

2 papers · 1 benchmark · Texts

Ultrasound Nerve Segmentation

Identify nerve structures in ultrasound images of the neck.

2 papers · 0 benchmarks

PoliteRewrite (the politerewrite dataset)

https://huggingface.co/datasets/jdustinwind/Polite

2 papers · 0 benchmarks · Texts

Audio-alpaca

Audio-alpaca is a preference dataset for aligning text-to-audio models. It contains about 15k pairwise (prompt, chosen, rejected) triplets where, given a textual prompt, chosen is the preferred generated audio and rejected is the undesirable audio.

2 papers · 0 benchmarks · Audio, Texts
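
The (prompt, chosen, rejected) triplet layout described for Audio-alpaca can be sketched as a small data structure. This is only an illustration: the class name, field types, file names, and the helper `to_pairwise_examples` are hypothetical and do not reflect the dataset's actual schema or loader.

```python
from dataclasses import dataclass
from typing import List, Tuple


@dataclass(frozen=True)
class PreferenceTriplet:
    """One preference record (hypothetical layout, not the real schema)."""
    prompt: str    # textual prompt given to the text-to-audio model
    chosen: str    # identifier of the preferred generated audio (assumed to be a path)
    rejected: str  # identifier of the undesirable generated audio (assumed to be a path)


def to_pairwise_examples(triplets: List[PreferenceTriplet]) -> List[Tuple[str, str, int]]:
    """Expand each triplet into two labeled rows: 1 for chosen, 0 for rejected."""
    rows = []
    for t in triplets:
        rows.append((t.prompt, t.chosen, 1))
        rows.append((t.prompt, t.rejected, 0))
    return rows


# Made-up example values, for illustration only.
example = PreferenceTriplet("a dog barking", "good.wav", "bad.wav")
rows = to_pairwise_examples([example])
```

Each triplet yields one positive and one negative row for the same prompt, which is one common way to feed pairwise preference data to a training or evaluation loop.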

MS-BioGraphs (MS-BioGraphs: Sequence Similarity Graph Datasets)

https://doi.org/10.21227/gmd9-1534

2 papers · 0 benchmarks

XQuAD-IN

Given a question and passage in an Indic language, generate a short answer span from the passage as the answer.

2 papers · 0 benchmarks · Texts

Flores-IN

Given a sentence in the source language, generate a translation in the target language. The data contains translation pairs in two directions: English → TargetLanguage and TargetLanguage → English.

2 papers · 0 benchmarks · Texts
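
The two translation directions described for Flores-IN can be illustrated with a minimal sketch that turns one aligned sentence pair into both evaluation examples. The function name, the tuple layout, and the language codes are assumptions made for illustration; they are not taken from the dataset itself.

```python
from typing import List, Tuple

# (source_lang, target_lang, source_text, reference_text): hypothetical layout
Example = Tuple[str, str, str, str]


def both_directions(english: str, target: str, target_lang: str) -> List[Example]:
    """Build English -> target and target -> English examples from one aligned pair."""
    return [
        ("en", target_lang, english, target),  # English -> TargetLanguage
        (target_lang, "en", target, english),  # TargetLanguage -> English
    ]


# Made-up sentence pair; "hi" is used here as an assumed code for Hindi.
pairs = both_directions("Hello", "नमस्ते", "hi")
```

In each example, the source text is what the model receives and the reference text is what its output would be compared against.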

CrossSum-IN

Given an English article, generate a short summary in the target language.

2 papers · 0 benchmarks · Texts