Papers With Code 2

A community resource for machine learning research: papers, code, benchmarks, and state-of-the-art results.


Data sourced from the PWC Archive (CC-BY-SA 4.0). Built by the community, for the community.

Datasets

19,997 machine learning datasets

Filter by Modality

  • Images (3,275)
  • Texts (3,148)
  • Videos (1,019)
  • Audio (486)
  • Medical (395)
  • 3D (383)
  • Time series (298)
  • Graphs (285)
  • Tabular (271)
  • Speech (199)
  • RGB-D (192)
  • Environment (148)
  • Point cloud (135)
  • Biomedical (123)
  • LiDAR (95)
  • RGB Video (87)
  • Tracking (78)
  • Biology (71)
  • Actions (68)
  • 3D meshes (65)
  • Tables (52)
  • Music (48)
  • EEG (45)
  • Hyperspectral images (45)
  • Stereo (44)
  • MRI (39)
  • Physics (32)
  • Interactive (29)
  • Dialog (25)
  • MIDI (22)
  • 6D (17)
  • Replay data (11)
  • Financial (10)
  • Ranking (10)
  • CAD (9)
  • fMRI (7)
  • Parallel (6)
  • Lyrics (2)
  • PSG (2)

19,997 dataset results

AIOZ-GDANCE

AIOZ-GDANCE comprises 16.7 hours of whole-body motion and music audio of group dancing. Each video in the dataset ranges from 15 to 60 seconds in duration.

7 papers · 28 benchmarks · 3D, Music

NIGHTS (Novel Image Generations with Human-Tested Similarity)

A dataset of human similarity judgments over image pairs that are alike in diverse ways. Critical to this dataset is that judgments are nearly automatic and shared by all observers.

7 papers · 0 benchmarks · Images

robosuite Benchmark

A Modular Simulation Framework and Benchmark for Robot Learning.

7 papers · 0 benchmarks · Images

AVisT (A Benchmark for Visual Object Tracking in Adverse Visibility)

One of the key factors behind the recent success in visual tracking is the availability of dedicated benchmarks. While they have greatly benefited tracking research, existing benchmarks no longer pose the same difficulty: recent trackers achieve higher performance mainly due to (i) the introduction of more sophisticated transformer-based methods and (ii) the lack, in current benchmarks, of diverse scenarios with adverse visibility such as severe weather conditions, camouflage, and imaging effects. We introduce AVisT, a dedicated benchmark for visual tracking in diverse scenarios with adverse visibility. AVisT comprises 120 challenging sequences with 80k annotated frames, spanning 18 diverse scenarios broadly grouped into five attributes with 42 object categories. The key contribution of AVisT is its diverse and challenging scenarios, covering severe weather conditions such as dense fog, heavy rain, and sandstorm; obstruction effects including fire, sun glare, and splashing water; and adverse imaging…

7 papers · 2 benchmarks

VOT2022


7 papers · 2 benchmarks

SME (Standard Multimodal Explanation)

SME is a new dataset for Multi-modal Explanation for Visual Question Answering comprising 1,028,230 samples, with 1,656 visual objects requiring detection in explanations. To our knowledge, this is the first dataset where the explanations are in standard English with additional visual grounding tokens.

7 papers · 24 benchmarks · Images, Texts

L+M-24

Language-molecule models have emerged as an exciting direction for molecular discovery and understanding. However, training these models is challenging due to the scarcity of molecule-language pair datasets. To date, the datasets that have been released are 1) small and scraped from existing databases, 2) large but noisy, constructed by performing entity linking on the scientific literature, or 3) built by converting property-prediction datasets to natural language using templates. In this document, we detail the L+M-24 dataset, which was created for the Language + Molecules Workshop shared task at ACL 2024. In particular, L+M-24 is designed to focus on three key benefits of natural language in molecule design: compositionality, functionality, and abstraction.

7 papers · 6 benchmarks · Biomedical, Graphs, Texts

MIPE (Improving Paratope and Epitope Prediction by Multi-Modal Contrastive Learning and Interaction Informativeness Estimation)

Datasets. From the publicly accessible Structural Antibody Database (SAbDab), we collected a total of 7,571 antibody-antigen complexes, with the sequence data in FASTA format and structural data in PDB format. Following previous studies [Pittala and Bailey-Kellogg, 2020], we used CD-HIT [Li and Godzik, 2006] to remove high-homology antibody and antigen sequences with thresholds of 95% and 90% sequence identity, respectively. Subsequently, we excluded antibodies and antigens containing any residue type other than the 20 naturally occurring types. Finally, we compiled a dataset of 626 binding antibody-antigen pairs, including their sequences, structures, and corresponding interaction maps. Notably, antibodies primarily bind to antigens through their CDR regions. Most researchers use Euclidean distance to define paratopes and epitopes, and we follow this convention in our dataset: within the CDR regions/antigen, a residue is labeled as a paratope/epitope if the Euclidean distance bet…
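The distance-based labeling rule described above can be sketched as follows. Note the excerpt is cut off before the actual distance threshold is stated, so the 4.5 Å cutoff, the function name, and the toy coordinates here are all illustrative assumptions, not values from the paper.

```python
import numpy as np

def label_contacts(cdr_coords, antigen_coords, cutoff=4.5):
    """Label paratope/epitope residues by Euclidean distance.

    cdr_coords:     (n, 3) array of CDR residue coordinates.
    antigen_coords: (m, 3) array of antigen residue coordinates.
    cutoff:         distance threshold in angstroms (hypothetical value;
                    the excerpt is truncated before stating the real one).
    Returns boolean masks (paratope over CDR residues, epitope over antigen residues).
    """
    # Pairwise distances between every CDR residue and every antigen residue.
    d = np.linalg.norm(cdr_coords[:, None, :] - antigen_coords[None, :, :], axis=-1)
    paratope = (d <= cutoff).any(axis=1)  # CDR residue close to any antigen residue
    epitope = (d <= cutoff).any(axis=0)   # antigen residue close to any CDR residue
    return paratope, epitope

# Toy example: two CDR residues, three antigen residues.
cdr = np.array([[0.0, 0.0, 0.0], [10.0, 0.0, 0.0]])
ag = np.array([[1.0, 0.0, 0.0], [20.0, 0.0, 0.0], [10.0, 3.0, 0.0]])
p, e = label_contacts(cdr, ag)
# p → [True, True]; e → [True, False, True]
```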

7 papers · 4 benchmarks · Biology, Biomedical

PrideMM

PrideMM comprises 5,063 text-embedded images associated with the LGBTQ+ Pride movement.

7 papers · 2 benchmarks

UPAR (Unified Pedestrian Attribute Recognition)

The Task: The challenge will use an extension of the UPAR Dataset [1], which consists of images of pedestrians annotated for 40 binary attributes. For deployment and long-term use of machine-learning algorithms in a surveillance context, the algorithms must be robust to domain gaps that occur when the environment changes. This challenge aims to spotlight the problem of domain gaps in a real-world surveillance context and highlight the challenges and limitations of existing methods to provide a direction for future research.

7 papers · 4 benchmarks · Images, Texts

PoseBusters Benchmark (PoseBusters Benchmark Set)

These are the protein-ligand complexes of the PoseBusters Benchmark set, as described in the paper "PoseBusters: AI-based docking methods fail to generate physically valid poses or generalise to novel sequences" [1], with associated code at https://github.com/maabuu/posebusters

7 papers · 0 benchmarks

iPhone (Monocular Dynamic View Synthesis)

The iPhone dataset is a challenging benchmark for dynamic reconstruction. It consists of a collection of videos with realistic scenes and large object motions, captured with a hand-held iPhone. The evaluation measures rendering quality on novel viewpoints that have low overlap with the training camera views. Unlike previous datasets, it does not suffer from (a) teleporting camera motion or (b) quasi-static scene motion.

7 papers · 1 benchmark · Images, Videos

Long-RVOS

This work proposes Long-RVOS, a large-scale benchmark for long-term referring video object segmentation. Long-RVOS is the first minute-level dataset in the RVOS field, designed to tackle realistic long-video challenges such as frequent occlusion, disappearance and reappearance, and shot changes. Notably, Long-RVOS offers significantly longer video durations than existing datasets and contains the largest number of object classes and mask annotations. The large scale of Long-RVOS supports comprehensive training and evaluation of RVOS models. Finally, we gather 24,689 high-quality descriptions for building Long-RVOS.

7 papers · 6 benchmarks · Texts, Videos

Spider2-V

A multimodal agent benchmark on professional data science and engineering:

  • 494 real-world tasks, ranging from data warehousing to orchestration
  • 20 professional enterprise-level applications (e.g., BigQuery, dbt, Airbyte)
  • both command-line (CLI) and graphical user interfaces (GUI)
  • an interactive executable computer environment
  • a document warehouse for agent retrieval

7 papers · 0 benchmarks · Environment, Images, Interactive, Texts

k-qa (K-QA: A Real-World Medical Q&A Benchmark)


7 papers · 0 benchmarks · Texts

Politi Hop

Dataset format: each row in the dataset splits represents one instance and contains the following tab-separated columns: …

7 papers · 0 benchmarks

LIVE (Laboratory for Image & Video Engineering)


6 papers · 4 benchmarks

CHB-MIT (CHB-MIT Scalp EEG)

The CHB-MIT dataset is a dataset of EEG recordings from pediatric subjects with intractable seizures. Subjects were monitored for up to several days following withdrawal of anti-seizure medication in order to characterize their seizures and assess their candidacy for surgical intervention. The dataset contains 23 patients divided among 24 cases (one patient has 2 recordings, 1.5 years apart). It consists of 969 hours of scalp EEG recordings with 173 seizures of various types (clonic, atonic, tonic). The diversity of patients (male and female, 10-22 years old) and of seizure types makes the dataset well suited for assessing the performance of automatic seizure detection methods in realistic settings.
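As a rough illustration of how seizure-detection methods are typically evaluated on continuous recordings like these, the sketch below splits an EEG trace into overlapping fixed-length windows. The 4-second window and 2-second step are illustrative assumptions; the 256 Hz sampling rate is the rate used in CHB-MIT recordings.

```python
import numpy as np

def window_signal(x, fs, win_s=4.0, step_s=2.0):
    """Split a 1-D EEG trace into overlapping windows.

    x:      1-D signal array.
    fs:     sampling rate in Hz.
    win_s:  window length in seconds (illustrative choice).
    step_s: hop between windows in seconds (illustrative choice).
    Returns an array of shape (n_windows, win_samples).
    """
    win = int(win_s * fs)
    step = int(step_s * fs)
    return np.stack([x[i:i + win] for i in range(0, len(x) - win + 1, step)])

# CHB-MIT scalp EEG is sampled at 256 Hz; a 10-second toy trace
# yields 4-second windows with 50% overlap.
fs = 256
x = np.zeros(10 * fs)
w = window_signal(x, fs)
# w.shape → (4, 1024)
```

Each window would then be fed to a detector, with per-window predictions compared against the seizure annotations.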

6 papers · 1 benchmark · Audio, EEG, Medical

Freiburg Forest

The Freiburg Forest dataset was collected using a Viona autonomous mobile robot platform equipped with cameras for capturing multi-spectral and multi-modal images. The dataset may be used for evaluation of different perception algorithms for segmentation, detection, classification, etc. All scenes were recorded at 20 Hz with a camera resolution of 1024x768 pixels. The data was collected on three different days to have enough variability in lighting conditions as shadows and sun angles play a crucial role in the quality of acquired images. The robot traversed about 4.7 km each day. The dataset creators provide manually annotated pixel-wise ground truth segmentation masks for 6 classes: Obstacle, Trail, Sky, Grass, Vegetation, and Void.

6 papers · 2 benchmarks · Images, RGB-D

AQUAINT

The AQUAINT Corpus consists of English newswire text data drawn from three sources: the Xinhua News Service (People's Republic of China), the New York Times News Service, and the Associated Press Worldstream News Service. It was prepared by the LDC for the AQUAINT Project and was used in official benchmark evaluations conducted by the National Institute of Standards and Technology (NIST).

6 papers · 1 benchmark · Texts
Page 192 of 1000