Papers With Code 2

A community resource for machine learning research: papers, code, benchmarks, and state-of-the-art results.


Data sourced from the PWC Archive (CC-BY-SA 4.0). Built by the community, for the community.

Datasets

52 machine learning datasets



Dataset of a Study of Computational reproducibility of Jupyter notebooks from biomedical publications version 1 (Version 1)

This repository contains the dataset for a study of the computational reproducibility of Jupyter notebooks from biomedical publications. We analyzed the reproducibility of Jupyter notebooks from GitHub repositories associated with publications indexed in the biomedical literature repository PubMed Central. The dataset includes metadata for the journals, the publications, the GitHub repositories mentioned in the publications, and the notebooks present in those repositories.

1 paper · 0 benchmarks · Images, Tables, Tabular

metabench - Paper Data

Item-wise accuracies in six benchmarks from Open LLM Leaderboard 1, scraped from huggingface.co and used for the metabench analyses and construction. Datasets with RMSEs for random benchmark subsets are used as a reference in the paper and are included here.

1 paper · 0 benchmarks · Tables

SemTabNet

Dataset Card for SemTabNet. This dataset accompanies the following paper:

1 paper · 1 benchmark · Tables, Tabular, Texts

RClicks

We conducted a large crowdsourcing study of click patterns in an interactive segmentation scenario and collected 475K real-user clicks. Drawing on ideas from saliency tasks, we develop a clickability model that enables sampling clicks that closely resemble actual user inputs. Using our model and dataset, we propose the RClicks benchmark for a comprehensive comparison of existing interactive segmentation methods on realistic clicks. Specifically, we evaluate not only the average quality of the methods but also their robustness with respect to click patterns.

1 paper · 0 benchmarks · Actions, Images, Interactive, Tables, Tabular

Perfume Co-Preference Network

The Perfume Co-Preference Network dataset comprises comprehensive user reviews and ratings collected from the Persian retail platform Atrafshan. This dataset, central to our research on community detection in fragrance preferences, includes 36,434 comments from 7,387 unique users, providing insights into consumer sentiment towards various perfumes. It is designed to facilitate the analysis of user preferences through sentiment analysis, allowing for the clustering of perfumes based on shared attributes.

1 paper · 0 benchmarks · Graphs, Tables, Texts

Twitter job title prediction

We introduce a dataset of 1,314 samples, each including a user's tweets and bio. Each user's job title is found by crawling Wikipedia. Users with multiple job titles are handled with a semantic word embedding and clustering method. A job prediction method is then introduced, based on a deep neural network and TF-IDF word embedding. Hashtags and emojis in the tweets are also used for job prediction. Results show that the job titles of Twitter users can be predicted with 54% accuracy across nine categories.

1 paper · 0 benchmarks · Tables, Tabular, Texts

SCG (SCG Dataset from Graph Neural Networks in Supply Chain Analytics and Optimization: Concepts, Perspectives, Dataset & Benchmarks)

Abstract: Graph Neural Networks (GNNs) have recently gained traction in transportation, bioinformatics, language and image processing, but research on their application to supply chain management remains limited. Supply chains are inherently graph-like, making them ideal for GNN methodologies, which can optimize and solve complex problems. The barriers include a lack of proper conceptual foundations, familiarity with graph applications in SCM, and real-world benchmark datasets for GNN-based supply chain research. To address this, we discuss and connect supply chains with graph structures for effective GNN application, providing detailed formulations, examples, mathematical definitions, and task guidelines. Additionally, we present a multi-perspective real-world benchmark dataset from a leading FMCG company in Bangladesh, focusing on supply chain planning. We discuss various supply chain tasks using GNNs and benchmark several state-of-the-art models on homogeneous and heterogeneous grap

1 paper · 1 benchmark · Graphs, Tables

VREM-FL datasets

This dataset collection includes three files used for the experiments. Each file contains 6 columns: {timestep, vehicle ID, x coordinate in the map, y coordinate in the map, real bitrate, estimated bitrate}. The datasets, obtained from REMs with Gaussian estimation and real (https://ieee-dataport.org/open-access/crawdad-romataxi) or simulated (https://eclipse.dev/sumo/) vehicular mobility, are used in the original paper for optimizing the task of federated learning (client scheduling and resource allocation).

1 paper · 0 benchmarks · Tables, Time series
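
The six-column layout described for the VREM-FL files lends itself to a quick sanity check of the estimated bitrate against the real one. The sketch below is only an illustration under stated assumptions: the listing does not say how the files are delimited or whether they carry a header, so it assumes headerless comma-separated rows, and the column names, the `read_rows` and `bitrate_rmse` helpers, and the sample rows are all hypothetical.

```python
import csv
import io
import math

# Hypothetical column names; the listing only gives the six fields in this order.
COLUMNS = ["timestep", "vehicle_id", "x", "y", "real_bitrate", "estimated_bitrate"]


def read_rows(handle):
    """Parse headerless, comma-separated rows (an assumed format) into typed dicts."""
    rows = []
    for record in csv.reader(handle):
        if not record:
            continue  # skip blank lines
        raw = dict(zip(COLUMNS, record))
        rows.append({
            "timestep": int(raw["timestep"]),
            "vehicle_id": raw["vehicle_id"],
            "x": float(raw["x"]),
            "y": float(raw["y"]),
            "real_bitrate": float(raw["real_bitrate"]),
            "estimated_bitrate": float(raw["estimated_bitrate"]),
        })
    return rows


def bitrate_rmse(rows):
    """Root-mean-square error between the real and estimated bitrate columns."""
    if not rows:
        raise ValueError("no rows to evaluate")
    squared = [(r["real_bitrate"] - r["estimated_bitrate"]) ** 2 for r in rows]
    return math.sqrt(sum(squared) / len(squared))


# Made-up rows for illustration only; they are not taken from the dataset.
SAMPLE = "0,v1,10.0,20.0,5.0,4.0\n1,v1,11.0,20.5,6.0,8.0\n"

rows = read_rows(io.StringIO(SAMPLE))
print(f"rows: {len(rows)}, bitrate RMSE: {bitrate_rmse(rows):.3f}")
```

To run this against a real file, replace the `io.StringIO(SAMPLE)` handle with an opened file, and adjust the delimiter or skip a header row if the actual files differ from the format assumed here.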

ASRD (Anime Style Recognition Dataset)

A well-labeled, challenging dataset to facilitate research on style recognition in anime images. It collects images from 190 anime and cartoon works covering 93 years and 13 countries and regions, taking both 2D and 3D works into consideration. At most ten roles are chosen for each work. All the images are obtained from the Internet. The images in the LSASRD dataset come mainly from existing anime and cartoons; some come from comics or games of the same anime series. Unlike illustration or video datasets, it provides a moderate amount of contextual information in a wide variety of styles. LSASRD requires image models to be capable of context understanding.

1 paper · 0 benchmarks · Images, Tables

MERGE SPCS

This dataset contains pre-processed versions of datasets introduced in prior works. Additionally, it also contains new data that are pertinent to the paper.

1 paper · 0 benchmarks · Biology, Biomedical, Images, Medical, Tables, Tabular

Data Storage System Performance

IOPS and latency measurements of a real data storage system.

1 paper · 0 benchmarks · Tables, Tabular
