Papers With Code 2

A community resource for machine learning research: papers, code, benchmarks, and state-of-the-art results.

Data sourced from the PWC Archive (CC-BY-SA 4.0). Built by the community, for the community.

Datasets

19,997 machine learning datasets

Filter by Modality

  • Images (3,275)
  • Texts (3,148)
  • Videos (1,019)
  • Audio (486)
  • Medical (395)
  • 3D (383)
  • Time series (298)
  • Graphs (285)
  • Tabular (271)
  • Speech (199)
  • RGB-D (192)
  • Environment (148)
  • Point cloud (135)
  • Biomedical (123)
  • LiDAR (95)
  • RGB Video (87)
  • Tracking (78)
  • Biology (71)
  • Actions (68)
  • 3D meshes (65)
  • Tables (52)
  • Music (48)
  • EEG (45)
  • Hyperspectral images (45)
  • Stereo (44)
  • MRI (39)
  • Physics (32)
  • Interactive (29)
  • Dialog (25)
  • MIDI (22)
  • 6D (17)
  • Replay data (11)
  • Financial (10)
  • Ranking (10)
  • CAD (9)
  • fMRI (7)
  • Parallel (6)
  • Lyrics (2)
  • PSG (2)

19,997 dataset results

FineDance

18 papers · 8 benchmarks · 3D, Music

RepoEval

RepoEval is a benchmark specifically designed for evaluating repository-level code auto-completion systems. While existing benchmarks mainly focus on single-file tasks, RepoEval addresses the assessment gap for more complex, real-world, multi-file programming scenarios.

18 papers · 0 benchmarks

PU1K

PU1K is nearly 8 times larger than the largest publicly available dataset collected by PU-GAN. PU1K consists of 1,147 3D models split into 1,020 training samples and 127 testing samples. The training set contains 120 3D models compiled from PU-GAN’s dataset, in addition to 900 different models collected from ShapeNetCore. The testing set contains 27 models from PU-GAN and 100 more models from ShapeNetCore.

18 papers · 0 benchmarks · 3D

Elliptic Dataset

18 papers · 4 benchmarks · Graphs

Wild-Places

Many existing datasets for lidar place recognition are solely representative of structured urban environments, and their performance has recently been saturated by deep-learning-based approaches. Natural and unstructured environments present many additional challenges for the task of long-term localisation, but these environments are not represented in currently available datasets. To address this we introduce Wild-Places, a challenging large-scale dataset for lidar place recognition in unstructured, natural environments. Wild-Places contains eight lidar sequences collected with a handheld sensor payload over the course of fourteen months, comprising a total of 63K undistorted lidar submaps along with accurate 6DoF ground truth. The dataset contains multiple revisits both within and between sequences, allowing for both intra-sequence (i.e., loop closure detection) and inter-sequence (i.e., re-localisation) tasks. We also benchmark several state-of-the-art approaches on the dataset.

18 papers · 2 benchmarks · 3D, LiDAR

SCAND (Socially CompliAnt Navigation Dataset)

Have you wondered how autonomous mobile robots should share space with humans in public spaces? Are you interested in developing autonomous mobile robots that can navigate within human crowds in a socially compliant manner? Do you want to analyze human reactions and behaviors in the presence of mobile robots of different morphologies?

18 papers · 0 benchmarks · Actions, LiDAR, Point cloud, RGB Video, RGB-D, Videos

CoIR (Code Information Retrieval Benchmark)

The CoIR (Code Information Retrieval) benchmark is designed to evaluate code retrieval capabilities. CoIR includes 10 curated code datasets covering 8 retrieval tasks across 7 domains; in total, it encompasses two million documents. It also provides a simple Python framework, installable via pip, and shares the same data schema as benchmarks such as MTEB and BEIR for easy cross-benchmark evaluation.

18 papers · 1 benchmark
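Since CoIR shares its data schema with BEIR and MTEB, a retrieval split can be pictured as three mappings: a corpus, a set of queries, and relevance judgments (qrels). The sketch below uses invented placeholder ids and texts, not real CoIR data, purely to illustrate that layout.

```python
# A minimal sketch of the BEIR-style data schema that CoIR shares with
# MTEB/BEIR. All ids and texts below are invented placeholders.

# corpus: doc_id -> {"title": ..., "text": ...}
corpus = {
    "doc1": {"title": "quicksort", "text": "def quicksort(a): ..."},
    "doc2": {"title": "binary search", "text": "def bsearch(a, x): ..."},
}

# queries: query_id -> query text (natural language or code)
queries = {"q1": "sort a list in place"}

# qrels: query_id -> {doc_id: relevance grade}
qrels = {"q1": {"doc1": 1}}

def validate(corpus, queries, qrels):
    """Check that every judged (query, doc) pair refers to real entries."""
    for qid, judged in qrels.items():
        assert qid in queries, f"unknown query {qid}"
        for did in judged:
            assert did in corpus, f"unknown doc {did}"
    return True

validate(corpus, queries, qrels)
```

Because the schema is shared, an evaluation harness written against this shape can score retrievers on CoIR, BEIR, or MTEB splits interchangeably.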

PMD

PMD is a large-scale mirror detection benchmark containing a total of 6,461 mirror images with ground-truth annotations.

18 papers · 6 benchmarks

TURINGBENCH

TuringBench is a benchmark environment containing human-written and machine-generated texts for evaluating Turing Test and authorship attribution tasks.

18 papers · 0 benchmarks · Texts

ReVOS

We create a benchmark dataset named ReVOS. This dataset comprises 35,074 pairs of instruction-mask sequences derived from 1,042 diverse videos. In contrast to traditional referring video segmentation datasets, such as Ref-YouTube-VOS and MeViS, which primarily contain explicit short phrases, ReVOS includes text instructions that necessitate a sophisticated understanding of both video content and general world knowledge.

18 papers · 8 benchmarks · Videos

MuirBench

MuirBench is a benchmark containing 11,264 images and 2,600 multiple-choice questions, providing robust evaluation on 12 multi-image understanding tasks.

18 papers · 0 benchmarks

DroneVehicle (VisDrone-DroneVehicle)

The DroneVehicle dataset consists of a total of 56,878 images collected by drone, half of which are RGB images and the rest infrared images. Rich annotations with oriented bounding boxes are provided for five categories:

  • car: 389,779 annotations in RGB images, 428,086 in infrared images
  • truck: 22,123 in RGB, 25,960 in infrared
  • bus: 15,333 in RGB, 16,590 in infrared
  • van: 11,935 in RGB, 12,708 in infrared
  • freight car: 13,400 in RGB, 17,173 in infrared

This dataset is available on the download page.

18 papers · 3 benchmarks
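The per-category annotation counts above can be cross-checked with a few lines of Python; the pairs below simply restate the figures from the description as (RGB, infrared) tuples.

```python
# DroneVehicle annotation counts per category, as (rgb, infrared) pairs,
# taken directly from the dataset description above.
annotations = {
    "car":         (389_779, 428_086),
    "truck":       (22_123,  25_960),
    "bus":         (15_333,  16_590),
    "van":         (11_935,  12_708),
    "freight car": (13_400,  17_173),
}

rgb_total = sum(rgb for rgb, _ in annotations.values())
ir_total = sum(ir for _, ir in annotations.values())
print(rgb_total, ir_total)  # 452570 500517
```

Note that every category has more infrared annotations than RGB ones, even though the two modalities contain the same number of images.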

SALAD-Bench (A Hierarchical and Comprehensive Safety Benchmark for Large Language Models)

In the rapidly evolving landscape of Large Language Models (LLMs), ensuring robust safety measures is paramount. To meet this crucial need, we propose SALAD-Bench, a safety benchmark specifically designed for evaluating LLMs, attack methods, and defense methods. Distinguished by its breadth, SALAD-Bench transcends conventional benchmarks through its large scale, rich diversity, intricate three-level taxonomy, and versatile functionality. SALAD-Bench is crafted with a meticulous array of questions, from standard queries to complex ones enriched with attack and defense modifications and multiple-choice formats. To effectively manage the inherent complexity, we introduce an innovative evaluator: the LLM-based MD-Judge for QA pairs, with a particular focus on attack-enhanced queries, ensuring seamless and reliable evaluation. These components extend SALAD-Bench from standard LLM safety evaluation to the evaluation of both LLM attack and defense methods, ensuring its joint-purpose utility.

18 papers · 0 benchmarks · Texts

TID2013

TID2013 is a dataset for image quality assessment that contains 25 reference images and 3,000 distorted images (25 reference images × 24 types of distortions × 5 levels of distortions).

17 papers · 4 benchmarks · Images
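The distorted-image count above is just the product of the three factors named in parentheses, which a one-liner confirms:

```python
# TID2013 size: references x distortion types x distortion levels
references, distortion_types, levels = 25, 24, 5
distorted_images = references * distortion_types * levels
print(distorted_images)  # 3000
```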

ApolloCar3D

ApolloCar3D is a dataset that contains 5,277 driving images and over 60K car instances, where each car is fitted with an industry-grade 3D CAD model with absolute model size and semantically labelled keypoints. This dataset is more than 20 times larger than PASCAL3D+ and KITTI, the previous state-of-the-art datasets.

17 papers · 14 benchmarks · Images

UCF101-24

17 papers · 10 benchmarks

Kumar

The Kumar dataset contains 30 1,000×1,000 image tiles from seven organs (6 breast, 6 liver, 6 kidney, 6 prostate, 2 bladder, 2 colon and 2 stomach) of The Cancer Genome Atlas (TCGA) database acquired at 40× magnification. Within each image, the boundary of each nucleus is fully annotated.

17 papers · 4 benchmarks · Images, Interactive

DIOR

17 papers · 12 benchmarks

New College

The New College Data is a freely available dataset collected from a robot completing several loops outdoors around the New College campus in Oxford. The data includes odometry, laser scan, and visual information. The original dataset URL is no longer accessible.

17 papers · 0 benchmarks · Images

SceneNet

SceneNet is a dataset of labelled synthetic indoor scenes covering several scene categories.

17 papers · 0 benchmarks · 3D, Images
Page 112 of 1000