Papers With Code 2

A community resource for machine learning research: papers, code, benchmarks, and state-of-the-art results.


Data sourced from the PWC Archive (CC-BY-SA 4.0). Built by the community, for the community.

Datasets

19,997 machine learning datasets

Filter by Modality

  • Images (3,275)
  • Texts (3,148)
  • Videos (1,019)
  • Audio (486)
  • Medical (395)
  • 3D (383)
  • Time series (298)
  • Graphs (285)
  • Tabular (271)
  • Speech (199)
  • RGB-D (192)
  • Environment (148)
  • Point cloud (135)
  • Biomedical (123)
  • LiDAR (95)
  • RGB Video (87)
  • Tracking (78)
  • Biology (71)
  • Actions (68)
  • 3D meshes (65)
  • Tables (52)
  • Music (48)
  • EEG (45)
  • Hyperspectral images (45)
  • Stereo (44)
  • MRI (39)
  • Physics (32)
  • Interactive (29)
  • Dialog (25)
  • MIDI (22)
  • 6D (17)
  • Replay data (11)
  • Financial (10)
  • Ranking (10)
  • CAD (9)
  • fMRI (7)
  • Parallel (6)
  • Lyrics (2)
  • PSG (2)

19,997 dataset results

UIT-ViCTSD (UIT Vietnamese Constructive and Toxic Speech Detection)

UIT-ViCTSD (Vietnamese Constructive and Toxic Speech Detection) is a dataset for constructive and toxic speech detection in Vietnamese. It consists of 10,000 human-annotated comments.

7 papers · 0 benchmarks · Texts

TbD-3D

7 papers · 33 benchmarks

EasyCall

EasyCall is a dysarthric speech command dataset in Italian. It consists of 21,386 audio recordings from 24 healthy and 31 dysarthric speakers, whose individual degree of speech impairment was assessed by neurologists using the Therapy Outcome Measure. The corpus provides a resource for developing ASR-based assistive technologies for patients with dysarthria; in particular, it can be used to build a voice-controlled contact application for commercial smartphones, improving dysarthric patients' ability to communicate with their family and caregivers. Before recording the dataset, participants were given a survey to determine which commands dysarthric individuals would be most likely to use in a voice-controlled contact application. The dataset also includes a list of non-commands (i.e., words near or inside commands, or phonetically close to commands) that can be used to build a more robust command recognition system.

7 papers · 0 benchmarks · Speech

DFUC2021 (Diabetic Foot Ulcers 2021)

The Diabetic Foot Ulcers dataset (DFUC2021) is a dataset for pathology analysis, focusing on infection and ischaemia. The final release of DFUC2021 consists of 15,683 DFU patches: 5,955 for training, 5,734 for testing, and 3,994 unlabeled. The ground-truth labels cover four classes: control, infection, ischaemia, and both conditions.

7 papers · 0 benchmarks · Images, Medical
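
As a purely illustrative sketch (not the official DFUC2021 baseline), a four-class patch classifier can be set up by swapping the head of a pretrained CNN; the backbone, batch size, and learning rate below are assumptions:

```python
import torch
import torch.nn as nn
from torchvision import models

NUM_CLASSES = 4  # control, infection, ischaemia, both conditions

# Pretrained backbone with a fresh 4-way classification head.
model = models.resnet18(weights="IMAGENET1K_V1")
model.fc = nn.Linear(model.fc.in_features, NUM_CLASSES)

criterion = nn.CrossEntropyLoss()
optimizer = torch.optim.Adam(model.parameters(), lr=1e-4)

# One illustrative training step on a dummy batch of 224x224 DFU patches;
# in practice the batch would come from a DataLoader over the 5,955
# labelled training patches.
images = torch.randn(8, 3, 224, 224)
labels = torch.randint(0, NUM_CLASSES, (8,))
optimizer.zero_grad()
loss = criterion(model(images), labels)
loss.backward()
optimizer.step()
print(f"loss: {loss.item():.3f}")
```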

ORBIT

ORBIT is a real-world few-shot dataset and benchmark grounded in a real-world application of teachable object recognizers for people who are blind/low vision. The dataset contains 3,822 videos of 486 objects recorded by people who are blind/low-vision on their mobile phones, and the benchmark reflects a realistic, highly challenging recognition problem, providing a rich playground to drive research in robustness to few-shot, high-variation conditions.

7 papers · 0 benchmarks · Images, Videos

KUMC

The KUMC dataset for polyp detection and classification was collected at the University of Kansas Medical Center. It contains 80 colonoscopy video sequences, manually labeled with bounding boxes and polyp classes throughout the entire dataset.

7 papers · 0 benchmarks · Images, Medical, Videos

Inspec

Paper: Improved Automatic Keyword Extraction Given More Linguistic Knowledge. DOI: 10.3115/1119355.1119383

7 papers · 4 benchmarks

LDV (Large-scale Diverse Video)

LDV is a dataset for video enhancement. It contains 240 videos with diverse categories of content, different kinds of motion and various frame-rates.

7 papers · 0 benchmarks · Videos

TRANCOS (TRaffic ANd COngestionS)

7 papers · 3 benchmarks

POT-210 (Planar Object Tracking in the Wild: A Benchmark)

Planar object tracking is an actively studied problem in vision-based robotic applications. While several benchmarks exist for evaluating state-of-the-art algorithms, few contain video sequences captured in the wild rather than in constrained laboratory environments. POT-210 is a carefully designed planar object tracking benchmark containing 210 videos of 30 planar objects sampled in natural environments. For each object, seven videos are shot, covering various challenging factors: scale change, rotation, perspective distortion, motion blur, occlusion, out-of-view, and unconstrained motion. The ground truth is carefully annotated semi-manually to ensure quality, and eleven state-of-the-art algorithms are evaluated on the benchmark using two evaluation metrics, with detailed analysis of the results.

7 papers · 0 benchmarks · Videos

MengeROS

MengeROS is an open-source crowd simulation tool for robot navigation that integrates Menge with ROS. It extends Menge to introduce one or more robot agents into a crowd of pedestrians. Each robot agent is controlled by external ROS-compatible controllers. MengeROS has been used to simulate crowds with up to 1000 pedestrians and 20 robots.

7 papers · 0 benchmarks · Environment
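
As a rough illustration of what an external ROS-compatible controller can look like, the minimal sketch below publishes constant velocity commands with rospy. The topic name (/robot_0/cmd_vel) and publish rate are assumptions, not MengeROS defaults:

```python
import rospy
from geometry_msgs.msg import Twist

def drive_forward():
    # Hypothetical topic name for the first MengeROS robot agent; check the
    # simulator's launch files for the actual namespace.
    rospy.init_node("simple_controller")
    pub = rospy.Publisher("/robot_0/cmd_vel", Twist, queue_size=10)
    rate = rospy.Rate(10)  # publish at 10 Hz
    cmd = Twist()
    cmd.linear.x = 0.5  # constant forward velocity (m/s)
    while not rospy.is_shutdown():
        pub.publish(cmd)
        rate.sleep()

if __name__ == "__main__":
    try:
        drive_forward()
    except rospy.ROSInterruptException:
        pass
```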

DBLP Temporal

DBLP Temporal is a dataset for temporal entity resolution, based on author profiles extracted from the Digital Bibliography and Library Project (DBLP).

7 papers · 3 benchmarks

Backstabber’s Knife Collection

Backstabber’s Knife Collection is a dataset of 174 malicious software packages that were used in real-world attacks on open source software supply chains, and which were distributed via the popular package repositories npm, PyPI, and RubyGems. Those packages, dating from November 2015 to November 2019, were manually collected and analyzed.

7 papers · 0 benchmarks

Grasp Affordance

This is a dataset for visual grasp affordance prediction that promotes more robust and heterogeneous robotic grasping methods. It contains attributes for 30 different objects; each object instance is annotated not only with semantic descriptions but also with physical features describing visual attributes, locations, and grasping regions associated with a variety of actions.

7 papers · 0 benchmarks

ImageNet-9

ImageNet-9 consists of images with different amounts of background and foreground signal, which can be used to measure the extent to which models rely on image backgrounds, making it useful for testing the robustness of vision models to background dependence.

7 papers · 1 benchmark · Images
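
A minimal sketch of such a background-reliance measurement is shown below: it compares top-1 accuracy on an original split against a background-randomized split. The split directory names and the assumption that the model already predicts the 9 super-classes are illustrative, not part of the dataset's official tooling.

```python
import torch
import torch.nn as nn
from torchvision import datasets, models, transforms

# Stand-in for a model fine-tuned on the 9 ImageNet-9 super-classes:
# a pretrained ResNet-18 with a fresh 9-way head.
model = models.resnet18(weights="IMAGENET1K_V1")
model.fc = nn.Linear(model.fc.in_features, 9)

preprocess = transforms.Compose([
    transforms.Resize(256),
    transforms.CenterCrop(224),
    transforms.ToTensor(),
    transforms.Normalize([0.485, 0.456, 0.406], [0.229, 0.224, 0.225]),
])

def accuracy(split_dir):
    """Top-1 accuracy on an ImageFolder-style split directory."""
    loader = torch.utils.data.DataLoader(
        datasets.ImageFolder(split_dir, transform=preprocess), batch_size=64)
    model.eval()
    correct = total = 0
    with torch.no_grad():
        for images, labels in loader:
            preds = model(images).argmax(dim=1)
            correct += (preds == labels).sum().item()
            total += labels.numel()
    return correct / total

# Assumed local layout: one folder per split. A large gap between the two
# numbers indicates heavy reliance on image backgrounds.
gap = accuracy("imagenet9/original") - accuracy("imagenet9/mixed_rand")
print(f"accuracy drop with randomized backgrounds: {gap:.3f}")
```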

VPCD (Video Person-Clustering)

VPCD contains multi-modal annotations (face, body and voice) for all primary and secondary characters from a range of diverse TV shows and movies. It is used for evaluating multi-modal person clustering. It contains body-tracks for each annotated character, face-tracks when visible, and voice-tracks when speaking, with their associated features.

7 papers · 1 benchmark · Videos

VidHOI

VidHOI is a video-based human-object interaction detection benchmark. It is based on VidOR, which is densely annotated with all humans and predefined objects appearing in each frame. VidOR is also challenging because the videos are user-generated rather than recorded by volunteers, and are thus jittery at times.

7 papers · 9 benchmarks · Videos

Rel3D

Understanding spatial relations (e.g., “laptop on table”) in visual input is important for both humans and robots. Existing datasets are insufficient as they lack large-scale, high-quality 3D ground truth information, which is critical for learning spatial relations. Rel3D fills this gap: it is the first large-scale, human-annotated dataset for grounding spatial relations in 3D, and it enables quantifying the effectiveness of 3D information in predicting spatial relations on large-scale human data. The dataset was built with minimally contrastive data collection, a novel crowdsourcing method for reducing dataset bias. The 3D scenes come in minimally contrastive pairs: two scenes in a pair are almost identical, but a spatial relation holds in one and fails in the other. Minimally contrastive examples are empirically shown to diagnose issues with current relation detection models and to lead to sample-efficient training. Code and data are available.

7 papers · 1 benchmark · 3D, Images

KUAKE-QTR (Query-Title Relevance Dataset)

KUAKE Query-Title Relevance is a dataset for estimating the relevance between a query and a document title, and underlies the KUAKE-QTR task. Given a query (e.g., “Symptoms of vitamin B deficiency”), the task aims to find relevant titles (e.g., “The main manifestations of vitamin B deficiency”).

7 papers · 1 benchmark · Texts

KUAKE-QQR (Query-Query Relevance Dataset)

KUAKE Query-Query Relevance is a dataset for evaluating the relevance of the content expressed in two queries, and underlies the KUAKE-QQR task. As with KUAKE-QTR, estimating query-query relevance is an essential and challenging task in real-world search engines.

7 papers · 1 benchmark · Texts
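
For both KUAKE-QTR and KUAKE-QQR, one trivial unsupervised baseline is cosine similarity over character n-gram TF-IDF vectors; the sketch below scores the (translated) example pair from the KUAKE-QTR description. The n-gram range and the use of TF-IDF at all are illustrative choices, not the benchmark's official baseline.

```python
from sklearn.feature_extraction.text import TfidfVectorizer
from sklearn.metrics.pairwise import cosine_similarity

# Example pair translated from the KUAKE-QTR description.
query = "Symptoms of vitamin B deficiency"
title = "The main manifestations of vitamin B deficiency"

# Character n-grams also work on Chinese text, where word segmentation
# is non-trivial; the (2, 4) range is an arbitrary illustrative choice.
vectorizer = TfidfVectorizer(analyzer="char", ngram_range=(2, 4))
vectors = vectorizer.fit_transform([query, title])
score = cosine_similarity(vectors[0], vectors[1])[0, 0]
print(f"relevance score: {score:.2f}")
```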
Page 184 of 1000