Datasets

19,997 machine learning datasets

BANKEX

Contains daily stock market closing prices of ten financial institutions, in Indian rupees (INR). Samples were retrieved between 12 July 2005 and 3 November 2017, and each time series has 3,032 samples.

2 papers | 0 benchmarks

Activities (Activities dataset)

Contains ten synthetic time series with five days of high activity and two days of low activity. Each series has 3584 samples.

2 papers | 0 benchmarks

FDH (Flickr Diverse Humans)

The Flickr Diverse Humans (FDH) dataset consists of 1.53M images of human figures from the YFCC100M dataset. Each image is annotated with keypoints, pixel-to-vertex correspondences (from CSE), and a segmentation mask.

2 papers | 0 benchmarks | Images

DyML-Vehicle (Dynamic Metric Learning Vehicle)

DyML-Vehicle merges two vehicle re-ID datasets, PKU VehicleID [1] and VERI-Wild [1]. Since these two datasets are annotated only at the identity (fine) level, we manually annotate each image with a “model” label (e.g., Toyota Camry, Honda Accord, Audi A4) and a “body type” label (e.g., car, SUV, microbus, pickup). Moreover, we label all taxi images as a novel testing class at the coarse level.

2 papers | 1 benchmark | Images

DyML-Animal (Dynamic Metric Learning Animal)

DyML-Animal is based on animal images selected from ImageNet-5K [1]. It has 5 semantic scales (i.e., class, order, family, genus, species) according to biological taxonomy. Specifically, there are 611 “species” for the fine level, 47 categories corresponding to “order”, “family” or “genus” for the middle level, and 5 “classes” for the coarse level. We note that for some animals visual perception contradicts biological taxonomy; e.g., the whale, a mammal, actually looks more similar to a fish. Annotating whale images as mammals would confuse visual recognition, so we carefully check for potential contradictions and intentionally leave out those animals.

2 papers | 1 benchmark | Images

DyML-Product (Dynamic Metric Learning Product)

DyML-Product is derived from iMaterialist-2019, a hierarchical online product dataset. The original iMaterialist-2019 offers up to 4 levels of hierarchical annotations. We remove the coarsest level and maintain 3 levels for DyML-Product.

2 papers | 1 benchmark | Images

RoMQA

RoMQA is a benchmark for robust, multi-evidence, and multi-answer question answering (QA). RoMQA contains clusters of questions that are derived from related constraints mined from the Wikidata knowledge graph. The dataset evaluates robustness of QA models to varying constraints by measuring worst-case performance within each question cluster.

2 papers | 0 benchmarks | Texts
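
The worst-case-per-cluster idea in the RoMQA description can be illustrated with a small sketch. This is not the official RoMQA evaluation code: the function name `worst_case_cluster_score`, the use of a single numeric score per question, and the choice to average the per-cluster minima are all assumptions made for illustration.

```python
from collections import defaultdict
from statistics import mean


def worst_case_cluster_score(scores, cluster_ids):
    """Average, over clusters, of the lowest per-question score in each cluster.

    `scores` holds one numeric score per question (for example F1) and
    `cluster_ids` holds the cluster each question belongs to. Both the
    per-question metric and the averaging step are illustrative assumptions.
    """
    if len(scores) != len(cluster_ids):
        raise ValueError("scores and cluster_ids must have the same length")
    by_cluster = defaultdict(list)
    for score, cluster_id in zip(scores, cluster_ids):
        by_cluster[cluster_id].append(score)
    # A cluster is only as good as its hardest question.
    return mean(min(values) for values in by_cluster.values())


# Hypothetical scores for five questions in two clusters.
example_scores = [1.0, 0.5, 0.0, 0.75, 0.75]
example_clusters = ["a", "a", "a", "b", "b"]
robust_score = worst_case_cluster_score(example_scores, example_clusters)
```

Under this scheme a model that answers most questions in a cluster well but fails on one variation of the constraints is penalized for the whole cluster, which is what distinguishes a robustness measure from a plain average.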

STIR (Scaled and Translated Image Recognition)

While convolutions are known to be invariant to (discrete) translations, scaling continues to be a challenge, and most image recognition networks are not invariant to it. To explore these effects, we have created the Scaled and Translated Image Recognition (STIR) dataset. This dataset contains objects of size $s \in [17, 64]$, each randomly placed in a $64 \times 64$ pixel image.

2 papers | 0 benchmarks | Images
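
The random placement described for STIR can be sketched as follows. This is a minimal illustration, not the dataset's generation code: the helper `place_object`, the square binary object, and the uniform choice of offsets are assumptions; only the size range $[17, 64]$ and the $64 \times 64$ canvas come from the description.

```python
import random


def place_object(obj, canvas_size=64, rng=None):
    """Paste a 2D object (list of rows) at a random offset on a blank canvas.

    Returns the canvas and the (top, left) offset. Uniform sampling of the
    offset is an assumption; the source only says placement is random.
    """
    rng = rng or random.Random()
    height, width = len(obj), len(obj[0])
    if height > canvas_size or width > canvas_size:
        raise ValueError("object does not fit on the canvas")
    # randint is inclusive on both ends, so the object always fits.
    top = rng.randint(0, canvas_size - height)
    left = rng.randint(0, canvas_size - width)
    canvas = [[0] * canvas_size for _ in range(canvas_size)]
    for row in range(height):
        canvas[top + row][left:left + width] = obj[row]
    return canvas, (top, left)


rng = random.Random(0)
size = rng.randint(17, 64)                 # object size s in [17, 64]
square = [[1] * size for _ in range(size)]  # hypothetical stand-in object
image, offset = place_object(square, rng=rng)
```

An object of size 64 fills the canvas and can only sit at offset (0, 0), while smaller objects have more room to move, so scale and translation variation are coupled at the top of the size range.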

Lyra Dataset (A Dataset for Greek Traditional and Folk Music)

Lyra is a dataset of 1570 traditional and folk Greek music pieces that includes audio and video (timestamps and links to YouTube videos), along with annotations that describe aspects of particular interest for this dataset, including instrumentation, geographic information and labels of genre and subgenre, among others.

2 papers | 0 benchmarks | Audio, Music, Videos

LEAFTOP

Nouns extracted automatically from Bible translations across 1580 languages.

2 papers | 0 benchmarks | Texts

ReplicaGrasp

The ReplicaGrasp dataset is created by spawning objects from GRAB into the ReplicaCAD scenes, simulated in random positions and orientations using the Habitat simulator. We capture 4,800 instances, with 50 different objects spawned in one of 48 receptacles in both upright and randomly fallen orientations.

2 papers | 0 benchmarks | 3D

ESB (End-to-End Speech Benchmark)

ESB is a benchmark for evaluating the performance of a single automatic speech recognition (ASR) system across a broad set of speech datasets. It comprises eight English speech recognition datasets, capturing a broad range of domains, acoustic conditions, speaker styles, and transcription requirements.

2 papers | 0 benchmarks | Speech

CUP

CUP (Context-sitUated Pun) is a dataset containing 4.5k tuples of context words and pun pairs, each labelled with whether they are compatible for composing a pun.

2 papers | 0 benchmarks | Texts

LEPISZCZE

LEPISZCZE is an open-source comprehensive benchmark for Polish NLP and a continuous-submission leaderboard, concentrating public Polish datasets (existing and new) in specific tasks.

2 papers | 0 benchmarks | Texts

Retina Benchmark

The Retina Benchmark is a set of real-world tasks that accurately reflect such complexities and are designed to assess the reliability of predictive models in safety-critical scenarios. Specifically, two publicly available datasets of high-resolution human retina images exhibiting varying degrees of diabetic retinopathy, a medical condition that can lead to blindness, are used to design a suite of automated diagnosis tasks that require reliable predictive uncertainty quantification.

2 papers | 0 benchmarks | Images

PcMSP

PcMSP is a dataset annotated from 305 open access scientific articles for material science information extraction that simultaneously contains the synthesis sentences extracted from the experimental paragraphs, as well as the entity mentions and intra-sentence relations.

2 papers | 0 benchmarks | Texts

HERDPhobia

HERDPhobia is an annotated dataset for detecting hate speech against Fulani herders in Nigeria, in three languages: English, Nigerian-Pidgin, and Hausa.

2 papers | 0 benchmarks | Texts

MIAD

MIAD contains more than 100K high-resolution color images of various outdoor industrial scenarios, designed for unsupervised anomaly detection. The dataset is generated with 3D graphics software and covers both surface and logical anomalies with pixel-precise ground truth.

2 papers | 0 benchmarks | Images

MCSCSet

MCSCSet is a large-scale, specialist-annotated dataset of about 200k samples, designed for the task of Medical-domain Chinese Spelling Correction. MCSCSet comprises: i) extensive real-world medical queries collected from Tencent Yidian, and ii) corresponding misspelled sentences manually annotated by medical specialists.

2 papers | 0 benchmarks | Medical, Texts

HiAML

The HiAML Computational Graph (CG) family was introduced in "GENNAPE: Towards Generalized Neural Architecture Performance Estimators", accepted to AAAI-23. It contains 4.6k CIFAR-10 networks with accuracies in the range [91.11%, 93.44%].

2 papers | 0 benchmarks | Graphs
