Datasets

285 machine learning datasets

MuMiN-small

This is the small version of the MuMiN dataset.

1 paper | 2 benchmarks | Graphs, Images, Texts

MuMiN-medium

This is the medium version of the MuMiN dataset.

1 paper | 2 benchmarks | Graphs, Images, Texts

MuMiN-large

This is the large version of the MuMiN dataset.

1 paper | 2 benchmarks | Graphs, Images, Texts

ILPC22-Small

A small dataset from the Inductive Link Prediction Challenge 2022. The training graph contains 10K entities, 96 relations, and 78K triples. The inference graph contains 7K entities, 96 relations, and 21K triples. The validation and test triples to predict belong to the inference graph.

1 paper | 7 benchmarks | Graphs

ILPC22-Large

A large dataset from the Inductive Link Prediction Challenge 2022. The training graph contains 46K entities, 130 relations, and 202K triples. The inference graph contains 30K entities, 130 relations, and 77K triples. The validation and test triples to predict belong to the inference graph.

1 paper | 7 benchmarks | Graphs

RoomEnv-v0 (The Room environment - v0)

The Room environment - v0

1 paper | 1 benchmark | Graphs, Texts

OUMVLP-Pose (Multi-View Large Population Database with Pose Sequence)

The OU-ISIR Gait Database, Multi-View Large Population Database with Pose Sequence (OUMVLP-Pose) is meant to aid research efforts in the general area of developing, testing and evaluating algorithms for model-based gait recognition.

1 paper | 0 benchmarks | Graphs

Identity Access Management dataset

We release 280 synthetic IAM graphs generated using IAM graphs of commercial companies. Specifically, we vary the number of nodes but keep the graph density as is, i.e., in the range of 0.259 ± 0.198 (avg ± std). To generate a synthetic graph, we first sample the number of users and datastores from uniform distributions over the intervals [10, 150] and [50, 300] respectively, which cover the variation of those parameters across real graphs. After fixing the node counts, we sample the actual nodes with replacement from a real-world graph, which is chosen at random. Then we add Gaussian N(0, 0.01) noise to the node embeddings and renormalize them. To match the graph density with the density of the underlying baseline, we sample edges from a multinomial distribution, where each component is proportional to the cosine distance between the embeddings of a user and a datastore. We also enforce the invariant that dynamic edges are always a subset of all permission edges. A synthetic graph generated in such

1 paper | 0 benchmarks | Graphs
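
The generation steps described for the Identity Access Management dataset can be sketched roughly as follows. This is an illustrative sketch only, not the authors' code: the function name, the `dynamic_fraction` parameter, and the array layout are hypothetical. It also assumes that N(0, 0.01) denotes a variance of 0.01, that density means edges divided by (users × datastores), that the real graph has already been chosen and is passed in as two embedding matrices, and it draws edges without replacement as a simplification of the multinomial sampling.

```python
import numpy as np


def sample_synthetic_graph(user_emb, store_emb, density, rng, dynamic_fraction=0.5):
    """Sketch of one synthetic user-datastore graph (all names hypothetical)."""
    # Node counts drawn uniformly from the intervals quoted in the description.
    n_users = int(rng.integers(10, 151))   # [10, 150]
    n_stores = int(rng.integers(50, 301))  # [50, 300]

    def resample(emb, n):
        # Sample real nodes with replacement, add Gaussian noise, renormalize.
        picked = emb[rng.integers(0, len(emb), size=n)]
        noisy = picked + rng.normal(0.0, 0.1, size=picked.shape)  # std 0.1 = sqrt(0.01), assumed
        return noisy / np.linalg.norm(noisy, axis=1, keepdims=True)

    users = resample(user_emb, n_users)
    stores = resample(store_emb, n_stores)

    # Edge weights proportional to cosine distance (1 - cosine similarity).
    weights = np.clip(1.0 - users @ stores.T, 0.0, None).ravel() + 1e-12
    probs = weights / weights.sum()

    # Pick enough edges to hit the target density (assumed definition, see above).
    n_edges = int(round(density * probs.size))
    chosen = rng.choice(probs.size, size=n_edges, replace=False, p=probs)

    permission = np.zeros(probs.size, dtype=bool)
    permission[chosen] = True
    permission = permission.reshape(n_users, n_stores)

    # Dynamic edges are drawn only from permission edges, so the subset invariant holds.
    dynamic = permission & (rng.random(permission.shape) < dynamic_fraction)
    return permission, dynamic, users, stores
```

Calling it with `rng = np.random.default_rng(0)` and two small random embedding matrices returns boolean user-by-datastore adjacency matrices for permission and dynamic edges, together with the perturbed embeddings.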

CellTypeGraph Benchmark

Classifying all cells in an organ is a relevant and difficult problem from plant developmental biology. We here abstract the problem into a new benchmark for node classification in a geo-referenced graph. Solving it requires learning the spatial layout of the organ including symmetries. To allow the convenient testing of new geometrical learning methods, the benchmark of Arabidopsis thaliana ovules is made available as a PyTorch data loader, along with a large number of precomputed features.

1 paper | 2 benchmarks | Graphs

HTDM (Hypertention Disease Medication)

Hypertention Disease Medication dataset.

1 paper | 0 benchmarks | Graphs, Medical

DEAP City Dataset

Main dataset: city_pollution_data.csv

1 paper | 0 benchmarks | Environment, Graphs, Tabular, Time series

doges-dogaresse (Doges and dogaresse of the Venetian Republic)

This is the list of all doges of the Venetian Republic, as well as their wives where there is a record that they existed. Entries include the name, the surname if known, the dates of office, and the dates of the weddings. The data were extracted from Wikipedia, with some errors fixed by checking against other sources.

1 paper | 0 benchmarks | Graphs

SPAVE-28G (Signal Propagation Analyses in V2X Ecosystems (S.P.A.V.E) at 28 GHz on the NSF POWDER testbed)

1 paper | 0 benchmarks | Environment, Graphs, Time series, Tracking

DBP-5L (Spanish)

DBP-5L is a multilingual knowledge graph (KG) dataset containing 5 KGs in English, French, Japanese, Greek, and Spanish. The dataset is used for the Knowledge Graph Completion and Entity Alignment tasks. DBP-5L (Spanish) is the subset of DBP-5L containing the Spanish KG.

1 paper | 0 benchmarks | Graphs

pmuBAGE

pmuBAGE (the Benchmarking Assortment of Generated PMU Events) is a dataset that consists of almost 1000 instances of labeled event data to encourage benchmark evaluations on phasor measurement unit (PMU) data analytics. PMU data are challenging to obtain, especially those covering event periods. Nevertheless, power system problems have recently seen phenomenal advancements via data-driven machine learning solutions. A highly accessible standard benchmarking dataset would enable a drastic acceleration of the development of successful machine learning techniques in this field.

1 paper | 0 benchmarks | Graphs

BeGin

BeGin provides 23 graph benchmark scenarios drawn from 14 real-world datasets, which cover 12 combinations of incremental settings and problem levels. In addition, BeGin provides various basic evaluation metrics for measuring performance, as well as final evaluation metrics designed for continual learning.

1 paper | 0 benchmarks | Graphs

RoomEnv-v1 (The Room environment - v1)

The Room environment - v1

1 paper | 1 benchmark | Graphs, Texts

ZeroKBC

ZeroKBC is a comprehensive benchmark that covers all scenarios of the zero-shot Knowledge Base Completion (KBC) task. It has 3 zero-shot scenarios with 8 fine-grained settings.

1 paper | 0 benchmarks | Graphs

2D_NACA_RANS

Dataset of low-fidelity solutions of the RANS equations over airfoils.

1 paper | 0 benchmarks | Graphs, Physics, Point cloud

PACE 2022 Heuristic (PACE 2022 Directed Feedback Vertex Set, Heuristic Track)

This is the set of graphs used in the Heuristic track of the PACE 2022 challenge on computing a Directed Feedback Vertex Set. It consists of 200 labelled directed graphs. Most of the graphs are not symmetric (an edge from u->v does not imply an edge from v->u), although some are. The graph labels are integers ranging from 1 to N.

1 paper | 0 benchmarks | Graphs
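
To make the two notions in the PACE 2022 entry concrete, here is a minimal sketch of a symmetry check and of a validity check for a candidate directed feedback vertex set. It assumes the graph has already been loaded as a list of (u, v) pairs with vertices numbered 1 to N; the function names are hypothetical and the challenge's file format is not modelled here.

```python
from collections import deque


def is_symmetric(edges):
    """True if every edge u->v has a matching edge v->u."""
    edge_set = set(edges)
    return all((v, u) in edge_set for (u, v) in edge_set)


def is_feedback_vertex_set(n, edges, removed):
    """True if deleting `removed` leaves no directed cycle (Kahn's algorithm)."""
    removed = set(removed)
    kept = [v for v in range(1, n + 1) if v not in removed]
    succ = {v: [] for v in kept}
    indeg = {v: 0 for v in kept}
    for u, v in set(edges):
        # Keep only edges whose endpoints both survive the deletion.
        if u in succ and v in succ:
            succ[u].append(v)
            indeg[v] += 1

    # Repeatedly peel off vertices with no incoming edges.
    queue = deque(v for v in kept if indeg[v] == 0)
    peeled = 0
    while queue:
        u = queue.popleft()
        peeled += 1
        for v in succ[u]:
            indeg[v] -= 1
            if indeg[v] == 0:
                queue.append(v)

    # Every remaining vertex was peeled exactly when the graph is acyclic.
    return peeled == len(kept)


edges = [(1, 2), (2, 3), (3, 1), (3, 4)]
print(is_symmetric(edges))                    # False
print(is_feedback_vertex_set(4, edges, {1}))  # True
print(is_feedback_vertex_set(4, edges, {4}))  # False
```

The check only verifies a proposed solution; finding a small feedback vertex set is the hard part that the challenge targets.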
