Datasets

3,275 machine learning datasets

NHA12D (A New Pavement Crack Dataset)

NHA12D is an annotated pavement crack dataset that contains images with different viewpoints and pavement types. The dataset is composed of 80 pavement images, including 40 concrete pavement images and 40 asphalt pavement images, captured by digital survey vehicles on the A12 network in the UK.

4 papers | 0 benchmarks | Images

CelebA+masks

The COVID-19 pandemic raises the problem of adapting face recognition systems to the new reality, where people may wear surgical masks to cover their noses and mouths. Traditional data sets (e.g., CelebA, CASIA-WebFace) used for training these systems were released before the pandemic, so they now seem unsuited due to the lack of examples of people wearing masks. We propose a method for enhancing data sets containing faces without masks by creating synthetic masks and overlaying them on faces in the original images. Our method relies on Spark AR Studio, a developer program made by Facebook that is used to create Instagram face filters. In our approach, we use 9 masks of different colors, shapes and fabrics. We employ our method to generate 196,254 masks (96.8%) for the CelebA data set.

4 papers | 6 benchmarks | Images

CASIA-WebFace+masks

4 papers | 6 benchmarks | Images

SWORD ('Scenes with occluded regions' dataset)

The dataset contains around 1,500 training videos and 290 test videos, with 50 frames per video on average. It was obtained by processing manually captured video sequences of static real-life urban scenes. The main property of the dataset is the abundance of close objects and, consequently, the greater prevalence of occlusions. According to the introduced heuristic, the mean area of occluded image parts for SWORD is approximately five times larger than for RealEstate10k data (14% vs 3%, respectively). This justifies the collection and usage of SWORD and explains why SWORD allows training more powerful models despite being smaller in size.

4 papers | 3 benchmarks | 3D, Images, Videos

CARLANE Benchmark

Unsupervised Domain Adaptation demonstrates great potential to mitigate domain shifts by transferring models from labeled source domains to unlabeled target domains. While Unsupervised Domain Adaptation has been applied to a wide variety of complex vision tasks, only a few works focus on lane detection for autonomous driving. This can be attributed to the lack of publicly available datasets. To facilitate research in these directions, we propose CARLANE, a 3-way sim-to-real domain adaptation benchmark for 2D lane detection. CARLANE encompasses the single-target datasets MoLane and TuLane and the multi-target dataset MuLane. These datasets are built from three different domains, which cover diverse scenes and contain a total of 163K unique images, 118K of which are annotated. In addition, we evaluate and report systematic baselines, including our own method, which builds upon Prototypical Cross-domain Self-supervised Learning. We find that false positive and false negative rates of the eva

4 papers | 0 benchmarks | Images

1,995 People Face Images Data (Asian race)

This dataset contains face images of 1,995 people (Asian race). For each subject, more than 20 frontal-face images were collected. The data can be used for face recognition and other tasks.

4 papers | 0 benchmarks | Images

FewSOL (A Dataset for Few-Shot Object Learning in Robotic Environments)

The Few-Shot Object Learning (FewSOL) dataset can be used for object recognition with a few images per object. It contains 336 real-world objects with 9 RGB-D images per object from different views. Object segmentation masks, object poses and object attributes are provided. In addition, synthetic images generated using 330 3D object models are used to augment the dataset. The FewSOL dataset can be used to study a set of few-shot object recognition problems such as classification, detection and segmentation, shape reconstruction, pose estimation, keypoint correspondences and attribute recognition.

4 papers | 0 benchmarks | 6D, Images, RGB-D, Texts

Oracle-MNIST (Oracle-MNIST: a Realistic Image Dataset for Benchmarking Machine Learning Algorithms)

We introduce the Oracle-MNIST dataset, comprising 28×28 grayscale images of 30,222 ancient characters from 10 categories, for benchmarking pattern classification, with particular challenges in image noise and distortion. The training set consists of 27,222 images in total, and the test set contains 300 images per class. Oracle-MNIST shares the same data format as the original MNIST dataset, allowing for direct compatibility with all existing classifiers and systems, but it constitutes a more challenging classification task than MNIST. The images of ancient characters suffer from 1) extremely serious and unique noise caused by three thousand years of burial and aging and 2) dramatically varying writing styles among the ancient Chinese, all of which make them realistic for machine learning research. The dataset is freely available at https://github.com/wm-bupt/oracle-mnist.

4 papers | 2 benchmarks | Images
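
Because the Oracle-MNIST description says the data shares MNIST's format, a reader for the standard MNIST IDX layout should in principle apply. The following is a minimal sketch under that assumption, not the dataset authors' loader: the function names `parse_idx` and `load_idx_gz` are hypothetical, and the assumption that the files are gzip-compressed IDX files of unsigned bytes should be checked against the repository before use.

```python
import gzip
import struct


def parse_idx(raw: bytes):
    """Parse an IDX buffer of unsigned bytes into (shape, data bytes)."""
    # Header: two zero bytes, a dtype code (0x08 = unsigned byte),
    # and the number of dimensions, all big-endian.
    zero, dtype_code, ndim = struct.unpack(">HBB", raw[:4])
    if zero != 0 or dtype_code != 0x08:
        raise ValueError("not an unsigned-byte IDX buffer")

    # One big-endian 32-bit size per dimension follows the header.
    header_end = 4 + 4 * ndim
    shape = struct.unpack(">" + "I" * ndim, raw[4:header_end])

    # The payload must hold exactly prod(shape) bytes.
    expected = 1
    for dim in shape:
        expected *= dim
    data = raw[header_end:]
    if len(data) != expected:
        raise ValueError("payload size does not match header")
    return shape, data


def load_idx_gz(path: str):
    """Read a gzip-compressed IDX file (assumed layout) from disk."""
    with gzip.open(path, "rb") as handle:
        return parse_idx(handle.read())


# Synthetic demo: two blank 28x28 images in IDX form.
demo = struct.pack(">HBBIII", 0, 8, 3, 2, 28, 28) + bytes(2 * 28 * 28)
demo_shape, demo_pixels = parse_idx(demo)
print(demo_shape, len(demo_pixels))
```

If the real files follow this layout, the image file for the training set would be expected to report a shape of (27222, 28, 28), and the test set 3,000 images (300 per class across 10 classes), which together match the 30,222 total stated above.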

K-Lane (KAIST-Lane)

KAIST-Lane (K-Lane) is the world’s first and largest public urban road and highway lane dataset for LiDAR. K-Lane has more than 15K frames and contains annotations of up to six lanes under various road and traffic conditions, e.g., occluded roads with multiple occlusion levels, roads at day and night, merging (converging and diverging) lanes, and curved lanes.

4 papers | 2 benchmarks | Images, Point cloud

MUAD (Multiple Uncertainties for Autonomous Driving)

The MUAD dataset (Multiple Uncertainties for Autonomous Driving) consists of 10,413 realistic synthetic images with diverse adverse weather conditions (night, fog, rain, snow), out-of-distribution objects, and annotations for semantic segmentation, depth estimation, and object and instance detection. Predictive uncertainty estimation is essential for the safe deployment of Deep Neural Networks in real-world autonomous systems, and MUAD allows a better assessment of the impact of different sources of uncertainty on model performance.

4 papers | 0 benchmarks | Environment, Images, RGB-D

Separated COCO

Separated COCO is an automatically generated subset of the COCO val dataset that collects separated objects for a large variety of categories in real images in a scalable manner. In each case, the target object's segmentation mask is separated into distinct regions by the occluder.

4 papers | 1 benchmark | Images

ALTO (Aerial-view Large-scale Terrain-Oriented)

ALTO is a vision-focused dataset for the development and benchmarking of Visual Place Recognition (VPR) and Localization methods for Unmanned Aerial Vehicles. The dataset is composed of two long (approximately 150 km and 260 km) trajectories flown by a helicopter over Ohio and Pennsylvania, and it includes high-precision GPS-INS ground truth location data, high-precision accelerometer readings, laser altimeter readings, and RGB downward-facing camera imagery. The dataset also comes with reference imagery over the flight paths, which makes it suitable for VPR benchmarking and other tasks common in Localization, such as image registration and visual odometry.

4 papers | 0 benchmarks | Images

PDEBench - Benchmark for Scientific Machine Learning

PDEBench provides a diverse and comprehensive set of benchmarks for scientific machine learning, including challenging and realistic physical problems. The repository consists of the code used to generate the datasets, to upload and download the datasets from the data repository, and to train and evaluate different machine learning models as baselines. PDEBench features a much wider range of PDEs than existing benchmarks and includes realistic and difficult problems (both forward and inverse), as well as larger ready-to-use datasets comprising various initial and boundary conditions and PDE parameters. Moreover, PDEBench was created to make the source code extensible, and we invite active participation to improve and extend the benchmark.

4 papers | 0 benchmarks | Images, Physics, Time series, Videos

SDOML

A machine-learning dataset prepared from NASA Solar Dynamics Observatory mission data.

4 papers | 0 benchmarks | Images

WIKIPerson

WIKIPerson is a high-quality human-annotated visual person linking dataset based on Wikipedia. The dataset contains a total of 48K different news images, covering 13K out of 120K Person named entities, each of which corresponds to a celebrity in Wikipedia. Unlike previously used entity linking (EL) datasets, the mention in WIKIPerson is only an image containing the person entity with its bounding box. The corresponding label identifies a unique entity in Wikipedia. For each entity in Wikipedia, we provide textual descriptions as well as images to satisfy the needs of three sub-tasks.

4 papers | 0 benchmarks | Images, Texts

S3E

S3E is a novel large-scale multimodal dataset captured by a fleet of unmanned ground vehicles along four designed collaborative trajectory paradigms. S3E consists of 7 outdoor and 5 indoor scenes, each exceeding 200 seconds, with well-synchronized and calibrated high-quality stereo camera, LiDAR, and high-frequency IMU data.

4 papers | 0 benchmarks | Images, LiDAR, Point cloud

VASR (Visual Analogies of Situation Recognition)

Visual Analogies of Situation Recognition (VASR) is a dataset for visual analogical mapping, adapting the classical word-analogy task to the visual domain. It contains 196K object transitions and 385K activity transitions. Experiments demonstrate that state-of-the-art models do well when distractors are chosen randomly (~86%), but struggle with carefully chosen distractors (~53%, compared to 90% human accuracy).

4 papers | 1 benchmark | Images
