Datasets

3,275 machine learning datasets

ScanBank

ScanBank is a benchmark dataset for figure extraction from scanned electronic theses and dissertations containing 10 thousand scanned page images, manually labeled by humans as to the presence of the 3.3 thousand figures or tables found therein.

2 papers | 0 benchmarks | Images

Global Wheat Head 2021 (Global Wheat Head Dataset 2021)

Global WHEAT Dataset 2021 is an extension of the Global Wheat Dataset 2020. It is the first large-scale dataset for wheat head detection from field optical images, and it includes a very wide range of cultivars from different continents. Wheat is a staple crop grown all over the world, so interest in wheat phenotyping spans the globe. It is therefore important that models developed for wheat phenotyping, such as wheat head detection networks, generalize across different growing environments around the world.

2 papers | 0 benchmarks | Images

RaidaR (RaidaR: A Rich Annotated Image Dataset of Rainy Street Scenes)

RaidaR is a richly annotated image dataset of rainy street scenes. It consists of 58,542 real rainy images containing several rain-induced artifacts: fog, droplets, road reflections, etc. Of these, 5,000 images carry semantic segmentation annotations and 3,658 carry instance segmentation annotations.

2 papers | 0 benchmarks | Images

CalCROP21

CalCROP21 is a georeferenced multi-spectral dataset of satellite imagery and crop labels. It is a semantic segmentation benchmark for the diverse crops of California's Central Valley region at 10 m spatial resolution, produced with a robust image processing pipeline based on Google Earth Engine.

2 papers | 0 benchmarks | Images

DONeRF: Evaluation Dataset

This is the dataset for the CGF 2021 paper "DONeRF: Towards Real-Time Rendering of Compact Neural Radiance Fields using Depth Oracle Networks".

2 papers | 1 benchmark | Images, RGB-D

Bizarre Pose Dataset (Bizarre Pose Dataset of Illustrated Characters)

A human keypoint dataset of anime/manga-style character illustrations. It extends the AnimeDrawingsDataset with additional features.

2 papers | 0 benchmarks | Images

MSDA (Multi-source domain adaptation dataset for text recognition)

A text recognition dataset of over five million images spanning 5 domains: synthetic, document, street view, handwritten, and car license.

2 papers | 4 benchmarks | Images, Texts

Shadow Accrual Maps

Large-scale shadows from buildings in a city play an important role in determining the environmental quality of public spaces. They can be beneficial, for example for pedestrians during summer, or detrimental, by impacting vegetation and blocking direct sunlight. Determining the effects of shadows requires accumulating them over time across different periods of the year. In our paper "Shadow Accrual Maps: Efficient Accumulation of City-Scale Shadows over Time", we present a simple yet efficient class of approaches that uses the properties of sun movement to track the changing position of shadows within a fixed time interval. This repository presents the computed shadow information for New York City, Chicago, Los Angeles, Boston and Washington DC.

2 papers | 0 benchmarks | 3D, Environment, Images

Pano3D

Pano3D is a new benchmark for depth estimation from spherical panoramas. Its goal is to drive progress for this task in a consistent and holistic manner. The Pano3D 360 depth estimation benchmark provides a standard Matterport3D train and test split, as well as a secondary GibsonV2 partitioning for testing and training. The latter is used to assess zero-shot cross-dataset transfer performance and is decomposed into 3 different splits, each one focusing on a specific generalization axis.

2 papers | 0 benchmarks | 3D, Images, Point cloud, RGB-D

Fashion-MMT

Fashion-MMT is a large-scale bilingual product description dataset, which contains over 114k noisy and 40k manually cleaned description translations with multiple product images.

2 papers | 0 benchmarks | Images, Texts

MuCo-VQA

MuCo-VQA consists of large-scale (3.7M) multilingual and code-mixed VQA datasets in multiple languages: Hindi (hi), Bengali (bn), Spanish (es), German (de) and French (fr), plus the code-mixed language pairs en-hi, en-bn, en-fr, en-de and en-es.

2 papers | 0 benchmarks | Images, Texts, Videos

Saint Gall

The Saint Gall dataset contains handwritten historical manuscripts written in Latin that date back to the 9th century. It consists of 60 pages, 1,410 text lines and 11,597 words.

2 papers | 2 benchmarks | Images, Texts

MuViHand

MuViHand is a dataset for 3D Hand Pose Estimation that consists of multi-view videos of the hand along with ground-truth 3D pose labels. The dataset includes more than 402,000 synthetic hand images available in 4,560 videos. The videos have been simultaneously captured from six different angles with complex backgrounds and random levels of dynamic lighting. The data has been captured from 10 distinct animated subjects using 12 cameras in a semi-circle topology.

2 papers | 0 benchmarks | Images

VVAD-LRS3

A dataset for Visual Voice Activity Detection extracted from the LRS3 dataset.

2 papers | 0 benchmarks | Images

OpenViDial 2.0

OpenViDial 2.0 is an open-domain multi-modal dialogue dataset, larger in scale than the previous version, OpenViDial 1.0. It contains a total of 5.6 million dialogue turns extracted from movies or TV series from different resources, and each dialogue turn is paired with its corresponding visual context.

2 papers | 20 benchmarks | Images, Texts

Mila Simulated Floods

The Mila Simulated Floods Dataset is a 1.5 square km virtual world, built with the Unity3D game engine, that includes urban, suburban and rural areas.

2 papers | 2 benchmarks | Images, RGB-D

EDFace-Celeb-1M

EDFace-Celeb-1M is a public Ethnically Diverse Face dataset which is used to benchmark the task of face hallucination. The dataset includes 1.7 million photos that cover different countries, with balanced race composition.

2 papers | 0 benchmarks | Images

ASIRRA (Animal Species Image Recognition for Restricting Access)

Web services are often protected with a challenge that's supposed to be easy for people to solve, but difficult for computers. Such a challenge is often called a CAPTCHA (Completely Automated Public Turing test to tell Computers and Humans Apart) or HIP (Human Interactive Proof). HIPs are used for many purposes, such as to reduce email and blog spam and prevent brute-force attacks on web site passwords.

2 papers | 0 benchmarks | Images

VFR-Wild

325 word images intended for font recognition, whose fonts are included in VFR-447 (and VFR-2420).

2 papers | 5 benchmarks | Images

HC18

Automated measurement of fetal head circumference using 2D ultrasound images

2 papers | 0 benchmarks | Images
