Datasets

19,997 machine learning datasets

19,997 dataset results

BASIL

300 news articles annotated with 1,727 bias spans and find evidence that informational bias appears in news articles more frequently than lexical bias.

27 papers0 benchmarks

Evidence Inference

Evidence Inference is a corpus for this task comprising 10,000+ prompts coupled with full-text articles describing RCTs.

27 papers0 benchmarksTexts

IntrA is an open-access 3D intracranial aneurysm dataset that makes the application of points-based and mesh-based classification and segmentation models available. This dataset can be used to diagnose intracranial aneurysms and to extract the neck for a clipping operation in medicine and other areas of deep learning, such as normal estimation and surface reconstruction.

27 papers11 benchmarksImages

KPTimes

KPTimes is a large-scale dataset of news texts paired with editor-curated keyphrases.

27 papers7 benchmarksTexts

TG-ReDial

TG-ReDial is a a topic-guided conversational recommendation dataset for research on conversational/interactive recommender systems.

27 papers0 benchmarksTexts

VQA-HAT (VQA Human Attention)

VQA-HAT (Human ATtention) is a dataset to evaluate the informative regions of an image depending on the question being asked about it. The dataset consists of human visual attention maps over the images in the original VQA dataset. It contains more than 60k attention maps.

27 papers0 benchmarksImages

KoDF (Korean DeepFake Detection Dataset)

The Korean DeepFake Detection Dataset (KoDF) is a large-scale collection of synthesized and real videos focused on Korean subjects, used for the task of deepfake detection.

27 papers0 benchmarksVideos

RAVDESS (Ryerson Audio-Visual Database of Emotional Speech and Song)

The Ryerson Audio-Visual Database of Emotional Speech and Song (RAVDESS) contains 7,356 files (total size: 24.8 GB). The database contains 24 professional actors (12 female, 12 male), vocalizing two lexically-matched statements in a neutral North American accent. Speech includes calm, happy, sad, angry, fearful, surprise, and disgust expressions, and song contains calm, happy, sad, angry, and fearful emotions. Each expression is produced at two levels of emotional intensity (normal, strong), with an additional neutral expression. All conditions are available in three modality formats: Audio-only (16bit, 48kHz .wav), Audio-Video (720p H.264, AAC 48kHz, .mp4), and Video-only (no sound). Note, there are no song files for Actor_18.

27 papers21 benchmarksAudio, Speech, Videos

iNat2021 (iNaturalist 2021)

iNat2021 is a large-scale image dataset collected and annotated by community scientists that contains over 2.7M images from 10k different species.

27 papers0 benchmarksImages

RadarScenes

RadarScenes is a real-world radar point cloud dataset for automotive applications.

27 papers0 benchmarksImages, Point cloud

BAIR Robot Pushing

Dataset of 64x64 images of a robot pushing objects on a table top. From Berkeley AI Research (BAIR).

27 papers18 benchmarks

CaseHOLD (Case Holdings On Legal Decisions)

CaseHOLD (Case Holdings On Legal Decisions) is a law dataset comprised of over 53,000+ multiple choice questions to identify the relevant holding of a cited case. This dataset presents a fundamental task to lawyers and is both legally meaningful and difficult from an NLP perspective (F1 of 0.4 with a BiLSTM baseline). The citing context from the judicial decision serves as the prompt for the question. The answer choices are holding statements derived from citations following text in a legal decision. There are five answer choices for each citing text. The correct answer is the holding statement that corresponds to the citing text. The four incorrect answers are other holding statements.

27 papers3 benchmarksTexts

LHQ (Landscapes High-Quality)

A dataset of 90,000 high-resolution nature landscape images, crawled from Unsplash and Flickr and preprocessed with Mask R-CNN and Inception V3.

27 papers7 benchmarksImages

Nighttime Driving

Nighttime Driving is a dataset of road scenes consisting of 35,000 images ranging from daytime to twilight time and to nighttime.

27 papers3 benchmarksImages

GID (Gaofen Image Dataset)

Gaofen Image Dataset (GID) is a large-scale land-cover dataset constructed with Gaofen-2 (GF-2) satellite images. This dataset has superiorities over the existing land-cover dataset because of its large coverage, wide distribution, and high spatial resolution. It contains 150 GF-2 images annotated at the pixel level for 5 categories: built-up, farmland, forest, meadow, and water.

27 papers0 benchmarksImages

X4K1000FPS

Dataset of high-resolution (4096×2160), high-fps (1000fps) video frames with extreme motion. X-TEST consists of 15 video clips with 33-length of 4K-1000fps frames. X-TRAIN consists of 4,408 clips from various types of 110 scenes. The clips are 65-length of 1000fps frames

27 papers8 benchmarksVideos

ISPRS Potsdam (2D Semantic Labeling Contest - Potsdam)

The data set contains 38 patches (of the same size), each consisting of a true orthophoto (TOP) extracted from a larger TOP mosaic.

27 papers7 benchmarks

S2Looking

S2Looking is a building change detection dataset that contains large-scale side-looking satellite images captured at varying off-nadir angles. The S2Looking dataset consists of 5,000 registered bitemporal image pairs (size of 1024*1024, 0.5 ~ 0.8 m/pixel) of rural areas throughout the world and more than 65,920 annotated change instances. We provide two label maps to separately indicate the newly built and demolished building regions for each sample in the dataset. We establish a benchmark task based on this dataset, i.e., identifying the pixel-level building changes in the bi-temporal images.

27 papers7 benchmarksImages

UVO (Unidentified Video Objects: A Benchmark for Dense, Open-World Segmentation)

UVO is a new benchmark for open-world class-agnostic object segmentation in videos. Besides shifting the problem focus to the open-world setup, UVO is significantly larger, providing approximately 8 times more videos compared with DAVIS, and 7 times more mask (instance) annotations per video compared with YouTube-VOS and YouTube-VIS. UVO is also more challenging as it includes many videos with crowded scenes and complex background motions. Some highlights of the dataset include:

27 papers4 benchmarksImages, RGB Video, Videos

MedMNIST v2

MedMNIST v2 is a large-scale MNIST-like collection of standardized biomedical images, including 12 datasets for 2D and 6 datasets for 3D. All images are pre-processed into 28 x 28 (2D) or 28 x 28 x 28 (3D) with the corresponding classification labels, so that no background knowledge is required for users. Covering primary data modalities in biomedical images, MedMNIST v2 is designed to perform classification on lightweight 2D and 3D images with various data scales (from 100 to 100,000) and diverse tasks (binary/multi-class, ordinal regression and multi-label). The resulting dataset, consisting of 708,069 2D images and 10,214 3D images in total, could support numerous research / educational purposes in biomedical image analysis, computer vision and machine learning.

27 papers0 benchmarksImages

PreviousPage 86 of 1000Next