19,997 machine learning datasets
19,997 dataset results
EPIC-SOUNDS is a large scale dataset of audio annotations capturing temporal extents and class labels within the audio stream of the egocentric videos from EPIC-KITCHENS-100. EPIC-SOUNDS includes 78.4k categorised and 39.2k non-categorised segments of audible events and actions, distributed across 44 classes.
The COCO-MLT is created from MS COCO-2017, containing 1,909 images from 80 classes. The maximum of training number per class is 1,128 and the minimum is 6. We use the test set of COCO2017 with 5,000 for evaluation. The ratio of head, medium, and tail classes is 22:33:25 in COCO-MLT.
We construct the long-tailed version of VOC from its 2012 train-val set. It contains 1,142 images from 20 classes, with a maximum of 775 images per class and a minimum of 4 images per class. The ratio of head, medium, and tail classes after splitting is 6:6:8. We evaluate the performance on VOC2007 test set with 4952 images.
RarePlanes is a unique open-source machine learning dataset from CosmiQ Works and AI.Reverie that incorporates both real and synthetically generated satellite imagery. The RarePlanes dataset specifically focuses on the value of AI.Reverie synthetic data to aid computer vision algorithms in their ability to automatically detect aircraft and their attributes in satellite imagery. Although other synthetic/real combination datasets exist, RarePlanes is the largest openly-available very-high resolution dataset built to test the value of synthetic data from an overhead perspective. Previous research has shown that synthetic data can reduce the amount of real training data needed and potentially improve performance for many tasks in the computer vision domain. The real portion of the dataset consists of 253 Maxar WorldView-3 satellite scenes spanning 112 locations and 2,142 km^2 with 14,700 hand-annotated aircraft. The accompanying synthetic dataset is generated via AI.Reverie’s novel simulat
Taxi flow data of New York City with grid 20x10.
CIRCLE is a dataset containing 10 hours of full-body reaching motion from 5 subjects across nine scenes, paired with ego-centric information of the environment represented in various forms, such as RGBD videos.
SemanticSTF is an adverse-weather point cloud dataset that provides dense point-level annotations and allows to study 3DSS under various adverse weather conditions. It contains 2,076 scans captured by a Velodyne HDL64 S3D LiDAR sensor from STF that cover various adverse weather conditions including 694 snowy, 637 dense-foggy, 631 light-foggy, and 114 rainy (all rainy LiDAR scans in STF).
Social Media User Sentiment Analysis Dataset. Each user comments are labeled with either positive (1), negative (2), or neutral (0).
MasakhaNEWS is a benchmark dataset for news topic classification covering 16 languages widely spoken in Africa.
Dataset used by ProPublica to assess and analyse the fairness of the COMPAS (Correctional Offender Management Profiling for Alternative Sanctions) software.
The ACNE04 dataset includes 3756 Chinese face images with Acne. The ACNE04 dataset includes the annotations of local lesion numbers and global acne severity based on Hayashi Criterion.
CommitPack is is a 4TB dataset of commits scraped from GitHub repositories that are permissively licensed.
GMOT-40 is the first public dense dataset for Generic Multiple Object Tracking (GMOT). It contains 40 carefully annotated sequences evenly distributed among 10 object categories. Beyond the data, a challenging protocal, one-shot GMOT, is adopted and a series of baseline algorithms is introduced. GMOT-40 is featured in
We managede to collect a real-world rain dataset, named RainDS, includinnumerousus image pairs in various lighting conditions and different scenes. Each pair contains four images: a rain streak image, a raindrop image, and an image including both types of rain, as well as their rain-free counterparts.
TBD
The first large-scale pose dataset containing objects of multiple super-categories, termed Multi-category Pose (MP-100). In total, MP-100 dataset covers 100 subcategories and 8 super-categories. Over 18K images and 20K annotations are collected from several popular 2D pose datasets, including COCO, 300W, AFLW, OneHand10K, DeepFasion2, AP-10K, MacaquePose, Vinegar Fly, Desert Locust, CUB-200, CarFusion, AnimalWeb and Keypoint-5
Large Multimodal Models (LMMs) such as GPT-4V and LLaVA have shown remarkable capabilities in visual reasoning with common image styles. However, their robustness against diverse style shifts, crucial for practical applications, remains largely unexplored. In this paper, we propose a new benchmark, BenchLMM, to assess the robustness of LMMs against three different styles: artistic image style, imaging sensor style, and application style, where each style has five sub-styles. Utilizing BenchLMM, we comprehensively evaluate state-of-the-art LMMs and reveal: 1) LMMs generally suffer performance degradation when working with other styles; 2) An LMM performs better than another model in common style does not guarantee its superior performance in other styles; 3) LMMs' reasoning capability can be enhanced by prompting LMMs to predict the style first, based on which we propose a versatile and training-free method for improving LMMs; 4) An intelligent LMM is expected to interpret the causes of
Urban-scene detection dataset that consists of five different weather conditions: daytime-sunny, night-sunny, dusk-rainy, daytime-foggy, and night-rainy. The images are collected from diverse weather datasets: Cityscapes, BDD-100k, FoggyCityscapes, and Adverse-Weather. The dataset is used to evaulate the model performance on the single-domain generalized object detection (Single-DGOD).
ParsiNLU is a comprehensive suite of high-level Natural Language Processing (NLP) tasks for the Persian language. It includes six key NLP tasks:
OPUS is a growing collection of translated texts from the web. In the OPUS project we try to convert and align free online data, to add linguistic annotation, and to provide the community with a publicly available parallel corpus. OPUS is based on open source products and the corpus is also delivered as an open content package. We used several tools to compile the current collection. All pre-processing is done automatically. No manual corrections have been carried out.