Datasets

148 machine learning datasets

RLU (RL Unplugged)

RL Unplugged is a suite of benchmarks for offline reinforcement learning. It is designed around ease of use: the datasets are provided with a unified API, which makes it easy for the practitioner to work with all data in the suite once a general pipeline has been established. This dataset accompanies the paper RL Unplugged: Benchmarks for Offline Reinforcement Learning.

2 papers | 0 benchmarks | Actions, Environment, Images, Physics, RGB Video, Replay data

CinemAirSim

CinemAirSim extends the well-known drone simulator AirSim with a cinematic camera and an extended API that controls all of the camera's parameters in real time, including various filming lenses and common cinematographic properties.

2 papers | 0 benchmarks | Environment

TI1K Dataset (Thumb Index 1000 Hand & Fingertip Detection Dataset)

Thumb Index 1000 (TI1K) is a dataset of 1000 hand images with the hand bounding box, and thumb and index fingertip positions. The dataset includes the natural movement of the thumb and index fingers making it suitable for mixed reality (MR) applications.

2 papers | 0 benchmarks | Actions, Environment, Images, RGB Video

Shadow Accrual Maps

Large-scale shadows from buildings in a city play an important role in determining the environmental quality of public spaces. They can be both beneficial, such as for pedestrians during summer, and detrimental, by impacting vegetation and by blocking direct sunlight. Determining the effects of shadows requires accumulating them over time across different periods in a year. In our paper Shadow Accrual Maps: Efficient Accumulation of City-Scale Shadows over Time, we present a simple yet efficient class of approaches that uses the properties of sun movement to track the changing position of shadows within a fixed time interval. This repository presents the computed shadow information for New York City, Chicago, Los Angeles, Boston and Washington DC.

2 papers | 0 benchmarks | 3D, Environment, Images

TOAD-GAN

A procedurally generated jump'n'run game with control over level similarity.

2 papers | 0 benchmarks | Environment

RNADesign

An environment for RNA design given structure constraints with structures from different datasets to choose from.

2 papers | 0 benchmarks | Environment

Multirotor-Gym

Multirotor gym environment for learning control policies for various unmanned aerial vehicles.

2 papers | 0 benchmarks | Environment, Physics

SuperCaustics

SuperCaustics is a simulation tool made in Unreal Engine for generating massive computer vision datasets that include transparent objects.

2 papers | 0 benchmarks | Environment, Images, Interactive, LiDAR, Physics, RGB Video, RGB-D

Cyclone Data (global cyclone data from 1841 to 2021)

Archive of Global Tropical Cyclone Tracks from 1980 to May 2019.

2 papers | 0 benchmarks | Environment

HeriGraph (Multimodal Machine Learning Datasets on Graphs of Heritage Values and Attributes)

The dataset contains constructed multi-modal features (visual and textual), pseudo-labels (on heritage values and attributes), and graph structures (with temporal, social, and spatial links) built from User-Generated Content collected from the Flickr social media platform in three global cities containing UNESCO World Heritage properties (Amsterdam, Suzhou, Venice). The data collection is motivated by the aim of providing datasets that are both directly applicable as a test bed for ML communities and theoretically informative for heritage and urban scholars, who can draw conclusions from them for planning decision-making.

2 papers | 0 benchmarks | Environment, Graphs, Images, Texts

IBISCape

A Simulated Benchmark for multi-modal SLAM Systems Evaluation in Large-scale Dynamic Environments.

2 papers | 0 benchmarks | Environment, Images, Point cloud, RGB Video, RGB-D, Stereo, Videos

The Game of 2048

The 2048 game task involves training an agent to achieve high scores in the game 2048.

2 papers | 1 benchmark | Environment

bipedal-skills (Bipedal Skills Benchmark for Reinforcement Learning)

The bipedal skills benchmark is a suite of reinforcement learning environments implemented for the MuJoCo physics simulator. It aims to provide a set of tasks that demand a variety of motor skills beyond locomotion, and is intended for evaluating skill discovery and hierarchical learning methods. The majority of tasks exhibit a sparse reward structure.

2 papers | 0 benchmarks | Environment, Interactive

xView3-SAR

Unsustainable fishing practices worldwide pose a major threat to marine resources and ecosystems. Identifying vessels that do not show up in conventional monitoring systems, known as "dark vessels", is key to managing and securing the health of marine environments. With the rise of satellite-based synthetic aperture radar (SAR) imaging and modern machine learning (ML), it is now possible to automate detection of dark vessels day or night, under all-weather conditions. SAR images, however, require a domain-specific treatment and are not widely accessible to the ML community. Maritime objects (vessels and offshore infrastructure) are relatively small and sparse, challenging traditional computer vision approaches. We present the largest labeled dataset for training ML models to detect and characterize vessels and ocean structures in SAR imagery. xView3-SAR consists of nearly 1,000 analysis-ready SAR images from the Sentinel-1 mission that are, on average, 29,400-by-24,400 pixels each.

2 papers | 1 benchmark | Environment, Images

CaFFe (CAlving Fronts and where to Find thEm)

The temporal variability in calving front positions of marine-terminating glaciers permits inference on the frontal ablation. Frontal ablation, the sum of the calving rate and the melt rate at the terminus, significantly contributes to the mass balance of glaciers. Therefore, the glacier area has been declared an Essential Climate Variable product by the World Meteorological Organization. The presented dataset provides the necessary information for training deep learning techniques to automate the process of calving front delineation. The dataset includes Synthetic Aperture Radar (SAR) images of seven glaciers distributed around the globe. Five of them are located in Antarctica: Crane, Dinsmoor-Bombardier-Edgeworth, Mapple, Jorum and the Sjögren-Inlet Glacier. The remaining glaciers are the Jakobshavn Isbrae Glacier in Greenland and the Columbia Glacier in Alaska. Several images were taken for each glacier, forming a time series. The time series lie in the time span between 1995 an

2 papers | 1 benchmark | Environment, Images, Time series

Wastewater catchment areas in Great Britain

Wastewater catchment area data are essential for wastewater treatment capacity planning and have recently become critical for operationalising wastewater-based epidemiology (WBE) for COVID-19. Owing to the privatised nature of the water industry in the United Kingdom, the required catchment area datasets are not readily available to researchers. Here, we present a consolidated dataset of 7,537 catchment areas from ten sewerage service providers in Great Britain, covering more than 96% of the population of England and Wales.

2 papers | 0 benchmarks | Environment

ClimateIQA

The dataset was created to address the crucial need for effective Extreme Weather Events Detection (EWED), an increasingly urgent task due to the rising frequency of such events driven by global warming. Traditional methods for EWED rely on numerical threshold setting and the analysis of weather anomaly heatmaps, visualizing data such as temperature, wind speed, and precipitation. However, these methods often involve manual work and can be time-consuming and error-prone. While advances in AI have led to the development of machine learning models like Convolutional Neural Networks (CNNs) for weather prediction and EWED, these models predominantly use numeric data and often yield low accuracy. Moreover, despite the proficiency of Large Language Models (LLMs) in generating textual weather reports, they struggle with interpreting visual data—crucial for EWED. General Vision-Language Models (VLMs) also face challenges in accurately interpreting meteorological heatmaps, commonly misidentifyi

2 papers | 0 benchmarks | Environment, Images, Texts

MMVR (Millimeter-wave Multi-View Radar (MMVR) Dataset)

2 papers | 0 benchmarks | Environment

C2A: Human Detection in Disaster Scenarios (Combination to Application)

This repository contains the code and information for the C2A (Combination to Application) dataset, which accompanies the paper "UAV-Enhanced Combination to Application: Comprehensive Analysis and Benchmarking of a Human Detection Dataset for Disaster Scenarios" by Ragib Amin Nihal, Benjamin Yen, Katsutoshi Itoyama, and Kazuhiro Nakadai.

2 papers | 5 benchmarks | Environment, Images

BASEPROD (The Bardenas Semi-Desert Planetary Rover Dataset)

BASEPROD provides comprehensive rover sensor data collected over a 1.7 km traverse, accompanied by high-resolution 2D and 3D drone maps of the terrain. The dataset also includes laser-induced breakdown spectroscopy (LIBS) measurements from key sampling sites along the rover's path, as well as weather station data to contextualize environmental conditions.

2 papers | 0 benchmarks | 3D, Environment, Images, Point cloud, RGB-D, Stereo, Tabular, Time series
