UTSD
Unified Time Series Dataset
The Unified Time Series Dataset (UTSD) covers 7 domains and contains up to 1 billion time points, organized in hierarchical capacities to facilitate research on large models for time series. It is assembled from publicly accessible online data repositories and empirical data from real-world machine operations. We analyze each dataset in the collection in terms of stationarity and forecastability, which allows us to characterize the complexity of each dataset.
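To make the forecastability notion concrete, here is a minimal sketch. It assumes the common definition of forecastability as one minus the normalized entropy of the power spectrum. This README does not state which formula is used, so treat the function name `forecastability` and the exact normalization as illustrative assumptions, not as the UTSD implementation.

```python
import numpy as np

def forecastability(series):
    """One minus the normalized spectral entropy of a 1-D series.

    Assumed definition for illustration; the UTSD authors may compute
    this statistic differently.
    """
    x = np.asarray(series, dtype=float)
    x = x - x.mean()                      # center the series
    power = np.abs(np.fft.rfft(x)) ** 2   # periodogram (unnormalized)
    power = power[1:]                     # drop the zero-frequency bin
    total = power.sum()
    if len(power) < 2 or total == 0:
        return 1.0                        # constant or too-short series: treat as fully predictable
    p = power / total                     # spectrum as a probability distribution
    nonzero = p[p > 0]                    # skip empty bins to avoid log(0)
    entropy = -(nonzero * np.log(nonzero)).sum()
    return 1.0 - entropy / np.log(len(p))

# Synthetic examples: a pure sinusoid versus white noise.
rng = np.random.default_rng(0)
t = np.arange(1024)
sine = np.sin(2 * np.pi * t / 64)
noise = rng.standard_normal(1024)

print("sine :", forecastability(sine))
print("noise:", forecastability(noise))
```

Under this definition a series whose energy sits in a few frequencies scores close to 1, while white noise, whose energy is spread across all frequencies, scores close to 0.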
All datasets are classified into seven domains by source: Energy, Environment, Health, Internet of Things (IoT), Nature, Transportation, and Web, and they span diverse sampling frequencies. UTSD is constructed with hierarchical capacities, namely UTSD-1G, UTSD-2G, UTSD-4G, and UTSD-12G, where each smaller dataset is a subset of the larger ones. A larger subset offers greater data difficulty and diversity, allowing you to conduct detailed scaling experiments.