Wei Cao, Dong Wang, Jian Li, Hao Zhou, Lei Li, Yitan Li
Time series are widely used as signals in many classification and regression tasks, and they commonly contain many missing values. Given multiple correlated time series, how can we fill in the missing values and predict their class labels? Existing imputation methods often impose strong assumptions on the underlying data-generating process, such as linear dynamics in the state space. In this paper, we propose BRITS, a novel method based on recurrent neural networks for missing value imputation in time series data. Our method directly learns the missing values in a bidirectional recurrent dynamical system, without any specific assumption. The imputed values are treated as variables of the RNN graph and can be effectively updated during backpropagation. BRITS has three advantages: (a) it can handle multiple correlated missing values in time series; (b) it generalizes to time series with underlying nonlinear dynamics; (c) it provides a data-driven imputation procedure and applies to general settings with missing data. We evaluate our model on three real-world datasets: an air quality dataset, a health-care dataset, and a localization dataset for human activity. Experiments show that our model outperforms state-of-the-art methods in both imputation and classification/regression accuracy.
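
To make the bidirectional idea concrete, the following is a minimal sketch and not the BRITS model itself. It assumes a single univariate series with `None` marking missing steps, and it replaces the learned RNN with a trivial "carry the last value" recurrence, so nothing is trained here. What it does share with the description above is the structure: each direction feeds its own estimate back as input at missing steps, and the two directions are then combined. The function names `impute_one_direction` and `impute_bidirectional` are hypothetical.

```python
from typing import List, Optional

Series = List[Optional[float]]


def impute_one_direction(series: Series) -> Series:
    """Scan the series once, left to right.

    At an observed step the running state is reset to the observation.
    At a missing step the running state (observed or previously imputed)
    is emitted as the estimate and carried forward unchanged.
    This fixed rule is a stand-in for a learned recurrent update.
    """
    estimates: Series = []
    state: Optional[float] = None  # nothing is known before the first observation
    for x in series:
        if x is not None:
            state = x
        estimates.append(state)
    return estimates


def impute_bidirectional(series: Series) -> Series:
    """Combine a forward scan and a backward scan of the same series."""
    forward = impute_one_direction(series)
    backward = impute_one_direction(series[::-1])[::-1]

    filled: Series = []
    for x, f, b in zip(series, forward, backward):
        if x is not None:
            filled.append(x)  # observed values are kept as they are
            continue
        # Average whichever directional estimates exist at this step.
        candidates = [v for v in (f, b) if v is not None]
        filled.append(sum(candidates) / len(candidates) if candidates else None)
    return filled


if __name__ == "__main__":
    print(impute_bidirectional([1.0, None, None, 4.0]))
```

In this toy version a gap between 1.0 and 4.0 is filled with their average from both sides. A learned recurrent model would instead produce step-dependent estimates and would adjust them through the training loss, which this sketch does not attempt.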
| Task | Dataset | Metric | Value | Model |
|---|---|---|---|---|
| Imputation | UCI localization data | MAE (10% missing) | 0.219 | BRITS |
| Imputation | PhysioNet Challenge 2012 | MAE (10% of data as GT) | 0.281 | BRITS |
| Imputation | PEMS-SF | L2 Loss (10^−4) | 4.51 | BRITS (SingleRes) |
| Imputation | Basketball Players Movement | OOB Rate (10^−3) | 3.874 | BRITS (SingleRes) |
| Imputation | Basketball Players Movement | Path Difference | 0.571 | BRITS (SingleRes) |
| Imputation | Basketball Players Movement | Path Length | 0.702 | BRITS (SingleRes) |
| Imputation | Basketball Players Movement | Player Distance | 0.417 | BRITS (SingleRes) |
| Imputation | Basketball Players Movement | Step Change (10^−3) | 4.811 | BRITS (SingleRes) |
| Imputation | Beijing Multi-Site Air-Quality Dataset | MAE (PM2.5) | 11.56 | BRITS |
| Time Series Forecasting | USHCN-Daily | MSE | 0.53 | BRITS |
| Traffic Prediction | METR-LA Point Missing | MAE | 2.34 | BRITS |
| Traffic Prediction | PEMS-BAY Point Missing | MAE | 1.47 | BRITS |
| Feature Engineering | UCI localization data | MAE (10% missing) | 0.219 | BRITS |
| Feature Engineering | PhysioNet Challenge 2012 | MAE (10% of data as GT) | 0.281 | BRITS |
| Feature Engineering | PEMS-SF | L2 Loss (10^−4) | 4.51 | BRITS (SingleRes) |
| Feature Engineering | Basketball Players Movement | OOB Rate (10^−3) | 3.874 | BRITS (SingleRes) |
| Feature Engineering | Basketball Players Movement | Path Difference | 0.571 | BRITS (SingleRes) |
| Feature Engineering | Basketball Players Movement | Path Length | 0.702 | BRITS (SingleRes) |
| Feature Engineering | Basketball Players Movement | Player Distance | 0.417 | BRITS (SingleRes) |
| Feature Engineering | Basketball Players Movement | Step Change (10^−3) | 4.811 | BRITS (SingleRes) |
| Feature Engineering | Beijing Multi-Site Air-Quality Dataset | MAE (PM2.5) | 11.56 | BRITS |
| Time Series Analysis | USHCN-Daily | MSE | 0.53 | BRITS |
| Multivariate Time Series Forecasting | USHCN-Daily | MSE | 0.53 | BRITS |
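
Several rows above report MAE on a held-out fraction of the data (for example "10% missing" or "10% of data as GT"). The sketch below shows one common way such a number can be computed: hide a random fraction of the observed values, impute, and score only the hidden positions. It is an illustration under assumptions, not the evaluation protocol of any dataset in the table; the helper names `hold_out` and `masked_mae`, the seed, and the use of `None` for missing values are all hypothetical.

```python
import random
from typing import List, Optional, Tuple

Series = List[Optional[float]]


def hold_out(series: Series, fraction: float, seed: int = 0) -> Tuple[Series, List[bool]]:
    """Hide a random fraction of the observed values.

    Returns the corrupted series and a mask that is True exactly at the
    positions that were hidden, so they can serve as ground truth later.
    """
    rng = random.Random(seed)
    observed = [i for i, x in enumerate(series) if x is not None]
    # Hide at least one value when possible, never more than are observed.
    k = min(len(observed), max(1, round(fraction * len(observed))))
    held = set(rng.sample(observed, k))
    corrupted = [None if i in held else x for i, x in enumerate(series)]
    mask = [i in held for i in range(len(series))]
    return corrupted, mask


def masked_mae(truth: Series, imputed: Series, eval_mask: List[bool]) -> float:
    """Mean absolute error over the held-out positions only."""
    errors = [abs(t - p) for t, p, m in zip(truth, imputed, eval_mask) if m]
    if not errors:
        raise ValueError("no held-out positions to evaluate")
    return sum(errors) / len(errors)


if __name__ == "__main__":
    truth = [1.0, 2.0, 3.0, 4.0]
    imputed = [1.0, 2.5, 3.0, 3.0]
    mask = [False, True, False, True]
    print(masked_mae(truth, imputed, mask))  # (0.5 + 1.0) / 2 = 0.75
```

Scoring only the hidden positions matters because the originally missing entries have no ground truth, and the untouched observed entries would trivially contribute zero error.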