Missing data in wind farm time series : properties and effect on forecasts

Tawn, Rosemary and Browell, Jethro and Dinwoodie, Iain (2020) Missing data in wind farm time series : properties and effect on forecasts. Electric Power Systems Research, 189. 106640. ISSN 0378-7796 (https://doi.org/10.1016/j.epsr.2020.106640)

Accepted Author Manuscript
License: Creative Commons Attribution-NonCommercial-NoDerivatives 4.0

Missing or corrupt data is common in real-world datasets; this affects the estimation and operation of analytical models where completeness is assumed or required. Statistical wind power forecasts utilise recent turbine data as model inputs, and must therefore be robust to missing data. We find that wind power data is ‘missing not at random’, with the patterns of missing data also related to the forecast output. Approaches for dealing with this missing data in training and operation are proposed and evaluated through a case study, leading to a suggested forecasting methodology in the presence of missing data. In the training set, missing data was found to have a significant negative impact on performance if simply omitted, but this can be almost completely mitigated using multiple imputation. A greater increase in forecast errors is seen when input data is missing operationally, and retraining forecast models using the remaining inputs is found to be preferable to imputation.
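
The two strategies named in the abstract can be illustrated with a minimal sketch. It is not the authors' implementation: the linear model, the random-draw imputation, and every function name below are assumptions chosen for brevity. Training-time gaps are filled several times and the fitted coefficients are averaged (a simple form of multiple imputation), while an operational gap is handled by refitting on only the inputs that are actually available.

```python
import numpy as np

def fit_linear(X, y):
    """Ordinary least squares with an intercept; returns the coefficients."""
    A = np.column_stack([np.ones(len(X)), X])
    coef, *_ = np.linalg.lstsq(A, y, rcond=None)
    return coef

def predict_linear(coef, X):
    """Apply coefficients from fit_linear to a 2-D input array."""
    A = np.column_stack([np.ones(len(X)), X])
    return A @ coef

def multiply_impute(X, n_imputations=5, seed=0):
    """Return several completed copies of X. Each NaN is replaced by a random
    draw from the observed values of the same column (an assumed, deliberately
    simple stand-in for a proper imputation model)."""
    rng = np.random.default_rng(seed)
    copies = []
    for _ in range(n_imputations):
        Xc = X.copy()
        for j in range(X.shape[1]):
            missing = np.isnan(X[:, j])
            observed = X[~missing, j]
            Xc[missing, j] = rng.choice(observed, size=missing.sum())
        copies.append(Xc)
    return copies

def train_with_multiple_imputation(X, y, n_imputations=5, seed=0):
    """Fit one model per imputed data set and average the coefficients."""
    coefs = [fit_linear(Xc, y) for Xc in multiply_impute(X, n_imputations, seed)]
    return np.mean(coefs, axis=0)

def forecast_with_available_inputs(X_train, y_train, x_new):
    """Refit using only the inputs present in x_new, then forecast from them."""
    available = ~np.isnan(x_new)
    # Keep training rows that are complete for the available inputs.
    rows = ~np.isnan(X_train[:, available]).any(axis=1)
    coef = fit_linear(X_train[rows][:, available], y_train[rows])
    return predict_linear(coef, x_new[available][None, :])[0]
```

Here `X` is assumed to be a two-dimensional array of recent turbine measurements with `NaN` marking gaps, and `y` the power to be forecast. Drawing replacements from the marginal distribution ignores the finding that the data is missing not at random, so a realistic imputation model would need to condition on the other inputs and on the missingness pattern itself.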