Human in the loop active learning for time-series electrical measurement data

Sobot, Tamara and Stankovic, Vladimir and Stankovic, Lina (2024) Human in the loop active learning for time-series electrical measurement data. Engineering Applications of Artificial Intelligence, 133 (Part F). 108589. ISSN 0952-1976 (https://doi.org/10.1016/j.engappai.2024.108589)

Text. Filename: Sobot-etal-EAAI-2024-Human-in-the-loop-active-learning-for-time-series-electrical-measurement-data.pdf
Final Published Version
License: Creative Commons Attribution 4.0


Abstract

Advanced machine learning algorithms require large datasets with good-quality labels to reach state-of-the-art performance. Although measurements themselves are often easily available, the labelling process is usually a bottleneck. To address this, active learning approaches exploit the fact that different samples provide varying levels of information to the algorithm. However, these approaches often rely on unrealistic assumptions: an oracle is assumed to provide error-free labels, all at the same cost and effort. We propose novel active learning-based methods for classification of time-series measurements, typically obtained from sensors continuously measuring highly fluctuating environmental conditions, including electricity consumption, and demonstrate their effectiveness for home energy management applications, where data labelling is a challenge. A new acquisition function is proposed, which accounts for model uncertainty, labelling uncertainty, and class balancing. A stopping criterion is designed to stop the active learning process once an optimal point is reached, to reduce labelling effort. We assess the effect of labelling errors on classification performance and propose two ways of mitigating them: (i) a re-labelling mechanism based on similarity of provided labels; (ii) a revised loss function based on confidence levels provided by experts. We validate our contributions on the energy disaggregation task in a real-world scenario with three application domain experts. Our results show that the proposed methodology significantly improves the performance of algorithms transferred to unseen domains with a reduced number of labelled samples, from a 61% reduction for the dishwasher to a 93% reduction for the kettle.
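The abstract does not give the form of the acquisition function, so the following is only a generic illustration of the idea of combining model uncertainty with class balancing when choosing which samples to send for labelling. It is not the authors' method: the entropy score, the inverse-count balancing weight, and the names `entropy` and `select_queries` are all assumptions made for this sketch, and labelling uncertainty is left out entirely.

```python
import numpy as np


def entropy(probs):
    """Shannon entropy of each row of predicted class probabilities."""
    p = np.clip(probs, 1e-12, 1.0)  # avoid log(0)
    return -(p * np.log(p)).sum(axis=1)


def select_queries(probs, labelled_counts, k):
    """Return indices of the k unlabelled samples with the highest score.

    probs: (n_samples, n_classes) model outputs for unlabelled samples.
    labelled_counts: (n_classes,) number of labels already collected per class.
    """
    probs = np.asarray(probs, dtype=float)
    counts = np.asarray(labelled_counts, dtype=float)
    # Assumed balancing weight: favour samples whose predicted class
    # has few labels so far (not taken from the paper).
    balance = 1.0 / (1.0 + counts[probs.argmax(axis=1)])
    scores = entropy(probs) * balance
    return np.argsort(-scores)[:k]
```

With equal label counts the function reduces to plain entropy-based uncertainty sampling; when two samples are equally uncertain, the one predicted to belong to the less-labelled class is queried first.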