Using machine learning to classify test outcomes

Roper, Richard; (2019) Using machine learning to classify test outcomes. In: 2019 IEEE International Conference On Artificial Intelligence Testing (AITest). IEEE, Piscataway, N.J., pp. 99-100. ISBN 978-1-7281-0492-8

Accepted Author Manuscript


It has been shown that, when testing software, there are substantial benefits to be gained from approaches that exercise unusual or unexplored interactions with a system, such as random testing, fuzzing, and exploratory testing. However, such approaches have a drawback: the outputs of the tests need to be manually checked for correctness, which places a significant burden on the software engineer. This paper presents a strategy to support the process of identifying which tests have passed or failed by combining clustering and semi-supervised learning. We show that by using machine learning it is possible to cluster test cases in such a way that those corresponding to failures concentrate into smaller clusters. Examining the test outcomes in cluster-size order, smallest first, has the effect of prioritising the results: those checked early on have a much higher probability of being failing tests. As the software engineer examines the results (and confirms or refutes the initial classification), this information is used to bootstrap a secondary learner that further improves the accuracy of the classification of the as yet unchecked tests. Results from experiments with a range of systems demonstrate the substantial benefits that can be gained from this strategy, and show that remarkably accurate test output classifications can be derived from examining a relatively small proportion of the results.
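The workflow described in the abstract can be illustrated with a minimal sketch. It is not the paper's implementation. The abstract does not name the clustering algorithm, the feature encoding, or the secondary learner, so everything here is an assumption made for illustration: test outputs are taken to be already encoded as numeric vectors, a greedy "leader" clustering stands in for the clustering step, and a 1-nearest-neighbour rule stands in for the semi-supervised learner. The names `leader_cluster`, `inspection_order`, `predict_unchecked`, and the `radius` parameter are hypothetical.

```python
from math import dist


def leader_cluster(vectors, radius):
    """Greedy 'leader' clustering (an assumed stand-in, not the paper's method).

    A vector joins the first cluster whose leader lies within `radius`;
    otherwise it starts a new cluster and becomes that cluster's leader.
    Returns a list of clusters, each a list of test indices.
    """
    leaders, clusters = [], []
    for i, v in enumerate(vectors):
        for c, leader in enumerate(leaders):
            if dist(v, leader) <= radius:
                clusters[c].append(i)
                break
        else:
            leaders.append(v)
            clusters.append([i])
    return clusters


def inspection_order(clusters):
    """Flatten clusters smallest-first, so tests in small clusters are checked early."""
    return [i for cluster in sorted(clusters, key=len) for i in cluster]


def predict_unchecked(vectors, checked):
    """Label unchecked tests by their nearest checked neighbour.

    `checked` maps a test index to the engineer's verdict:
    True for a failing test, False for a passing one.
    """
    predictions = {}
    for i, v in enumerate(vectors):
        if i in checked:
            continue
        nearest = min(checked, key=lambda j: dist(v, vectors[j]))
        predictions[i] = checked[nearest]
    return predictions


# Toy data (invented): four similar outputs and two unusual ones.
outputs = [(0.0, 0.0), (0.1, 0.0), (0.0, 0.1), (0.1, 0.1), (5.0, 5.0), (5.1, 5.0)]

clusters = leader_cluster(outputs, radius=1.0)
order = inspection_order(clusters)

# Suppose the engineer checks the first and last tests in the ordering
# and records one failure and one pass (hypothetical verdicts).
checked = {order[0]: True, order[-1]: False}
predictions = predict_unchecked(outputs, checked)

print(order)
print(predictions)
```

In this toy run the two unusual outputs form the smaller cluster and so come first in the inspection order. The two recorded verdicts are then enough to label the remaining four tests. A real setting would need a principled output encoding and a stronger learner, and the quality of the result would depend on both.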