Machine learning augmented diagnostic testing to identify sources of variability in test performance

Banks, Christopher Jon and Sanchez, Aeron and Stewart, Vicki and Bowen, Kate and Doherty, Thomas and Tearne, Oliver and Smith, Graham and Kao, Rowland R. (2025) Machine learning augmented diagnostic testing to identify sources of variability in test performance. PLoS Computational Biology, 21 (11). e1013651. ISSN 1553-7358 (https://doi.org/10.1371/journal.pcbi.1013651)

Text. Filename: Banks-etal-PLOSCB-2025-Machine-learning-augmented-diagnostic-testing-to-identify-sources-of-variability.pdf
Final Published Version
License: Creative Commons Attribution 4.0

Abstract

Diagnostic tests that can detect pre-clinical or sub-clinical infection are among the most powerful tools in our armoury for controlling infectious diseases. Considerable effort has been devoted to improving diagnostic testing for human, plant and animal diseases, including strategies for targeting diagnostic tests towards individuals who are more likely to be infected. We use machine learning to assess the risk landscape in which a diagnostic test is applied, and use that assessment to augment the test's interpretation. We develop this approach to predict the occurrence of bovine tuberculosis incidents in cattle herds, exploiting the availability of exceptionally detailed testing records. We show that, without compromising test specificity, test sensitivity can be improved so that the proportion of infected herds detected rises by over 5 percentage points, equivalent to 240 additional infected herds detected in one year beyond those detected by the skin test alone. We also use feature importance testing to assess the weighting of risk factors. While many factors are associated with an increased risk of incidents, of particular note are several factors suggesting that in some herds there is a higher risk of infection going undetected.
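
As a rough illustration of what augmenting a test's interpretation can mean, the sketch below compares herd-level sensitivity and specificity for a skin test alone against the same test combined with a risk score. It is not the authors' model: the `Herd` record, the `risk_score` field, the 0.8 threshold and the eight toy herds are all hypothetical, chosen only to show how extra infected herds can be flagged without adding false positives.

```python
from dataclasses import dataclass


@dataclass
class Herd:
    infected: bool        # true status (hypothetical toy data)
    skin_positive: bool   # result of the skin test alone
    risk_score: float     # hypothetical model output in [0, 1]


def skin_only(herd: Herd) -> bool:
    """Classify a herd using the skin test result alone."""
    return herd.skin_positive


def augmented(herd: Herd, threshold: float = 0.8) -> bool:
    """Flag a herd if the skin test is positive OR the risk score is high.

    The threshold is an assumption made for this sketch only.
    """
    return herd.skin_positive or herd.risk_score >= threshold


def sensitivity(herds, classify) -> float:
    """Proportion of truly infected herds that are flagged."""
    infected = [h for h in herds if h.infected]
    return sum(classify(h) for h in infected) / len(infected)


def specificity(herds, classify) -> float:
    """Proportion of truly uninfected herds that are not flagged."""
    clear = [h for h in herds if not h.infected]
    return sum(not classify(h) for h in clear) / len(clear)


# Toy data: four infected and four uninfected herds (values are invented).
HERDS = [
    Herd(True, True, 0.90),
    Herd(True, True, 0.40),
    Herd(True, False, 0.85),   # missed by the skin test, caught by the score
    Herd(True, False, 0.30),
    Herd(False, False, 0.10),
    Herd(False, False, 0.50),
    Herd(False, False, 0.20),
    Herd(False, True, 0.30),   # skin-test false positive in both schemes
]

if __name__ == "__main__":
    for name, rule in [("skin only", skin_only), ("augmented", augmented)]:
        print(name, sensitivity(HERDS, rule), specificity(HERDS, rule))
```

In this toy set the augmented rule raises sensitivity while leaving specificity unchanged, because no uninfected herd has a risk score above the threshold. With real data the threshold would need to be chosen so that specificity is preserved.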