Explainable deep learning approach for multilabel classification of antimicrobial resistance with missing labels

Tharmakulasingam, Mukunthan and Gardner, Brian and Ragione, Roberto La and Fernando, Anil (2022) Explainable deep learning approach for multilabel classification of antimicrobial resistance with missing labels. IEEE Access, 10. pp. 113073-113085. ISSN 2169-3536 (https://doi.org/10.1109/access.2022.3216896)

Text. Filename: Tharmakulasingam_etal_IEEEA_2022_Explainable_deep_learning_approach_for_multilabel_classification_of_antimicrobial_resistance.pdf
Final Published Version
License: Creative Commons Attribution 4.0


Abstract

Predicting Antimicrobial Resistance (AMR) from genomic sequence data has become a significant component of overcoming the AMR challenge, especially given its potential for facilitating more rapid diagnostics and personalised antibiotic treatments. With recent advances in sequencing technologies and computing power, deep learning models for genomic sequence data have been widely adopted to predict AMR more reliably and with fewer errors. There are many different types of AMR; therefore, any practical AMR prediction system must be able to identify multiple AMRs present in a genomic sequence. Unfortunately, most genomic sequence datasets do not have all the labels marked, which makes a deep learning approach challenging because it relies on labels for reliability and accuracy. This paper addresses this issue by presenting an effective deep learning solution, Mask-Loss 1D convolution neural network (ML-ConvNet), for AMR prediction on datasets with many missing labels. The core component of ML-ConvNet is a masked loss function that overcomes the effect of missing labels in predicting AMR. The proposed ML-ConvNet is demonstrated to outperform state-of-the-art methods in the literature by 10.5% in F1 score. The proposed model’s performance is evaluated at different proportions of missing labels and is found to outperform the conventional approach by 76% in F1 score when 86.68% of labels are missing. Furthermore, ML-ConvNet is integrated with an explainable artificial intelligence (XAI) pipeline, making it well suited for hospital and healthcare settings, where model interpretability is an essential requirement.
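
The abstract does not give the form of the masked loss, so the following is only a minimal sketch of the general idea, not the authors' implementation. It assumes a binary cross-entropy per label, and it assumes that unknown labels are encoded with a hypothetical sentinel value of -1. The function name `masked_bce` and the example data are likewise illustrative. Entries whose label is missing are skipped, so they contribute nothing to the loss and the average is taken over observed labels only.

```python
import math

MISSING = -1  # hypothetical sentinel for "label not recorded" (an assumption)


def masked_bce(y_true, y_prob, missing=MISSING, eps=1e-7):
    """Mean binary cross-entropy over observed labels only.

    y_true: rows of labels, each 0, 1, or `missing`
    y_prob: rows of predicted probabilities, same shape as y_true
    """
    total = 0.0
    observed = 0
    for row_true, row_prob in zip(y_true, y_prob):
        for t, p in zip(row_true, row_prob):
            if t == missing:
                # Masked entry: no loss term, no contribution to the average.
                continue
            # Clamp the probability to keep the logarithms finite.
            p = min(max(p, eps), 1.0 - eps)
            total += -(t * math.log(p) + (1 - t) * math.log(1.0 - p))
            observed += 1
    # With no observed labels there is nothing to learn from, so return 0.
    return total / observed if observed else 0.0


# Two samples, three antibiotics each; half of the labels are unknown.
labels = [[1, MISSING, 0], [MISSING, MISSING, 1]]
probs = [[0.9, 0.5, 0.2], [0.3, 0.7, 0.6]]
loss = masked_bce(labels, probs)
print(round(loss, 4))
```

Only the three observed labels enter the average here; changing the predicted probabilities at the masked positions leaves the loss unchanged, which is the property a masked loss relies on when many labels are missing.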