Recurrent neural networks for time domain modelling of FTIR spectra: application to brain tumour detection

Antoniou, Georgios and Conn, Justin J. A. and Smith, Benjamin R. and Brennan, Paul M. and Baker, Matthew J. and Palmer, David S. (2023) Recurrent neural networks for time domain modelling of FTIR spectra: application to brain tumour detection. Analyst, 148 (8). pp. 1770-1776. ISSN 0003-2654 (https://doi.org/10.1039/D2AN02041F)

Text. Filename: Antoniou_etal_Analyst_2023_Recurrent_neural_networks_for_time_domain_modelling_of_FTIR_spectra.pdf
Final Published Version
License: Creative Commons Attribution 3.0


Abstract

Attenuated total reflectance (ATR)-Fourier transform infrared (FTIR) spectroscopy combined with machine learning (ML) techniques is an emerging approach for the early detection of brain cancer in clinical practice. A crucial step in the acquisition of an IR spectrum is the transformation of the time domain signal from the biological sample to a frequency domain spectrum via a discrete Fourier transform. Further pre-processing of the spectrum is typically applied to reduce non-biological sample variance, and thus to improve subsequent analysis. However, the Fourier transformation is often assumed to be essential, even though modelling of time domain data is common in other fields. We apply an inverse Fourier transform to frequency domain data to map them to the time domain. We use the transformed data to develop deep learning models utilising Recurrent Neural Networks (RNNs) to differentiate between brain cancer and control patients in a cohort of 1438 patients. The best performing model achieves a mean cross-validated area under the receiver operating characteristic (ROC) curve (AUC) of 0.97, with sensitivity of 0.91 and specificity of 0.91. This is better than the optimal model trained on frequency domain data, which achieves an AUC of 0.93 with sensitivity of 0.85 and specificity of 0.85. A dataset comprising 385 patient samples, prospectively collected in the clinic, is used to test a model defined with the best performing configuration and fitted to the time domain data. Its classification accuracy is found to be comparable to the gold standard for this dataset, demonstrating that RNNs can accurately classify disease states using spectroscopic data represented in the time domain.
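
The mapping from a frequency domain spectrum to the time domain described in the abstract can be illustrated with a minimal sketch. The abstract does not give the details of the transform used in the paper, so everything below is an assumption made for illustration: the function name spectrum_to_time_domain, the synthetic Gaussian band, the wavenumber grid, and the choice of NumPy's real inverse FFT (which treats the real-valued spectrum as the non-negative-frequency half of a Hermitian spectrum).

```python
import numpy as np


def spectrum_to_time_domain(absorbance: np.ndarray) -> np.ndarray:
    """Map a real-valued spectrum to a real time domain signal.

    Assumption: the spectrum is treated as the non-negative-frequency
    half of a Hermitian-symmetric spectrum, so the inverse transform is
    real. This is an illustrative choice, not the paper's stated method.
    """
    return np.fft.irfft(absorbance)


# Hypothetical synthetic spectrum: one Gaussian band plus a little noise.
rng = np.random.default_rng(0)
wavenumbers = np.linspace(1000.0, 1800.0, 401)  # illustrative grid, cm^-1
spectrum = np.exp(-((wavenumbers - 1650.0) / 30.0) ** 2)
spectrum += 0.01 * rng.standard_normal(wavenumbers.size)

# Time domain representation: 2 * (401 - 1) = 800 real samples.
signal = spectrum_to_time_domain(spectrum)

# A forward real FFT recovers the original spectrum, so no information
# is lost by moving to the time domain under this assumption.
recovered = np.fft.rfft(signal).real

# A recurrent model would consume the signal as a sequence of shape
# (time_steps, features); here each time step carries one feature.
sequence = signal.reshape(-1, 1)
```

Because the transform is invertible under this assumption, the time domain sequence carries the same information as the spectrum; what changes is the ordering and locality of features that a sequence model such as an RNN sees.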

ORCID iDs

Antoniou, Georgios, Conn, Justin J. A., Smith, Benjamin R., Brennan, Paul M., Baker, Matthew J. and Palmer, David S. ORCID: https://orcid.org/0000-0003-4356-9144