A chemometric study of chromatograms of tea extracts by correlation optimization warping in conjunction with PCA, support vector machines and random forest data modeling

Zheng, L. and Watson, D.G. and Johnston, B.F. and Clark, Rachael L. and Edrada-Ebel, Ruangelie and Elseheri, W. (2009) A chemometric study of chromatograms of tea extracts by correlation optimization warping in conjunction with PCA, support vector machines and random forest data modeling. Analytica Chimica Acta, 624 (1-2). pp. 257-265. ISSN 0003-2670

Full text not available in this repository. Request a copy from the Strathclyde author

Abstract

A reversed-phase high-performance liquid chromatography (HPLC) separation was established for profiling water-soluble compounds in extracts from tea. Whole chromatograms were pre-processed by techniques including baseline correction, binning and normalisation. In addition, peak alignment by correction of retention time shifts was performed using correlation optimization warping (COW), producing a correlation score of 0.96. To extract the chemically relevant information from the data, a variety of chemometric approaches were employed. Principal component analysis (PCA) was used to group the tea samples according to their chromatographic differences. Three principal components (PCs) described 78% of the total variance after peak alignment (64% before), and analysis of the score and loading plots provided insight into the main chemical differences between the samples. Finally, PCA, support vector machines (SVMs) and random forest (RF) machine learning methods were evaluated comparatively on their ability to predict unknown tea samples using models constructed from a predetermined training set. The best predictions of identity were obtained by using RF.
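
As an illustration of the pre-processing steps named in the abstract (baseline correction, binning and normalisation), the following is a minimal Python sketch and not the authors' implementation. The abstract does not state which baseline method, bin width or normalisation was used, so the constant-offset baseline, the bin width of 2 and the unit-area normalisation below are assumptions, as are all function names and the toy intensity values.

```python
from typing import List


def subtract_baseline(signal: List[float]) -> List[float]:
    """Crude baseline correction: subtract the minimum intensity.

    Assumption: a constant offset stands in for whatever baseline
    method was actually used, which the abstract does not specify.
    """
    floor = min(signal)
    return [x - floor for x in signal]


def bin_chromatogram(signal: List[float], bin_width: int) -> List[float]:
    """Sum consecutive points into bins; the last bin may be shorter."""
    if bin_width < 1:
        raise ValueError("bin_width must be at least 1")
    return [sum(signal[i:i + bin_width])
            for i in range(0, len(signal), bin_width)]


def normalise_to_unit_area(signal: List[float]) -> List[float]:
    """Scale intensities so that they sum to 1 (assumed normalisation)."""
    total = sum(signal)
    if total == 0:
        raise ValueError("cannot normalise a chromatogram with zero total intensity")
    return [x / total for x in signal]


# Toy chromatogram (hypothetical values, not data from the study).
raw = [2.0, 2.0, 4.0, 10.0, 4.0, 2.0]
corrected = subtract_baseline(raw)
binned = bin_chromatogram(corrected, bin_width=2)
profile = normalise_to_unit_area(binned)
```

Each pre-processed profile would then form one row of the data matrix passed to PCA or to a classifier such as an SVM or a random forest.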