Beyond tripeptides - two-step active machine learning for very large datasets

van Teijlingen, Alexander and Tuttle, Tell (2021) Beyond tripeptides - two-step active machine learning for very large datasets. Journal of Chemical Theory and Computation, 17 (5). pp. 3221-3232. ISSN 1549-9618 (https://doi.org/10.1021/acs.jctc.1c00159)

Text. Filename: Teijlingen_Tuttle_JCTC_2021_Beyond_tripeptides_two_step_active_machine_learning.pdf
Final Published Version

Abstract

Self-assembling peptide nanostructures have been shown to be of great importance in nature and have presented many promising applications, for example, in medicine as drug-delivery vehicles, biosensors, and antivirals. Because such peptides are very promising candidates for the growing field of bottom-up manufacture of functional nanomaterials, previous work (Frederix et al., 2011 and 2015) has screened all possible amino acid combinations for di- and tripeptides in search of such materials. However, the enormous complexity and variety of linear combinations of the 20 amino acids make exhaustive simulation of all combinations of tetrapeptides and above infeasible. Therefore, we have developed an active machine-learning method (also known as "iterative learning" and "evolutionary search method") which leverages a lower-resolution data set encompassing the whole search space and a just-in-time high-resolution data set which further analyzes those target peptides selected by the lower-resolution model. This model uses the data newly generated at each iteration to improve both the lower- and higher-resolution models in the search for ideal candidates. Curation of the lower-resolution data set is explored as a method to control the selected candidates, based on criteria such as log P. A major aim of this method is to produce the best results in the least computationally demanding way. This model has been developed to be broadly applicable to other search spaces with minor changes to the algorithm, allowing its use in other areas of research.
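
The loop described in the abstract can be illustrated with a minimal sketch. This is not the authors' implementation: the reduced alphabet, the scoring weights, the per-residue averaging surrogate, and every function name below are hypothetical stand-ins chosen only to show the shape of the idea. A cheap model scores the whole search space, the top-ranked unlabelled candidates are passed to an expensive evaluation, and the new results are fed back into the cheap model on the next iteration. For scale, the full tetrapeptide space over 20 amino acids contains 20^4 = 160,000 sequences; the sketch uses six letters to stay fast.

```python
import itertools
import random

# Hypothetical reduced alphabet so the demo runs quickly (6**4 = 1296 peptides).
ALPHABET = "AFGKDW"


def toy_expensive_score(peptide):
    """Stand-in for a costly high-resolution evaluation (made-up weights)."""
    weights = {"F": 1.0, "W": 0.8, "A": 0.2, "G": 0.1, "K": -0.3, "D": -0.5}
    return sum(weights[r] for r in peptide)


def fit_surrogate(labelled):
    """Cheap model: mean observed score of the peptides containing each residue."""
    totals, counts = {}, {}
    for peptide, score in labelled.items():
        for r in peptide:
            totals[r] = totals.get(r, 0.0) + score
            counts[r] = counts.get(r, 0) + 1
    global_mean = sum(labelled.values()) / len(labelled)
    per_residue = {r: totals[r] / counts[r] for r in totals}
    return per_residue, global_mean


def predict(model, peptide):
    """Average the per-residue means; unseen residues fall back to the global mean."""
    per_residue, global_mean = model
    return sum(per_residue.get(r, global_mean) for r in peptide) / len(peptide)


def active_search(length=4, n_iter=5, batch=10, seed=0):
    rng = random.Random(seed)
    space = ["".join(p) for p in itertools.product(ALPHABET, repeat=length)]
    # Seed the labelled set with a random batch of expensive evaluations.
    labelled = {p: toy_expensive_score(p) for p in rng.sample(space, batch)}
    for _ in range(n_iter):
        model = fit_surrogate(labelled)              # retrain on all data so far
        pool = [p for p in space if p not in labelled]
        pool.sort(key=lambda p: predict(model, p), reverse=True)
        for p in pool[:batch]:                       # evaluate only the top candidates
            labelled[p] = toy_expensive_score(p)
    return space, labelled


if __name__ == "__main__":
    space, labelled = active_search()
    best = max(labelled, key=labelled.get)
    print(f"evaluated {len(labelled)} of {len(space)} peptides; best found: {best}")
```

With the default arguments the sketch evaluates 60 of the 1296 candidates, which is the point of the approach: the expensive step is run only where the cheap model suggests it is worthwhile.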

Item metadata

Item type:	Article
ID code:	76132
Dates:	11 May 2021 (Published); 27 April 2021 (Published Online); 14 April 2021 (Accepted)
Subjects:	Science > Chemistry
Department:	Faculty of Science > Pure and Applied Chemistry; Technology and Innovation Centre > Bionanotechnology
Depositing user:	Pure Administrator
Date deposited:	15 Apr 2021 15:57
Last modified:	10 Oct 2024 00:28
URI:	https://strathprints.strath.ac.uk/id/eprint/76132