Machine learning derived correlations for scale-up and technology transfer of primary nucleation kinetics

Yerdelen, Stephanie and Yang, Yihui and Quon, Justin L. and Papageorgiou, Charles D. and Mitchell, Chris and Houson, Ian and Sefcik, Jan and ter Horst, Joop H. and Florence, Alastair J and Brown, Cameron J. (2023) Machine learning derived correlations for scale-up and technology transfer of primary nucleation kinetics. Crystal Growth and Design. ISSN 1528-7483

Final Published Version
License: Creative Commons Attribution 4.0



Scaling up and technology transfer of crystallization processes have been, and continue to be, a challenge. This is often due to the stochastic nature of primary nucleation, various scale dependencies of nucleation mechanisms, and the multitude of scale-up approaches. To better understand these dependencies, a series of isothermal induction time studies was performed across a range of vessel volumes, impeller types, and impeller speeds. From these measurements, nucleation rate and growth time were estimated as parameters of an induction time distribution model. Machine learning techniques were then used to analyze correlations between the vessel hydrodynamic features, calculated from CFD simulations, and the nucleation kinetic parameters. Of the 18 machine learning models trained, two models for nucleation rate were found to have the best performance (in terms of the percentage of predictions within experimental variance): a non-linear Random Forest model and a non-linear Gradient Boosting model. For growth time, a non-linear Gradient Boosting model was found to outperform the other models tested. These models were then ensembled to directly predict the probability of nucleation, at a given time, solely from hydrodynamic features, with an overall RMSE of 0.16. This work shows how machine learning approaches can be used to analyze limited datasets of induction times and to provide insight into which hydrodynamic parameters should be considered in the scale-up of an unseeded crystallization process.
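
The abstract does not state the form of the induction time distribution model. As a minimal sketch, assuming a commonly used form in which nucleation is treated as a Poisson process and detection is delayed by a fixed growth time, the probability of having detected nucleation by time t can be written as P(t) = 1 - exp(-J V (t - t_g)) for t >= t_g, and 0 otherwise. The function names, the parameter values, and the formula itself are assumptions for illustration and are not taken from the paper; the RMSE helper only shows how an error such as the reported 0.16 would be computed between predicted and measured probabilities.

```python
import math


def nucleation_probability(t, rate, volume, growth_time):
    """Cumulative probability that nucleation has been detected by time t.

    Assumed model (not confirmed by the abstract):
        P(t) = 1 - exp(-J * V * (t - t_g))  for t >= t_g, else 0
    rate        : nucleation rate J, in 1 / (m^3 s)
    volume      : vessel volume V, in m^3
    growth_time : growth time t_g, in s
    """
    if t <= growth_time:
        return 0.0
    return 1.0 - math.exp(-rate * volume * (t - growth_time))


def rmse(predicted, observed):
    """Root-mean-square error between two equal-length sequences."""
    if len(predicted) != len(observed) or len(predicted) == 0:
        raise ValueError("sequences must be non-empty and of equal length")
    squared = [(p - o) ** 2 for p, o in zip(predicted, observed)]
    return math.sqrt(sum(squared) / len(squared))


if __name__ == "__main__":
    # Hypothetical values chosen only to make the arithmetic easy to follow.
    J = 100.0      # 1 / (m^3 s)
    V = 1.0e-4     # m^3 (100 mL)
    t_g = 300.0    # s
    for t in (200.0, 400.0, 1000.0):
        print(t, round(nucleation_probability(t, J, V, t_g), 3))
```

With these hypothetical values J V = 0.01 per second, so the probability stays at zero until the growth time has elapsed and reaches 1 - exp(-1), about 0.63, one hundred seconds later. In the workflow the abstract describes, J and t_g would come from the trained models rather than being set by hand.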


Yerdelen, Stephanie, Yang, Yihui, Quon, Justin L., Papageorgiou, Charles D., Mitchell, Chris, Houson, Ian, Sefcik, Jan, ter Horst, Joop H., Florence, Alastair J and Brown, Cameron J.