Multi-label classification of crystallisation outcomes from in-situ imaging

Sandhu, Parandeep and Boyle, Christopher and Tachtatzis, Christos and Cardona, Javier (2024) Multi-label classification of crystallisation outcomes from in-situ imaging. In: 2024 British Association for Crystal Growth Annual Conference, 2024-07-01 - 2024-07-03.

Text. Filename: Sandhu-etal-BACG-2024-Multi-label-classification-of-crystallisation-outcomes.pdf
Final Published Version
License: Strathprints license 1.0


Abstract

In chemical and pharmaceutical manufacturing, data-driven approaches for in-line monitoring of particle attributes are becoming more prevalent as a means to optimise process performance and ensure product quality. These approaches use process analytical technologies (PAT) and Artificial Intelligence (AI) to analyse and process data in real time, identifying patterns and trends that can be used to adjust the manufacturing process. By monitoring key attributes such as particle size, shape, and composition, potential issues can be detected and addressed before they result in defects in the final product, leading to reduced production costs, increased production yields and improved product quality. In addition, data-driven approaches can provide valuable insights into the root causes of process variability, enabling manufacturers to make targeted improvements and enhance traceability and reproducibility, which are essential for regulatory compliance and maintaining optimal quality control. In this study, we employed a Convolutional Neural Network (CNN) to categorise eleven distinct crystallisation outcomes, including crystal shapes such as needles, plates, elongated crystals, and blocks, as well as to detect images where an object is present but lacks a distinguishable shape (labelled as 'object present'). We also aimed to identify overly concentrated solutions and other events that can arise during crystallisation processes, such as unidentified floating objects (UFOs), agglomerated crystals, bubbles, and droplets. We devised a method to estimate the optimal threshold for each label, employing stratified k-fold cross-validation, which allowed us to evaluate the full dataset. This approach enabled us not only to assess the performance of our model across different subsets of the data but also to fine-tune the threshold values to maximise the F1-score of the model.
By iterating over a range of possible threshold values and evaluating their impact on the F1-score for each label, we were able to identify the optimal threshold setting for each label. This method ensured that our model's predictions were robust and reliable, offering a balanced trade-off between recall and precision, which is crucial in applications where both false positives and false negatives carry significant consequences. This paper examines the efficacy of a CNN model in integrating Process Analytical Technology (PAT) data with deep learning in manufacturing. The CNN model exhibited robust performance, with macro, micro and weighted-average scores surpassing the 80% mark across precision, recall, and F1-score metrics for the majority of labels, underscoring its potential utility in industrial applications. However, the classification performance for certain classes, such as UFOs and bubbles, was notably weaker because they are scarce in the dataset, a consequence of the sterile conditions of crystallisation processes, in which such events are kept to a minimum.
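
The per-label threshold search described in the abstract can be illustrated with a minimal sketch. This is not the authors' implementation: it assumes that out-of-fold predicted probabilities have already been pooled from the cross-validation folds, and the function names, the candidate grid (0.05 to 0.95 in steps of 0.05) and the toy data are all hypothetical choices made for illustration.

```python
# Illustrative sketch only: per-label threshold selection by F1-score.
# Assumption: Y_scores holds out-of-fold probabilities pooled across the
# cross-validation folds; names, grid and data below are hypothetical.

def f1_binary(y_true, y_pred):
    """F1-score for one label, given 0/1 ground truth and 0/1 predictions."""
    tp = sum(1 for t, p in zip(y_true, y_pred) if t == 1 and p == 1)
    fp = sum(1 for t, p in zip(y_true, y_pred) if t == 0 and p == 1)
    fn = sum(1 for t, p in zip(y_true, y_pred) if t == 1 and p == 0)
    denom = 2 * tp + fp + fn
    return 2 * tp / denom if denom else 0.0


def best_threshold(y_true, scores, candidates):
    """Return (threshold, F1) for the candidate with the highest F1-score.

    Ties are resolved in favour of the lowest candidate threshold.
    """
    best_t, best_f1 = candidates[0], -1.0
    for t in candidates:
        preds = [1 if s >= t else 0 for s in scores]
        f1 = f1_binary(y_true, preds)
        if f1 > best_f1:
            best_t, best_f1 = t, f1
    return best_t, best_f1


def per_label_thresholds(Y_true, Y_scores, candidates=None):
    """Search the threshold grid independently for every label (column)."""
    if candidates is None:
        candidates = [i / 100 for i in range(5, 100, 5)]  # 0.05 ... 0.95
    n_labels = len(Y_true[0])
    results = []
    for j in range(n_labels):
        col_true = [row[j] for row in Y_true]
        col_scores = [row[j] for row in Y_scores]
        results.append(best_threshold(col_true, col_scores, candidates))
    return results


# Toy example with two labels and six images (made-up values).
Y_true = [[1, 0], [1, 0], [0, 1], [0, 1], [1, 1], [0, 0]]
Y_scores = [[0.9, 0.2], [0.8, 0.1], [0.3, 0.35],
            [0.2, 0.6], [0.7, 0.4], [0.4, 0.3]]
thresholds = per_label_thresholds(Y_true, Y_scores)
for label, (t, f1) in enumerate(thresholds):
    print(f"label {label}: threshold={t:.2f}, F1={f1:.2f}")
```

Because each label is searched independently, a rare label can settle on a different cut-off from a common one, which is the motivation for per-label rather than global thresholds. In practice the thresholds should be chosen on data not used for the final performance estimate, to avoid optimistic bias.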

ORCID iDs

Sandhu, Parandeep; Boyle, Christopher (ORCID: https://orcid.org/0000-0001-8926-6590); Tachtatzis, Christos (ORCID: https://orcid.org/0000-0001-9150-6805); and Cardona, Javier (ORCID: https://orcid.org/0000-0002-9284-1899)