Deep generative models for pharmaceutical manufacturing process design

Alvarado, D. and Johnston, B.F. and Brown, C.J. (2026) Deep generative models for pharmaceutical manufacturing process design. Chemical Engineering Research and Design, 230. pp. 571-585. ISSN 0263-8762 (https://doi.org/10.1016/j.cherd.2026.05.001)

Text. Filename: Alvarado-etal-CERD-2026-Deep-generative-models-for-pharmaceutical-manufacturing-process-design.pdf
Final Published Version
License: Creative Commons Attribution-NonCommercial 4.0


Abstract

Designing pharmaceutical manufacturing processes is a complex task that often relies on expert-driven heuristics and iterative experimentation. While computational tools have advanced the optimisation of process conditions and material selection, methods for guiding the choice and sequencing of manufacturing operations remain scarce. In this study, we explore the use of deep generative models (DGMs) to address this gap by learning to generate plausible sequences of operations for primary pharmaceutical manufacturing. To enable model training, a large-scale dataset of approximately 385,000 manufacturing procedures was built from patent literature using natural language processing techniques. We developed and compared several generative architectures, focusing on conditional variational autoencoders. The best-performing models generated manufacturing instructions conditioned on sets of input materials, achieving high reconstruction accuracy and over 70% valid generated outputs. External validation through expert surveys showed that generated sequences were rated as plausible as actual procedures in 38% of cases. These results indicate the potential of DGMs to support operation selection and early-stage process design. Nonetheless, limitations in data acquisition methods highlight the need for improved datasets and integration with predictive tools for process validation. This work represents a step towards data-driven generative approaches for pharmaceutical manufacturing process design and outlines future directions for enhancing their practical applicability.

ORCID iDs

Alvarado, D. ORCID: https://orcid.org/0000-0003-1191-1478, Johnston, B.F. ORCID: https://orcid.org/0000-0001-9785-6822 and Brown, C.J. ORCID: https://orcid.org/0000-0001-7091-1721