Automatic extraction of pharmaceutical manufacturing data from patents using Natural Language Processing (NLP)

Alvarado, D. and Johnston, B. and Brown, C. (2022) Automatic extraction of pharmaceutical manufacturing data from patents using Natural Language Processing (NLP). In: CMAC Annual Open Day 2022, 2022-05-16 - 2022-05-18.

Text. Filename: Alvarado_etal_CMAC_2022_Automatic_extraction_of_pharmaceutical_manufacturing_data_from_patents.pdf
Final Published Version
License: Strathprints license 1.0


Abstract

Introduction
• Deep generative models (DGMs) are models capable of generating realistic samples and learning hidden information.
• DGMs are used in drug discovery to generate new molecular entities with desirable biological and chemical properties.
• Applications in pharmaceutical manufacturing have not been fully explored.
• Potential benefits of DGMs:
  - Aid process design by generating a feasible chain of unit operations for the production of an API or dosage form.
  - Improve process understanding through the use of latent variables that may be correlated with process parameters.
• Thousands of data points are required to develop a model.
• No database consolidating this information is available in the literature for use in DGMs in the primary or secondary manufacturing domain.

ORCID iDs

Alvarado, D., Johnston, B. (ORCID: https://orcid.org/0000-0001-9785-6822) and Brown, C. (ORCID: https://orcid.org/0000-0001-7091-1721)

Persistent Identifier

https://doi.org/10.17868/strath.00081748