OpenCrystalData : an open-access particle image database to facilitate learning, experimentation, and development of image analysis models for crystallization processes
Barhate, Yash and Boyle, Christopher and Salami, Hossein and Wu, Wei-Lee and Taherimakhsousi, Nina and Rabinowitz, Charlie and Bommarius, Andreas and Cardona, Javier and Nagy, Zoltan K. and Rousseau, Ronald and Grover, Martha (2024) OpenCrystalData : an open-access particle image database to facilitate learning, experimentation, and development of image analysis models for crystallization processes. Digital Chemical Engineering, 11. 100150. ISSN 2772-5081 (https://doi.org/10.1016/j.dche.2024.100150)
Text. Filename: Barhate-etal-DCE-2024-OpenCrystalData-an-open-access-particle-image-database.pdf (Final Published Version)
Abstract
Imaging and image-based process analytical technologies (PAT) have revolutionized the design, development, and operation of crystallization processes, providing greater process understanding through the characterization of particle size, shape and crystallization mechanisms in real time. The performance of corresponding PAT models, including machine learning/artificial intelligence (ML/AI)-based approaches, is highly reliant on the quality of the data used for training or validation. However, acquiring high-quality data is often time-consuming and a major roadblock in developing image analysis models for crystallization processes. To address the lack of diverse, high-quality, and publicly available particle image datasets, this paper presents an initiative to create an open-access crystallization-related image database: OpenCrystalData (OCD, at www.kaggle.com/opencrystaldata/datasets). The datasets consist of images from different crystallization systems with different particle sizes and shapes captured under various conditions. The initial release consists of four different datasets, addressing the estimation of particle size distribution using in-situ images for different categories of particles and detection of anomalous particles for process monitoring purposes. Images are collected using various instruments, followed by case-specific processing steps, such as ground-truth labeling and particle size characterization using offline microscopy. Datasets are released on the online collaborative platform Kaggle, along with specific guidelines for each dataset. These datasets are intended to serve as a resource for researchers to enable learning, experimentation, development, and evaluation and comparison of different analytical approaches and algorithms. Another goal of this initiative is to encourage researchers to contribute new datasets focusing on various systems and problem statements.
Ultimately, OpenCrystalData is intended to facilitate and inspire new developments in imaging-based PAT for crystallization processes, encouraging a shift from time-consuming offline analysis towards comprehensive real-time process insights that drive product quality.
ORCID iDs
Barhate, Yash, Boyle, Christopher ORCID: https://orcid.org/0000-0001-8926-6590, Salami, Hossein, Wu, Wei-Lee, Taherimakhsousi, Nina, Rabinowitz, Charlie, Bommarius, Andreas, Cardona, Javier ORCID: https://orcid.org/0000-0002-9284-1899, Nagy, Zoltan K., Rousseau, Ronald and Grover, Martha
Item type: Article
ID code: 88658
Dates: 30 June 2024 (Published); 9 April 2024 (Published Online); 30 March 2024 (Accepted)
Subjects: Science > Chemistry > Crystallography; Science > Chemistry
Department: Faculty of Engineering > Chemical and Process Engineering; Faculty of Engineering > Electronic and Electrical Engineering
Depositing user: Pure Administrator
Date deposited: 11 Apr 2024 14:28
Last modified: 11 Nov 2024 14:15
URI: https://strathprints.strath.ac.uk/id/eprint/88658