The Natural Products Atlas : an open access knowledge base for microbial natural products discovery

van Santen, Jeffrey A. and Jacob, Grégoire and Singh, Amrit Leen and Aniebok, Victor and Balunas, Marcy J. and Bunsko, Derek and Carnevale Neto, Fausto and Castaño-Espriu, Laia and Chang, Chen and Clark, Trevor N. and Cleary Little, Jessica L. and Delgadillo, David A. and Dorrestein, Pieter C. and Duncan, Katherine R. and Egan, Joseph M. and Galey, Melissa M. and Haeckl, F.P. Jake and Hua, Alex and Hughes, Alison H. and Iskakova, Dasha and Khadilkar, Aswad and Lee, Jung-Ho and Lee, Sanghoon and LeGrow, Nicole and Liu, Dennis Y. and Macho, Jocelyn M. and McCaughey, Catherine S. and Medema, Marnix H. and Neupane, Ram P. and O’Donnell, Timothy J. and Paula, Jasmine S. and Sanchez, Laura M. and Shaikh, Anam F. and Soldatou, Sylvia and Terlouw, Barbara R. and Tran, Tuan Anh and Valentine, Mercia and van der Hooft, Justin J. J. and Vo, Duy A. and Wang, Mingxun and Wilson, Darryl and Zink, Katherine E. and Linington, Roger G. (2019) The Natural Products Atlas : an open access knowledge base for microbial natural products discovery. ACS Central Science, 5 (11). pp. 1824-1833. ISSN 2374-7951 (https://doi.org/10.1021/acscentsci.9b00806)

Text. Filename: van_Santen_etal_ACS_CS_2019_an_open_access_knowledge_base_for_microbial_natural_products_discovery.pdf
Final Published Version

Abstract

Despite rapid evolution in the area of microbial natural products chemistry, there is currently no open access database containing all microbially produced natural product structures. Lack of availability of these data is preventing the implementation of new technologies in natural products science. Specifically, development of new computational strategies for compound characterization and identification is being hampered by the lack of a comprehensive database of known compounds against which to compare experimental data. The creation of an open access, community-maintained database of microbial natural product structures would enable the development of new technologies in natural products discovery and improve the interoperability of existing natural products data resources. However, these data are spread unevenly throughout the historical scientific literature, including both journal articles and international patents. These documents have no standard format, are often not digitized as machine readable text, and are not publicly available. Further, none of these documents have associated structure files (e.g., MOL, InChI, or SMILES), instead containing images of structures. This makes extraction and formatting of relevant natural products data a formidable challenge. Using a combination of manual curation and automated data mining approaches, we have created a database of microbial natural products (The Natural Products Atlas, www.npatlas.org) that includes 24,594 compounds and contains referenced data for structure, compound names, source organisms, isolation references, total syntheses, and instances of structural reassignment. This database is accompanied by an interactive web portal that permits searching by structure, substructure, and physical properties. The Web site also provides mechanisms for visualizing natural products chemical space and dashboards for displaying author and discovery timeline data.
These interactive tools offer a powerful knowledge base for natural products discovery with a central interface for structure and property-based searching and present new viewpoints on structural diversity in natural products. The Natural Products Atlas has been developed under FAIR principles (Findable, Accessible, Interoperable, and Reusable) and is integrated with other emerging natural product databases, including the Minimum Information About a Biosynthetic Gene Cluster (MIBiG) repository, and the Global Natural Products Social Molecular Networking (GNPS) platform. It is designed as a community-supported resource to provide a central repository for known natural product structures from microorganisms and is the first comprehensive, open access resource of this type. It is expected that the Natural Products Atlas will enable the development of new natural products discovery modalities and accelerate the process of structural characterization for complex natural products libraries.

ORCID iDs

van Santen, Jeffrey A., Jacob, Grégoire, Singh, Amrit Leen, Aniebok, Victor, Balunas, Marcy J., Bunsko, Derek, Carnevale Neto, Fausto, Castaño-Espriu, Laia, Chang, Chen, Clark, Trevor N., Cleary Little, Jessica L., Delgadillo, David A., Dorrestein, Pieter C., Duncan, Katherine R. (ORCID: https://orcid.org/0000-0002-3670-4849), Egan, Joseph M., Galey, Melissa M., Haeckl, F.P. Jake, Hua, Alex, Hughes, Alison H., Iskakova, Dasha, Khadilkar, Aswad, Lee, Jung-Ho, Lee, Sanghoon, LeGrow, Nicole, Liu, Dennis Y., Macho, Jocelyn M., McCaughey, Catherine S., Medema, Marnix H., Neupane, Ram P., O’Donnell, Timothy J., Paula, Jasmine S., Sanchez, Laura M., Shaikh, Anam F., Soldatou, Sylvia, Terlouw, Barbara R., Tran, Tuan Anh, Valentine, Mercia, van der Hooft, Justin J. J., Vo, Duy A., Wang, Mingxun, Wilson, Darryl, Zink, Katherine E. and Linington, Roger G.