MIBiG 3.0: a community-driven effort to annotate experimentally validated biosynthetic gene clusters

Terlouw, Barbara R. and Blin, Kai and Navarro-Muñoz, Jorge C. and Avalon, Nicole E. and Chevrette, Marc G. and Egbert, Susan and Lee, Sanghoon and Meijer, David and Recchia, Michael J.J. and Reitz, Zachary L. and van Santen, Jeffrey A. and Selem-Mojica, Nelly and Tørring, Thomas and Zaroubi, Liana and Alanjary, Mohammad and Aleti, Gajender and Aguilar, César and Al-Salihi, Suhad A.A. and Augustijn, Hannah E. and Avelar-Rivas, J. Abraham and Avitia-Domínguez, Luis A. and Barona-Gómez, Francisco and Bernaldo-Agüero, Jordan and Bielinski, Vincent A. and Biermann, Friederike and Booth, Thomas J. and Carrion Bravo, Victor J. and Castelo-Branco, Raquel and Chagas, Fernanda O. and Cruz-Morales, Pablo and Du, Chao and Duncan, Katherine R. and Gavriilidou, Athina and Gayrard, Damien and Gutiérrez-García, Karina and Haslinger, Kristina and Helfrich, Eric J.N. and van der Hooft, Justin J.J. and Jati, Afif P. and Kalkreuter, Edward and Kalyvas, Nikolaos and Kang, Kyo B. and Kautsar, Satria and Kim, Wonyong and Kunjapur, Aditya M. and Li, Yong-Xin and Lin, Geng-Min and Loureiro, Catarina and Louwen, Joris J.R. and Louwen, Nico L.L. and Lund, George and Parra, Jonathan and Philmus, Benjamin and Pourmohsenin, Bita and Pronk, Lotte J.U. and Rego, Adriana and Balaya Rex, Devasahayam Arokia and Robinson, Serina and Rosas-Becerra, L. Rodrigo and Roxborough, Eve T. and Schorn, Michelle A. and Scobie, Darren J. and Singh, Kumar Saurabh and Sokolova, Nika and Tang, Xiaoyu and Udwary, Daniel and Vigneshwari, Aruna and Vind, Kristiina and Vromans, Sophie P.J.M. and Waschulin, Valentin and Williams, Sam E. and Winter, Jaclyn M. and Witte, Thomas E. and Xie, Huali and Yang, Dong and Yu, Jingwei and Zdouc, Mitja and Zhong, Zheng and Collemare, Jérôme and Linington, Roger G. and Weber, Tilmann and Medema, Marnix H. (2023) MIBiG 3.0 : a community-driven effort to annotate experimentally validated biosynthetic gene clusters. Nucleic Acids Research, 51 (D1). D603-D610. gkac1049. 
ISSN 0305-1048 (https://doi.org/10.1093/nar/gkac1049)

Text. Filename: Terlouw_etal_NAR_2022_MIBiG_3_0_a_community_driven_effort_to_annotate.pdf
Final Published Version
License: Creative Commons Attribution 4.0

Abstract

With an ever-increasing amount of (meta)genomic data being deposited in sequence databases, (meta)genome mining for natural product biosynthetic pathways occupies a critical role in the discovery of novel pharmaceutical drugs, crop protection agents and biomaterials. The genes that encode these pathways are often organised into biosynthetic gene clusters (BGCs). In 2015, we defined the Minimum Information about a Biosynthetic Gene cluster (MIBiG): a standardised data format that describes the minimally required information to uniquely characterise a BGC. We simultaneously constructed an accompanying online database of BGCs, which has since been widely used by the community as a reference dataset for BGCs and was expanded to 2021 entries in 2019 (MIBiG 2.0). Here, we describe MIBiG 3.0, a database update comprising large-scale validation and re-annotation of existing entries and 661 new entries. Particular attention was paid to the annotation of compound structures and biological activities, as well as protein domain selectivities. Together, these new features keep the database up-to-date, and will provide new opportunities for the scientific community to use its freely available data, e.g. for the training of new machine learning models to predict sequence-structure-function relationships for diverse natural products. MIBiG 3.0 is accessible online at https://mibig.secondarymetabolites.org/.