LINgroups as a robust principled approach to compare and integrate multiple bacterial taxonomies

Mazloom, Reza and Pierce-Ward, N. Tessa and Sharma, Parul and Pritchard, Leighton and Brown, C. Titus and Vinatzer, Boris A. and Heath, Lenwood S. (2024) LINgroups as a robust principled approach to compare and integrate multiple bacterial taxonomies. IEEE/ACM Transactions on Computational Biology and Bioinformatics, 21 (6). pp. 2304-2314. ISSN 1545-5963 (https://doi.org/10.1109/TCBB.2024.3475917)

Text. Filename: Mazloom-etal-IEEE-ACM-TCBB-2024-LINgroups-as-a-principled-approach-to-compare-and-integrate.pdf
Accepted Author Manuscript
License: Creative Commons Attribution 4.0

Abstract

As a central organizing principle of biology, bacteria and archaea are classified into a hierarchical structure across taxonomic ranks from kingdom to subspecies. Traditionally, this organization was based on observable characteristics of form and chemistry, but recently bacterial taxonomy has been robustly quantified using comparisons of sequenced genomes, as exemplified in the Genome Taxonomy Database (GTDB). Such genome-based taxonomies resolve genomes down to genera and species and are useful in many contexts, yet they lack the flexibility and resolution of a fine-grained approach. The Life Identification Number (LIN) approach is a common, quantitative framework to tie existing (and future) bacterial taxonomies together, increase the resolution of genome-based discrimination of taxa, and extend taxonomic identification below the species level in a principled way. Using the LINgroup as an organizational concept helps resolve some of the confusion and unforeseen negative effects that result from nomenclature changes (often due to genome-based reclassification) of microorganisms that are closely related by overall genomic similarity. Our experimental results demonstrate the value of LINs and LINgroups in mapping between taxonomies, translating between different nomenclatures, and integrating them into a single taxonomic framework. They also reveal the robustness of LIN assignment to hyper-parameter changes when considering within-species taxonomic groups.
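To make the LINgroup idea concrete, here is a minimal Python sketch. It assumes that a LIN can be represented as a tuple of integer positions ordered from coarse to fine genomic similarity, and that a LINgroup is the set of genomes sharing a given LIN prefix. The genome identifiers, LIN values, taxon names, and the helper functions `common_prefix`, `lingroups_for_taxa`, and `is_within` are all hypothetical and chosen for illustration; this is not the procedure or the data used in the paper.

```python
from collections import defaultdict


def common_prefix(lins):
    """Return the longest prefix shared by every LIN in `lins`."""
    prefix = []
    for positions in zip(*lins):
        if len(set(positions)) != 1:
            break
        prefix.append(positions[0])
    return tuple(prefix)


def lingroups_for_taxa(lin_by_genome, taxon_by_genome):
    """Map each taxon name to the LIN prefix shared by all its member genomes."""
    members = defaultdict(list)
    for genome, taxon in taxon_by_genome.items():
        members[taxon].append(lin_by_genome[genome])
    return {taxon: common_prefix(lins) for taxon, lins in members.items()}


def is_within(prefix_a, prefix_b):
    """True if the LINgroup with prefix_a is nested inside the one with prefix_b."""
    return prefix_a[:len(prefix_b)] == prefix_b


# Hypothetical toy data: four genomes with made-up five-position LINs.
lin_by_genome = {
    "g1": (0, 1, 0, 0, 2),
    "g2": (0, 1, 0, 0, 3),
    "g3": (0, 1, 0, 1, 0),
    "g4": (0, 2, 0, 0, 0),
}

# Two hypothetical taxonomies that name the same genomes differently.
taxonomy_a = {"g1": "species X", "g2": "species X", "g3": "species Y", "g4": "species Z"}
taxonomy_b = {"g1": "species P", "g2": "species P", "g3": "species P", "g4": "species Q"}

groups_a = lingroups_for_taxa(lin_by_genome, taxonomy_a)
groups_b = lingroups_for_taxa(lin_by_genome, taxonomy_b)

print(groups_a["species X"])  # (0, 1, 0, 0)
print(groups_b["species P"])  # (0, 1, 0)
print(is_within(groups_a["species X"], groups_b["species P"]))  # True
```

In this toy case, "species X" from the first taxonomy falls inside the broader LINgroup that circumscribes "species P" in the second, so the two names can be related through shared LIN prefixes rather than through the names themselves.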

ORCID iDs

Mazloom, Reza; Pierce-Ward, N. Tessa; Sharma, Parul; Pritchard, Leighton (ORCID: https://orcid.org/0000-0002-8392-2822); Brown, C. Titus; Vinatzer, Boris A.; Heath, Lenwood S.