Different evolutionary trends form the twilight zone of the bacterial pan-genome

Horesh, Gal and Taylor-Brown, Alyce and McGimpsey, Stephanie and Lassalle, Florent and Corander, Jukka and Heinz, Eva and Thomson, Nicholas R. (2021) Different evolutionary trends form the twilight zone of the bacterial pan-genome. Microbial Genomics, 7 (9). 000670. ISSN 2057-5858 (https://doi.org/10.1099/MGEN.0.000670)

Text. Filename: Horesh-etal-2021-Different-evolutionary-trends-form-the-twilight-zone-of-the-bacterial-pan-genome.pdf
Final Published Version
License: Creative Commons Attribution 4.0


Abstract

The pan-genome is defined as the combined set of all genes in the gene pool of a species. Pan-genome analyses have been very useful in helping to understand different evolutionary dynamics of bacterial species: an open pan-genome often indicates a free-living lifestyle with metabolic versatility, while closed pan-genomes are linked to host-restricted, ecologically specialized bacteria. A detailed understanding of the species pan-genome has also been instrumental in tracking the phylodynamics of emerging drug resistance mechanisms and drug-resistant pathogens. However, current approaches to analyse a species’ pan-genome do not take the species population structure into account, nor do they account for the uneven sampling of different lineages, as is commonplace due to over-sampling of clinically relevant representatives. Here we present the application of a population structure-aware approach for classifying genes in a pan-genome based on within-species distribution. We demonstrate our approach on a collection of 7500 Escherichia coli genomes, one of the most-studied bacterial species and used as a model for an open pan-genome. We reveal clearly distinct groups of genes, clustered by different underlying evolutionary dynamics, and provide a more biologically informed and accurate description of the species’ pan-genome.

ORCID iDs

Horesh, Gal, Taylor-Brown, Alyce, McGimpsey, Stephanie, Lassalle, Florent, Corander, Jukka, Heinz, Eva (ORCID: https://orcid.org/0000-0003-4413-3756) and Thomson, Nicholas R.