Addressing the shortcomings of three recent Bayesian methods for detecting interspecific recombination in DNA sequence alignments

Husmeier, Dirk and Mantzaris, Alexander Vassilios (2008) Addressing the shortcomings of three recent Bayesian methods for detecting interspecific recombination in DNA sequence alignments. Statistical Applications in Genetics and Molecular Biology, 7 (1). (https://doi.org/10.2202/1544-6115.1399)

Abstract

We address a potential shortcoming of three probabilistic models for detecting interspecific recombination in DNA sequence alignments: the multiple change-point model (MCP) of Suchard et al. (2003), the dual multiple change-point model (DMCP) of Minin et al. (2005), and the phylogenetic factorial hidden Markov model (PFHMM) of Husmeier (2005). These models are based on the Bayesian paradigm, which requires the solution of an integral over the space of branch lengths. To render this integration analytically tractable, all three models assume that the vectors of branch lengths of the phylogenetic tree are independent among sites. While this approximation reduces the computational complexity considerably, we show that it leads to the systematic prediction of spurious topology changes in the Felsenstein zone, that is, the region of the branch-length configuration space where maximum parsimony consistently infers the wrong topology due to long-branch attraction. We apply two Bayesian hypothesis tests, based on an inter- and an intra-model approach to estimating the marginal likelihood. We then propose a revised model that addresses this shortcoming, and compare it with the aforementioned models on a set of synthetic DNA sequence alignments systematically generated around the Felsenstein zone.
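
The independence assumption described in the abstract can be sketched in symbols as follows. The notation is an assumption made here for illustration and is not taken from the paper: $y_t$ denotes the alignment column at site $t$, $S_t$ the tree topology at that site, $\mathbf{w}$ a vector of branch lengths, and $P(\mathbf{w})$ its prior. The first line shows what the marginal likelihood would look like if one branch-length vector were shared by all $N$ sites; the second shows the site-wise factorization obtained when each site is given its own independent vector $\mathbf{w}_t$.

```latex
% Illustrative notation only (not the paper's): y_t = alignment column at site t,
% S_t = topology at site t, w = branch-length vector, P(w) = prior on branch lengths.

% Branch lengths shared across sites: one integral over the whole product.
P(y_1,\dots,y_N \mid S_1,\dots,S_N)
  = \int \prod_{t=1}^{N} P(y_t \mid S_t, \mathbf{w}) \, P(\mathbf{w}) \, d\mathbf{w}

% Branch lengths independent among sites: the integral factorizes site by site.
P(y_1,\dots,y_N \mid S_1,\dots,S_N)
  \approx \prod_{t=1}^{N} \int P(y_t \mid S_t, \mathbf{w}_t) \, P(\mathbf{w}_t) \, d\mathbf{w}_t
```

Under the second form each factor is an integral for a single site, which is the kind of simplification the abstract credits with reducing the computational complexity. It also removes any constraint tying the branch lengths of neighbouring sites together, and the abstract identifies this approximation as the source of the spurious topology changes in the Felsenstein zone.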