Repository: Freie Universität Berlin, Math Department

The duplication-loss small phylogeny problem: from cherries to trees.

Andreotti, S. and Reinert, K. and Canzar, S. (2013) The duplication-loss small phylogeny problem: from cherries to trees. Journal of Computational Biology, 20 (9). pp. 643-59.

Full text not available from this repository.

Official URL:


Abstract The reconstruction of the history of evolutionary genome-wide events among a set of related organisms is of great biological interest since it can help to reveal the genomic basis of phenotypes. The sequencing of whole genomes faciliates the study of gene families that vary in size through duplication and loss events, like transfer RNA. However, a high sequence similarity often does not allow one to distinguish between orthologs and paralogs. Previous methods have addressed this difficulty by taking into account flanking regions of members of a family independently. We go one step further by inferring the order of genes of (a set of) families for ancestral genomes by considering the order of these genes on sequenced genomes. We present a novel branch-and-cut algorithm to solve the two species small phylogeny problem in the evolutionary model of duplications and losses. On average, our implementation, DupLoCut, improves the running time of a recently proposed method in the experiments on six Vibrionaceae lineages by a factor of â<88>¼200. Besides the mere improvement in running time, the efficiency of our approach allows us to extend our model from cherries of a species tree, that is, subtrees with two leaves, to the median of three species setting. Being able to determine the median of three species is of key importance to one of the most common approaches to ancestral reconstruction, and our experiments show that its repeated computation considerably reduces the number of duplications and losses along the tree both on simulated instances comprising 128 leaves and a set of Bacillus genomes. Furthermore, in our simulations we show that a reduction in cost goes hand in hand with an improvement of the predicted ancestral genomes. Finally, we prove that the small phylogeny problem in the duplication-loss model is NP-complete already for two species.

Item Type:Article
Subjects:Biological Sciences
Mathematical and Computer Sciences
Divisions:Department of Mathematics and Computer Science > Institute of Computer Science > Algorithmic Bioinformatics Group
ID Code:1454
Deposited By: AG Alg BioInf
Deposited On:24 Sep 2014 08:36
Last Modified:24 Sep 2014 09:33

Repository Staff Only: item control page