Journaled string tree--a scalable data structure for analyzing thousands of similar genomes on your laptop

Rahn, R., Weese, D. and Reinert, K. (2014) Journaled string tree--a scalable data structure for analyzing thousands of similar genomes on your laptop. Bioinformatics. ISSN 1460-2059 (online), 1367-4803 (print)

Full text not available from this repository.

Official URL: http://dx.doi.org/10.1093/bioinformatics/btu438

Abstract

Motivation: Next-generation sequencing (NGS) has revolutionized biomedical research in the past decade and led to a continuous stream of developments in bioinformatics, addressing the need for fast and space-efficient solutions for analyzing NGS data. Researchers often need to analyze a set of genomic sequences that stem from closely related species or from individuals of the same species. Hence, the analyzed sequences are similar. For analyses where local changes in the examined sequence induce only local changes in the results, it is desirable to avoid examining identical or similar regions repeatedly.

Results: In this work, we provide a datatype that exploits the data parallelism inherent in a set of similar sequences by analyzing shared regions only once. In real-world experiments, we show that algorithms that would otherwise scan each reference sequentially can be sped up by a factor of 115.

Availability: The data structure and associated tools are publicly available at http://www.seqan.de/projects/jst and are part of SeqAn, the C++ template library for sequence analysis.

Contact: rene.rahn@fu-berlin.de

Item Type:	Article
Subjects:	Mathematical and Computer Sciences > Computer Science
Divisions:	Department of Mathematics and Computer Science > Institute of Computer Science > Algorithmic Bioinformatics Group
ID Code:	1448
Deposited By:	Anja Kasseckert
Deposited On:	25 Aug 2014 18:46
Last Modified:	25 Aug 2014 18:46
