Repository: Freie Universität Berlin, Math Department

Detecting genomic indel variants with exact breakpoints in single- and paired-end sequencing data using SplazerS

Emde, A.-K. and Schulz, M. H. and Weese, D. and Sun, R. and Vingron, M. and Kalscheuer, V. M. and Haas, S. A. and Reinert, K. (2012) Detecting genomic indel variants with exact breakpoints in single- and paired-end sequencing data using SplazerS. Bioinformatics, 28 (5). pp. 619-627.

PDF (Original Paper), 575kB

Official URL: http://bioinformatics.oxfordjournals.org/content/2...

Abstract

Motivation: The reliable detection of genomic variation in resequencing data is still a major challenge, especially for variants larger than a few base pairs. Sequencing reads crossing boundaries of structural variation carry the potential for their identification, but are difficult to map.

Results: Here we present a method for ‘split’ read mapping, where prefix and suffix match of a read may be interrupted by a longer gap in the read-to-reference alignment. We use this method to accurately detect medium-sized insertions and long deletions with precise breakpoints in genomic resequencing data. Compared with alternative split mapping methods, SplazerS significantly improves sensitivity for detecting large indel events, especially in variant-rich regions. Our method is robust in the presence of sequencing errors as well as alignment errors due to genomic mutations/divergence, and can be used on reads of variable lengths. Our analysis shows that SplazerS is a versatile tool applicable to unanchored or single-end as well as anchored paired-end reads. In addition, application of SplazerS to targeted resequencing data led to the interesting discovery of a complete, possibly functional gene retrocopy variant.

Availability: SplazerS is available from http://www.seqan.de/projects/splazers.
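
Illustration (not part of the paper): the split read idea in the abstract, a read prefix and a read suffix that both match the reference but are separated by a longer gap, can be sketched in a few lines of Python. This is a deliberately naive toy that uses exact string matching and handles deletions only. It is not the SplazerS algorithm, which tolerates sequencing errors and also detects insertions. The function name `find_split_alignments`, the parameter `min_anchor`, and the example sequences are hypothetical.

```python
def find_split_alignments(read, reference, min_anchor=4):
    """Toy split-read search using exact matching only.

    Tries every split point of the read that leaves at least `min_anchor`
    bases on each side, then looks for the prefix in the reference and for
    the suffix further downstream, separated by a gap of at least one base.

    Returns a list of (prefix_start, split, suffix_start, gap_length) tuples,
    where the candidate deleted segment is
    reference[prefix_start + split : suffix_start].
    """
    hits = []
    for split in range(min_anchor, len(read) - min_anchor + 1):
        prefix, suffix = read[:split], read[split:]
        p = reference.find(prefix)
        while p != -1:
            prefix_end = p + split
            # The suffix must start strictly after the prefix ends (gap >= 1).
            s = reference.find(suffix, prefix_end + 1)
            while s != -1:
                hits.append((p, split, s, s - prefix_end))
                s = reference.find(suffix, s + 1)
            p = reference.find(prefix, p + 1)
    return hits


# Made-up example: the read comes from a sample lacking the "GGGGGG" segment.
reference = "ACGTTGCA" + "GGGGGG" + "TCAGCTAA"
read = "GTTGCA" + "TCAGC"
for prefix_start, split, suffix_start, gap in find_split_alignments(read, reference):
    left_breakpoint = prefix_start + split
    print(left_breakpoint, suffix_start, reference[left_breakpoint:suffix_start])
```

With exact matching, a breakpoint is reported only where both anchors match perfectly. When the bases flanking a deletion are repeated, several split points can match and the toy reports each of them.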

Item Type: Article
Subjects: Biological Sciences; Mathematical and Computer Sciences
Divisions: Department of Mathematics and Computer Science > Institute of Computer Science > Algorithmic Bioinformatics Group
ID Code: 1160
Deposited By: AG Alg BioInf
Deposited On: 12 Sep 2012 08:40
Last Modified: 03 Mar 2017 14:41
