Repository: Freie Universität Berlin, Math Department

Testing assembly strategies of Francisella tularensis genomes to infer an evolutionary conservation analysis of genomic structures

Neubert, Kerstin and Zuchantke, Eric and Leidenfrost, Robert Maximilian and Wünschiers, Röbbe and Grützke, Josephine and Malorny, Burkhard and Brendebach, Holger and Al Dahouk, Sascha and Homeier, Timo and Hotzel, Helmut and Reinert, Knut and Tomaso, Herbert and Busch, Anne (2021) Testing assembly strategies of Francisella tularensis genomes to infer an evolutionary conservation analysis of genomic structures. BMC Genomics, 22 (1). p. 822. ISSN 1471-2164

Full text not available from this repository.

Official URL: https://doi.org/10.1186/s12864-021-08115-x

Abstract

Background We benchmarked sequencing technology and assembly strategies for short-read, long-read, and hybrid assemblers in respect to correctness, contiguity, and completeness of assemblies in genomes of Francisella tularensis. Benchmarking allowed in-depth analyses of genomic structures of the Francisella pathogenicity islands and insertion sequences. Five major high-throughput sequencing technologies were applied, including next-generation “short-read” and third-generation “long-read” sequencing methods. Results We focused on short-read assemblers, hybrid assemblers, and analysis of the genomic structure with particular emphasis on insertion sequences and the Francisella pathogenicity island. The A5-miseq pipeline performed best for MiSeq data, Mira for Ion Torrent data, and ABySS for HiSeq data from eight short-read assembly methods. Two approaches were applied to benchmark long-read and hybrid assembly strategies: long-read-first assembly followed by correction with short reads (Canu/Pilon, Flye/Pilon) and short-read-first assembly along with scaffolding based on long reads (Unicyler, SPAdes). Hybrid assembly can resolve large repetitive regions best with a “long-read first” approach. Conclusions Genomic structures of the Francisella pathogenicity islands frequently showed misassembly. Insertion sequences (IS) could be used to perform an evolutionary conservation analysis. A phylogenetic structure of insertion sequences and the evolution within the clades elucidated the clade structure of the highly conservative F. tularensis.

Item Type:Article
Subjects:Mathematical and Computer Sciences > Computer Science
Divisions:Department of Mathematics and Computer Science > Institute of Computer Science > Algorithmic Bioinformatics Group
ID Code:2696
Deposited By: Anja Kasseckert
Deposited On:02 Feb 2022 13:39
Last Modified:02 Feb 2022 13:44

Repository Staff Only: item control page