Repository: Freie Universität Berlin, Math Department

Vaquita-LR: A new bioinformatics tool for identifying structural variants using long and short reads. In: Abstracts from the 53rd European Society of Human Genetics (ESHG) Conference: Interactive e-Posters

Kim, J.J. and Kim, J. and Reinert, K. (2020) Vaquita-LR: A new bioinformatics tool for identifying structural variants using long and short reads. In: Abstracts from the 53rd European Society of Human Genetics (ESHG) Conference: Interactive e-Posters. In: 53rd European Society of Human Genetics (ESHG) Conference, June 6–9, 2020, virtual.

Full text not available from this repository.

Official URL: https://doi.org/10.1038/s41431-020-00739-z

Abstract

Introduction: The identification of structural variation in the genome is difficult due to the lack of a “perfect” sequencing technology. By combining short and long read sequencing data, Vaquita-LR is a novel bioinformatics approach to identify more variants with higher accuracy by utilizing the strengths of one method to overcome the weaknesses of the other.

Materials and Methods: Vaquita-LR is an extension of Vaquita, a short read variant caller, which has been modified to identify potential variants in long reads in addition to short reads and merge them to provide a final set of variants. Vaquita-LR takes into consideration the read depth, read length, and other characteristics of the sequencing data to add weight to more confident calls. Additionally, it adapts techniques from Pilon to improve the accuracy of the long read data using relevant short read information.

Results: By combining short and long reads, Vaquita-LR is able to outperform other structural variant callers, including meta callers which combine results from other callers. Importantly, supplying long reads is effective at sequencing depths as shallow as 1x. Using this combination allows Vaquita-LR to better filter out false positives while retaining true positives, providing a better list of possible causal variants for further investigation.

Conclusions: With the abundance of data available today, it is important to consider how the data can be effectively merged to better serve our needs. Vaquita-LR is an initial step in showing the usefulness of integrating different sequencing data types when identifying structural variants.
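
Illustrative sketch: The abstract does not specify how Vaquita-LR merges or weights its calls, so the following Python sketch only illustrates the general idea of merging candidate calls from short and long reads and weighting them by evidence. It is not the tool's actual algorithm. The `Call` record, the reciprocal-overlap matching rule, and the `min_overlap` and `long_weight` parameters are all assumptions introduced for this example.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Call:
    """Hypothetical candidate structural variant (not Vaquita-LR's data model)."""
    chrom: str
    start: int
    end: int
    sv_type: str  # e.g. "DEL"
    support: int  # number of supporting reads
    source: str   # "short" or "long"


def reciprocal_overlap(a: Call, b: Call) -> float:
    """Smaller of the two overlap fractions; 0.0 if the calls cannot match."""
    if a.chrom != b.chrom or a.sv_type != b.sv_type:
        return 0.0
    overlap = min(a.end, b.end) - max(a.start, b.start)
    if overlap <= 0:
        return 0.0
    return min(overlap / (a.end - a.start), overlap / (b.end - b.start))


def merge_calls(short_calls, long_calls, min_overlap=0.5, long_weight=2.0):
    """Merge two call sets and score each call by weighted read support.

    Assumed scoring rule: a short-read call gains long_weight * support
    from the first unused long-read call that overlaps it sufficiently.
    Unmatched long-read calls are kept with their own weighted support.
    """
    merged = []
    used_long = set()
    for s in short_calls:
        score = float(s.support)
        for i, l in enumerate(long_calls):
            if i not in used_long and reciprocal_overlap(s, l) >= min_overlap:
                score += long_weight * l.support
                used_long.add(i)
                break
        merged.append((s, score))
    for i, l in enumerate(long_calls):
        if i not in used_long:
            merged.append((l, long_weight * l.support))
    # Highest-confidence calls first.
    return sorted(merged, key=lambda pair: pair[1], reverse=True)


short_calls = [
    Call("chr1", 100, 200, "DEL", 5, "short"),
    Call("chr1", 1000, 1100, "DEL", 3, "short"),
]
long_calls = [
    Call("chr1", 110, 210, "DEL", 1, "long"),  # a single long read, as at 1x depth
    Call("chr2", 50, 80, "DEL", 2, "long"),
]
ranked = merge_calls(short_calls, long_calls)
```

In this toy input, one supporting long read is enough to raise the first short-read call above the unconfirmed one, which mirrors in spirit the abstract's remark that shallow long-read data can still be informative.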

Item Type: Conference or Workshop Item (Poster)
Subjects: Mathematical and Computer Sciences > Computer Science
Divisions: Department of Mathematics and Computer Science > Institute of Computer Science > Algorithmic Bioinformatics Group
ID Code: 2511
Deposited By: Anja Kasseckert
Deposited On: 18 Mar 2021 12:07
Last Modified: 18 Mar 2021 12:32
