Bioinformatics Tools for Ultra-sensitive Sequencing Data Using Unique Molecular Identifiers

Tobias Österlund
University of Gothenburg, Sahlgrenska University Hospital, Sweden

Abstract
Targeted sequencing using Unique Molecular Identifiers (UMIs) enables detection and quantification of rare variant alleles in challenging applications, such as cell-free DNA analysis from liquid biopsies. Standard bioinformatics pipelines for data processing and variant calling are not adapted for deep-sequencing data containing UMIs; they are inflexible and require multi-step workflows or dedicated computing resources. Here, we developed UMIErrorCorrect, a bioinformatics pipeline for analyzing sequencing data containing UMIs. UMIErrorCorrect requires only FASTQ files as input and performs alignment, UMI clustering, error correction and variant calling. We also provide UMIAnalyzer, a graphical user interface for data mining, visualization, variant interpretation and report generation. UMIAnalyzer allows the user to adjust analysis parameters and study their effect on variant calling. We demonstrated the flexibility of UMIErrorCorrect by analyzing data from four different targeted sequencing protocols, and we accurately quantified rare variants in standardized cell-free DNA reference material. UMIErrorCorrect outperformed existing pipelines developed for targeted UMI sequencing data in terms of variant detection sensitivity. UMIErrorCorrect and UMIAnalyzer are comprehensive and customizable bioinformatics tools that can be applied to any type of library preparation protocol and enrichment chemistry using UMIs. Access to simple, generic and open-source bioinformatics tools will facilitate the use and implementation of UMI-based sequencing approaches in basic research and clinical applications.
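
The UMI clustering and error-correction steps named above can be illustrated with a minimal Python sketch. It is not UMIErrorCorrect's implementation or API: the function names, the greedy Hamming-distance clustering, and the `max_dist` and `min_fraction` thresholds are all assumptions chosen for illustration. The sketch also assumes that the reads have already been aligned, start at the same genomic position and have equal length.

```python
from collections import Counter, defaultdict


def hamming(a: str, b: str) -> int:
    """Number of mismatching positions between two equal-length strings."""
    if len(a) != len(b):
        raise ValueError("UMIs must have equal length")
    return sum(x != y for x, y in zip(a, b))


def cluster_umis(umi_counts, max_dist=1):
    """Greedily map each UMI to the most abundant UMI within max_dist.

    Hypothetical clustering rule, used here only for illustration.
    """
    centers = []
    assignment = {}
    # Visit UMIs from most to least abundant; ties are broken alphabetically.
    for umi, _ in sorted(umi_counts.items(), key=lambda kv: (-kv[1], kv[0])):
        for center in centers:
            if hamming(umi, center) <= max_dist:
                assignment[umi] = center
                break
        else:
            # No existing center is close enough, so this UMI starts a new family.
            centers.append(umi)
            assignment[umi] = umi
    return assignment


def consensus(seqs, min_fraction=0.6):
    """Per-position majority vote; 'N' where no base reaches min_fraction."""
    out = []
    for column in zip(*seqs):
        base, n = Counter(column).most_common(1)[0]
        out.append(base if n / len(column) >= min_fraction else "N")
    return "".join(out)


def error_correct(reads, max_dist=1, min_fraction=0.6):
    """Collapse (umi, sequence) pairs into one consensus read per UMI family."""
    reads = list(reads)
    assignment = cluster_umis(Counter(umi for umi, _ in reads), max_dist)
    families = defaultdict(list)
    for umi, seq in reads:
        families[assignment[umi]].append(seq)
    # Map each family's representative UMI to (consensus sequence, family size).
    return {
        umi: (consensus(seqs, min_fraction), len(seqs))
        for umi, seqs in families.items()
    }


# Toy input: UMI "AAAT" is one mismatch from "AAAA" and joins its family.
reads = [
    ("AAAA", "ACGT"), ("AAAA", "ACGT"), ("AAAT", "ACGA"),
    ("CCCC", "ACTT"), ("CCCC", "ACTT"),
]
result = error_correct(reads)
```

In this toy input the single read with the divergent final base is outvoted within its family, which is the basic idea behind separating sequencing and PCR errors from true variants: an error tends to appear in a minority of reads in a family, whereas a real variant appears in all of them.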

Back to GQ2023 session page Back to GQ2023 overview page