TSD: A Computational Tool To Study the Complex Structural Variants Using PacBio Targeted Sequencing Data

G3 (Bethesda). 2019 May 7;9(5):1371-1376. doi: 10.1534/g3.118.200900.

Abstract

PacBio sequencing is a powerful approach for studying DNA or RNA sequences over longer ranges. It is especially useful for exploring complex structural variants generated by random integration or multiple rearrangements of endogenous or exogenous sequences. Here, we present a tool, TSD, for complex structural variant discovery using PacBio targeted sequencing data. It allows researchers to identify and visualize the genomic structures of targeted sequences by unlimited splitting, alignment and assembly of long PacBio reads. Application to sequencing data derived from an HBV-integrated human cell line (PLC/PRF/5) indicated that TSD could recover the full profile of HBV integration events, especially in regions with complex human-HBV genome integrations and multiple HBV rearrangements. Compared to other long-read analysis tools, TSD showed better performance in detecting complex genomic structural variants. TSD is publicly available at: https://github.com/menggf/tsd.
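
The general idea of splitting a long read, aligning the pieces, and assembling them into a structure can be illustrated with a toy sketch. This is not TSD's implementation: the function names (`split_read`, `assign_source`, `read_structure`), the fixed fragment length, the exact-substring "alignment", and the short reference sequences are all assumptions made for illustration only. A real pipeline would use a proper aligner and handle sequencing errors.

```python
def split_read(read, frag_len):
    """Cut a read into consecutive fragments of at most frag_len bases."""
    return [(i, read[i:i + frag_len]) for i in range(0, len(read), frag_len)]


def assign_source(fragment, references):
    """Toy stand-in for alignment: return the name of the first reference
    that contains the fragment exactly, or 'unmapped' if none does."""
    for name, seq in references.items():
        if fragment in seq:
            return name
    return "unmapped"


def read_structure(read, references, frag_len=4):
    """Label each fragment by source, then merge adjacent fragments that
    share a source into (source, start, end) segments along the read."""
    segments = []
    for start, frag in split_read(read, frag_len):
        source = assign_source(frag, references)
        end = start + len(frag)
        if segments and segments[-1][0] == source:
            # Extend the previous segment instead of starting a new one.
            segments[-1] = (source, segments[-1][1], end)
        else:
            segments.append((source, start, end))
    return segments


# Hypothetical toy references and a chimeric read (human, HBV, human).
references = {"human": "ACGTACGTTTGACCAA", "HBV": "GGGCCCGGGTTTAAAC"}
read = "ACGTACGT" + "GGGCCCGG" + "TTGACCAA"

for source, start, end in read_structure(read, references):
    print(f"{source}\t{start}\t{end}")
```

On this toy input, the sketch reports a human segment, an HBV segment, and a second human segment, which is the kind of junction pattern an integration event would produce in a single long read.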

Keywords: structural variants; long reads; genomic structure; PacBio.

MeSH terms

  • Algorithms
  • Cell Line
  • Computational Biology / methods*
  • Genetic Variation
  • Genomics / methods*
  • Hepatitis B virus / genetics
  • High-Throughput Nucleotide Sequencing
  • Humans
  • Sequence Analysis, DNA
  • Software*
  • Virus Integration