DetectIS: a pipeline to rapidly detect exogenous DNA integration sites using DNA or RNA paired-end sequencing data

Bioinformatics. 2021 Nov 18;37(22):4230-4232. doi: 10.1093/bioinformatics/btab366.

Abstract

Motivation: Recombinant DNA technology is widely used for different applications in biology, medicine and biotechnology. Viral transduction and plasmid transfection are among the most frequently used techniques to generate recombinant cell lines. Many of these methods result in the random integration of the plasmid into the host genome. Rapid identification of the integration sites is highly desirable in order to characterize these engineered cell lines.

Results: We developed detectIS, a pipeline specifically designed to identify genomic integration sites of exogenous DNA, either a plasmid containing one or more transgenes or a virus. The pipeline is based on a Nextflow workflow combined with a Singularity image containing all the necessary software, ensuring high reproducibility and scalability of the analysis. We tested it on simulated datasets and on RNA-seq data from a human sample infected with Hepatitis B virus. Comparisons with other state-of-the-art tools show that our method can identify the integration site in different recombinant cell lines, with accurate results, lower computational demand and shorter execution times.
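
To illustrate how paired-end data can point to an integration site, the following is a minimal sketch of one generic idea: look for chimeric read pairs in which one mate aligns to the host genome and the other to the exogenous sequence, then group them by host location. It is not the detectIS implementation, whose actual steps are described in the paper and repository. The record layout, the function name candidate_sites, and the window and min_support parameters are all assumptions made for this example.

```python
from collections import defaultdict
from typing import NamedTuple


class PairedAlignment(NamedTuple):
    """Hypothetical, simplified record for one aligned read pair."""
    read_id: str
    ref1: str  # reference the first mate aligned to
    pos1: int  # alignment start of the first mate
    ref2: str  # reference the second mate aligned to
    pos2: int  # alignment start of the second mate


def candidate_sites(pairs, exogenous_names, window=500, min_support=2):
    """Return (host_ref, min_pos, max_pos, n_pairs) for supported bins.

    A pair is treated as chimeric when exactly one mate aligns to a
    reference listed in exogenous_names (e.g. the plasmid or virus).
    Bins are fixed-width, so a cluster straddling a bin boundary is
    split; this is an illustrative simplification.
    """
    support = defaultdict(list)
    for p in pairs:
        m1_exo = p.ref1 in exogenous_names
        m2_exo = p.ref2 in exogenous_names
        if m1_exo == m2_exo:
            continue  # both mates on host, or both on exogenous DNA
        host_ref, host_pos = (p.ref2, p.pos2) if m1_exo else (p.ref1, p.pos1)
        support[(host_ref, host_pos // window)].append(host_pos)

    sites = []
    for (ref, _bin), positions in sorted(support.items()):
        if len(positions) >= min_support:
            sites.append((ref, min(positions), max(positions), len(positions)))
    return sites


# Toy input (invented for illustration, not real data).
example = [
    PairedAlignment("r1", "chr1", 10_050, "plasmid", 120),
    PairedAlignment("r2", "plasmid", 300, "chr1", 10_200),
    PairedAlignment("r3", "chr1", 10_400, "chr1", 10_700),
    PairedAlignment("r4", "chr2", 5_000, "plasmid", 50),
    PairedAlignment("r5", "plasmid", 10, "plasmid", 200),
]
sites = candidate_sites(example, {"plasmid"})
print(sites)
```

In this toy input, only the chr1 location is supported by two chimeric pairs, so it is the only candidate reported at the default threshold. A real analysis would also need to consider split reads that span the junction, mapping quality, and duplicate reads.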

Availability and implementation: The Nextflow workflow, the Singularity image and a test dataset are available at https://github.com/AstraZeneca/detectIS.

Supplementary information: Supplementary data are available at Bioinformatics online.

MeSH terms

  • DNA
  • Genomics
  • Humans
  • RNA*
  • Reproducibility of Results
  • Software*

Substances

  • RNA
  • DNA