Practical dynamic de Bruijn graphs

Bioinformatics. 2018 Dec 15;34(24):4189-4195. doi: 10.1093/bioinformatics/bty500.

Abstract

Motivation: The de Bruijn graph is fundamental to the analysis of next generation sequencing data and so, as datasets of DNA reads grow rapidly, it becomes more important to represent de Bruijn graphs compactly while still supporting fast assembly. Previous implementations of compact de Bruijn graphs have not supported node or edge deletion, however, which is important for pruning spurious elements from the graph.

Results: Belazzougui et al. (2016b) recently proposed a compact and fully dynamic representation, which supports exact membership queries and insertions and deletions of both nodes and edges. In this paper, we give a practical implementation of their data structure, supporting exact membership queries and fully dynamic edge operations, as well as limited support for dynamic node operations. We demonstrate experimentally that its performance is comparable to that of state-of-the-art implementations based on Bloom filters.

Availability and implementation: Our source-code is publicly available at https://github.com/csirac/dynamicDBG under an open-source license.

Publication types

  • Research Support, Non-U.S. Gov't
  • Research Support, U.S. Gov't, Non-P.H.S.

MeSH terms

  • Algorithms*
  • Computational Biology
  • High-Throughput Nucleotide Sequencing / methods*
  • Sequence Analysis, DNA / methods*
  • Sequence Deletion
  • Software*