Phables: from fragmented assemblies to high-quality bacteriophage genomes

Bioinformatics. 2023 Oct 3;39(10):btad586. doi: 10.1093/bioinformatics/btad586.

Abstract

Motivation: Microbial communities have a profound impact on both human health and various environments. Viruses infecting bacteria, known as bacteriophages or phages, play a key role in modulating bacterial communities within environments. High-quality phage genome sequences are essential for advancing our understanding of phage biology, enabling comparative genomics studies and developing phage-based diagnostic tools. Most available viral identification tools consider individual sequences to determine whether they are of viral origin. Because viral genomes are difficult to assemble, they are often fragmented, and existing tools may recover only incomplete genome fragments. Therefore, identifying and characterizing novel phage genomes remains a challenge, creating a need for improved approaches to phage genome recovery.

Results: We introduce Phables, a new computational method to resolve phage genomes from fragmented viral metagenome assemblies. Phables identifies phage-like components in the assembly graph, models each component as a flow network, and uses graph algorithms and flow decomposition techniques to identify genomic paths. Experimental results on viral metagenomic samples obtained from different environments show that Phables recovers, on average, over 49% more high-quality phage genomes than existing viral identification tools. Furthermore, Phables can resolve variant phage genomes with over 99% average nucleotide identity, a distinction that existing tools are unable to make.
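
Illustrative sketch (not the Phables implementation): the abstract does not say which flow decomposition technique Phables uses, so the Python example below only illustrates the general idea with a simple greedy "widest path" heuristic on a toy acyclic graph. The node names, the coverage values, and the helper functions widest_path and decompose are all hypothetical. Each edge carries a flow value standing in for read coverage, and the code repeatedly peels off the source-to-sink path with the largest bottleneck until no flow remains. Each extracted path plays the role of one candidate genome, and its bottleneck plays the role of that genome's abundance.

```python
import heapq
from collections import defaultdict


def widest_path(flow, source, sink):
    """Return (path, bottleneck) for the source-to-sink path whose smallest
    remaining edge flow is as large as possible, or (None, 0) if none exists."""
    # Only edges that still carry flow are usable.
    adj = defaultdict(list)
    for (u, v), f in flow.items():
        if f > 0:
            adj[u].append(v)

    best = {source: float("inf")}   # best bottleneck found so far per node
    parent = {}
    heap = [(-float("inf"), source)]  # max-heap on bottleneck via negation
    done = set()
    while heap:
        neg_b, u = heapq.heappop(heap)
        if u in done:
            continue
        done.add(u)
        if u == sink:
            break
        for v in adj[u]:
            b = min(-neg_b, flow[(u, v)])
            if b > best.get(v, 0):
                best[v] = b
                parent[v] = u
                heapq.heappush(heap, (-b, v))

    if sink not in parent:
        return None, 0
    path = [sink]
    while path[-1] != source:
        path.append(parent[path[-1]])
    path.reverse()
    return path, best[sink]


def decompose(flow, source, sink):
    """Greedily split an edge flow into weighted source-to-sink paths."""
    remaining = dict(flow)  # leave the caller's dictionary untouched
    paths = []
    while True:
        path, bottleneck = widest_path(remaining, source, sink)
        if path is None:
            break
        for u, v in zip(path, path[1:]):
            remaining[(u, v)] -= bottleneck
        paths.append((path, bottleneck))
    return paths


# Toy component (hypothetical data): two variants share unitigs A and D
# but differ in the middle unitig (B versus C).
toy_flow = {
    ("s", "A"): 40,
    ("A", "B"): 30,
    ("A", "C"): 10,
    ("B", "D"): 30,
    ("C", "D"): 10,
    ("D", "t"): 40,
}

for path, weight in decompose(toy_flow, "s", "t"):
    print(weight, "->".join(path))
# prints:
# 30 s->A->B->D->t
# 10 s->A->C->D->t
```

In this toy case the two recovered paths correspond to two closely related variants that differ in a single unitig, which is the kind of distinction the abstract describes. A greedy heuristic like this one is not guaranteed to find the smallest set of paths in general, and real phage components are typically circular, so a practical method has to handle cycles as well.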

Availability and implementation: Phables is available on GitHub at https://github.com/Vini2/phables.

MeSH terms

  • Bacteria / genetics
  • Bacteriophages* / genetics
  • Genome, Viral
  • Genomics
  • Humans
  • Metagenome
  • Metagenomics / methods