GreenHill: a de novo chromosome-level scaffolding and phasing tool using Hi-C

Genome Biol. 2023 Jul 11;24(1):162. doi: 10.1186/s13059-023-03006-8.

Abstract

Chromosome-level haplotype-resolved genome assembly is an important resource in molecular biology. However, current de novo haplotype assemblers require parental data or reference genomes and often fail to provide chromosome-level results. We present GreenHill, a novel scaffolding and phasing tool that takes contigs from various assemblers as input and reconstructs chromosome-level haplotypes using Hi-C, without parental or reference data. Its distinctive features include a new error-correction method based on Hi-C contacts and the simultaneous use of Hi-C and long reads. In benchmarks, GreenHill outperforms other approaches in contiguity and phasing accuracy, and the majority of chromosome arms are entirely phased.

Keywords: Genome assembly; Haplotype; Hi-C; Phasing; Scaffolding.

Publication types

  • Research Support, Non-U.S. Gov't

MeSH terms

  • Benchmarking
  • Haplotypes
  • Tool Use Behavior*