AMYCNE: Confident copy number assessment using whole genome sequencing data

PLoS One. 2018 Mar 26;13(3):e0189710. doi: 10.1371/journal.pone.0189710. eCollection 2018.

Abstract

Copy number variations (CNVs) within the human genome have been linked to a wide range of inherited diseases and phenotypic traits. Current methods for measuring copy number have limited resolution and/or precision, especially for regions with more than 4 copies. Whole genome sequencing (WGS) offers an alternative data source that allows the copy number of different genomic regions to be detected and characterized in a single experiment. A plethora of tools have been developed to utilize WGS data for CNV detection. However, none of these tools is designed specifically to accurately estimate copy numbers of complex regions in a small cohort or clinical setting. Herein, we present AMYCNE (automatic modeling functionality for copy number estimation), a CNV analysis tool using WGS data. AMYCNE is multifunctional and performs copy number estimation of complex regions, annotation of VCF files, and CNV detection on individual samples. The performance of AMYCNE was evaluated using AMY1A ddPCR measurements from 86 unrelated individuals. In addition, we validated the accuracy of AMYCNE copy number predictions on two additional genes (FCGR3A and FCGR3B) using datasets available through the 1000 Genomes consortium. Finally, we simulated different levels of mosaic loss and gain of chromosome X and used this dataset to benchmark AMYCNE. The results show a high concordance between AMYCNE and ddPCR, validating the use of AMYCNE to measure tandem AMY1 repeats with high accuracy. This opens up new possibilities for the use of WGS for accurate copy number determination of other complex regions in the genome in small cohorts or single individuals.
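
To illustrate the general idea of estimating copy number from WGS read depth, the following is a minimal sketch of a generic depth-ratio calculation. It is not AMYCNE's actual algorithm, which the abstract does not describe; the function name `estimate_copy_number`, its parameters, and the example depth values are all hypothetical. The sketch assumes that coverage scales linearly with copy number and that the reference regions are present at the stated ploidy.

```python
from statistics import median


def estimate_copy_number(region_depths, reference_depths, ploidy=2):
    """Estimate the copy number of a region from per-bin read depths.

    Hypothetical helper, not part of AMYCNE. Assumes read depth is
    proportional to copy number and that the reference bins come from
    regions present at `ploidy` copies.
    """
    if not region_depths or not reference_depths:
        raise ValueError("both depth lists must be non-empty")

    # The median is less sensitive to outlier bins than the mean.
    reference_depth = median(reference_depths)
    if reference_depth <= 0:
        raise ValueError("reference depth must be positive")

    # Depth ratio scaled by the copy number of the reference regions.
    return ploidy * median(region_depths) / reference_depth


# Illustrative (made-up) depths: the target region is covered about
# three times as deeply as the diploid reference bins.
target = [90, 92, 88, 91]
reference = [30, 31, 29, 30]
print(round(estimate_copy_number(target, reference)))
```

With these made-up values the depth ratio is close to 3, so the rounded estimate is about 6 copies. In practice, depth-based estimates are usually also corrected for factors such as GC content and mappability, which this sketch omits.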

Publication types

  • Evaluation Study
  • Research Support, Non-U.S. Gov't

MeSH terms

  • Chromosomes, Human, X
  • DNA Copy Number Variations*
  • GPI-Linked Proteins / genetics
  • Genetic Loci
  • Humans
  • Pattern Recognition, Automated
  • Receptors, IgG / genetics
  • Salivary alpha-Amylases / genetics
  • Whole Genome Sequencing / methods*

Substances

  • FCGR3A protein, human
  • FCGR3B protein, human
  • GPI-Linked Proteins
  • Receptors, IgG
  • AMY1A protein, human
  • Salivary alpha-Amylases

Grants and funding

This work was supported by the Swedish Research Council [2012-1526 to AL]; the Marianne and Marcus Wallenberg foundation [2014.0084 to AL]; the Swedish Society for Medical Research [S14-0210 to AL]; the Stockholm City Council [to AL]; the Harald and Greta Jeanssons Foundation [to AL]; the Ulf Lundahl memory fund through the Swedish Brain Foundation [to AL]; the Nilsson Ehle donations [to AL]; the IngaBritt and Arne Lundbergs Forskningsstiftelse [to JAA]; and the Erik Rönnberg Foundation [to AL]. The funders had no role in study design, data collection and analysis, decision to publish, or preparation of the manuscript.