Simultaneous identification and prioritization of variants in familial, de novo, and somatic genetic disorders with VariantMaster

Genome Res. 2014 Feb;24(2):349-55. doi: 10.1101/gr.163832.113. Epub 2014 Jan 3.

Abstract

Clinical genetics is increasingly interested in using high-throughput sequencing (HTS) data to diagnose monogenic diseases accurately. Moreover, massive whole-exome sequencing of tumors has significantly advanced the understanding of cancer development through the recognition of somatic driver variants. To improve the identification of variants from HTS data, we developed VariantMaster, an original program that accurately and efficiently extracts causative variants in familial and sporadic genetic diseases. The algorithm takes into account predicted variants (SNPs and indels) in affected individuals or tumor samples and uses the raw (BAM) data to robustly estimate the conditional probability that a variant segregates in a family, as well as the probability that it is de novo or somatic. In familial cases, several modes of inheritance are considered: X-linked, autosomal dominant, and recessive (homozygosity or compound heterozygosity). In addition, VariantMaster integrates phenotypes and genotypes, and employs ANNOVAR to add information such as allelic frequencies in the general population and damaging scores, further reducing the number of putative variants. As a proof of concept, we successfully applied VariantMaster to identify (1) de novo mutations in a previously described data set, (2) causative variants in a rare Mendelian genetic disease, and (3) known and new "driver" mutations in previously reported cancer data sets. Our results demonstrate that VariantMaster achieves considerably higher precision and sensitivity than previously published algorithms.
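The inheritance modes named in the abstract can be illustrated with a minimal sketch for a single parent-child trio. It is not VariantMaster's implementation: it assumes hard genotype calls (alternate-allele counts 0, 1, or 2) rather than probabilities estimated from BAM data, it omits X-linked and dominant segregation, and every function name and dictionary key is hypothetical.

```python
from collections import defaultdict
from typing import Any, Dict, List

# Assumed encoding (not from the paper): each variant is a dict holding a
# gene name and alt-allele counts per trio member (0 = hom ref, 1 = het, 2 = hom alt).
Variant = Dict[str, Any]


def is_de_novo_candidate(v: Variant) -> bool:
    """Child carries the alternate allele; neither parent does."""
    return v["child"] >= 1 and v["mother"] == 0 and v["father"] == 0


def is_homozygous_recessive(v: Variant) -> bool:
    """Child is homozygous alternate; both parents are heterozygous carriers."""
    return v["child"] == 2 and v["mother"] == 1 and v["father"] == 1


def compound_het_genes(variants: List[Variant]) -> List[str]:
    """Genes where the child has at least one heterozygous variant inherited
    only from the mother and at least one inherited only from the father."""
    origins = defaultdict(set)
    for v in variants:
        if v["child"] != 1:
            continue
        if v["mother"] == 1 and v["father"] == 0:
            origins[v["gene"]].add("maternal")
        elif v["father"] == 1 and v["mother"] == 0:
            origins[v["gene"]].add("paternal")
    return sorted(g for g, o in origins.items() if o == {"maternal", "paternal"})


# Toy input, invented purely for illustration.
variants = [
    {"gene": "A", "child": 1, "mother": 0, "father": 0},  # de novo candidate
    {"gene": "B", "child": 2, "mother": 1, "father": 1},  # homozygous recessive
    {"gene": "C", "child": 1, "mother": 1, "father": 0},  # maternal het
    {"gene": "C", "child": 1, "mother": 0, "father": 1},  # paternal het
    {"gene": "D", "child": 1, "mother": 1, "father": 0},  # single het, no partner
]

de_novo = [v["gene"] for v in variants if is_de_novo_candidate(v)]
recessive = [v["gene"] for v in variants if is_homozygous_recessive(v)]
compound = compound_het_genes(variants)
```

Hard genotype rules like these are sensitive to miscalled parental genotypes, which is presumably why the abstract describes estimating probabilities from the raw read data instead.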

Publication types

  • Research Support, Non-U.S. Gov't

MeSH terms

  • Algorithms
  • Computational Biology / methods
  • Databases, Genetic
  • Gene Frequency
  • Genetic Predisposition to Disease*
  • Genotype
  • Humans
  • INDEL Mutation / genetics*
  • Phenotype
  • Polymorphism, Single Nucleotide / genetics*
  • Sequence Analysis, DNA
  • Software*