Genome-wide association studies (GWAS) and whole-exome sequencing (WES) generate massive amounts of genomic variant information, and a major challenge is to identify which variations drive disease or contribute to phenotypic traits. Because the majority of known disease-causing mutations are exonic non-synonymous single nucleotide variations (nsSNVs), most studies focus on whether these nsSNVs affect protein function. Computational studies show that the impact of nsSNVs on protein function can be inferred from sequence homology and structural information, and they predict this impact using statistical methods, machine learning techniques, or models of protein evolution. Here, we review impact prediction methods and discuss their underlying principles, their advantages and limitations, and how they compare with and complement one another. Finally, we present current applications and future directions for these methods in biological research and medical genetics.
Keywords: disease-causing SNV (single nucleotide variation); functional impact prediction methods; missense variant classification; non-synonymous protein mutations; single nucleotide polymorphism prioritization.
© 2014 The Authors Protein Science published by Wiley Periodicals, Inc. on behalf of The Protein Society.