A power-based sliding window approach to evaluate the clinical impact of rare genetic variants in the nucleotide sequence or the spatial position of the folded protein

HGG Adv. 2024 Jul 18;5(3):100284. doi: 10.1016/j.xhgg.2024.100284. Epub 2024 Mar 19.

Abstract

Systematic determination of novel variant pathogenicity remains a major challenge, even when there is an established association between a gene and phenotype. Here we present Power Window (PW), a sliding window technique that identifies the impactful regions of a gene using population-scale clinico-genomic datasets. By sizing analysis windows on the number of variant carriers, rather than the number of variants or nucleotides, statistical power is held constant, enabling the localization of clinical phenotypes and removal of unassociated gene regions. The windows can be built by sliding across either the nucleotide sequence of the gene (through 1D space) or the positions of the amino acids in the folded protein (through 3D space). Using a training set of 350k exomes from the UK Biobank (UKB), we developed PW models for well-established gene-disease associations and tested their accuracy in two independent cohorts (117k UKB exomes and 65k exomes sequenced at Helix in the Healthy Nevada Project, myGenetics, or In Our DNA SC studies). The significant models retained a median of 49% of the qualifying variant carriers in each gene (range 2%-98%). Compared with aggregating variants across the entire gene, quantitative traits showed a median effect size improvement of 66%, and odds ratios for binary traits improved by a median of 2.2-fold. PW shows that electronic health record-based statistical analyses can accurately distinguish between novel coding variants in established genes that will have high phenotypic penetrance and those that will not, unlocking new potential for human genomics research, drug development, variant interpretation, and precision medicine.
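The idea of sizing windows by carrier count rather than by variant or nucleotide count can be illustrated with a minimal 1D sketch. This is not the authors' implementation: the function name `carrier_sized_windows`, the `(position, n_carriers)` input format, and the rule of growing each window until it reaches a target carrier count are all assumptions made for illustration, and the toy positions and counts are invented.

```python
from typing import List, Tuple


def carrier_sized_windows(
    variants: List[Tuple[int, int]], target_carriers: int
) -> List[Tuple[int, int, int]]:
    """Slide along a gene and emit windows holding >= target_carriers carriers.

    variants: hypothetical (nucleotide_position, n_carriers) pairs.
    Returns (first_position, last_position, carriers_in_window) per window.
    """
    ordered = sorted(variants)  # order variants by nucleotide position
    windows = []
    end = 0    # index one past the last variant in the current window
    total = 0  # carriers currently inside the window

    for start in range(len(ordered)):
        # Grow the window to the right until it holds enough carriers.
        while end < len(ordered) and total < target_carriers:
            total += ordered[end][1]
            end += 1
        if total < target_carriers:
            break  # too few carriers remain to form another window
        windows.append((ordered[start][0], ordered[end - 1][0], total))
        # Slide: drop the leftmost variant before the next iteration.
        total -= ordered[start][1]

    return windows


# Toy data (invented): dense carriers near positions 100-120, sparse near 300.
toy_variants = [(100, 3), (105, 2), (120, 5), (300, 1), (310, 4)]
for window in carrier_sized_windows(toy_variants, target_carriers=5):
    print(window)
```

Because each window is closed on carrier count, windows in carrier-dense regions span few nucleotides while windows in sparse regions span many, which is what keeps the sample size per window roughly constant. A 3D variant of the same sketch would order or group residues by spatial distance in the folded protein instead of by nucleotide position.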

Keywords: genetic analysis; rare variants; sliding window.

MeSH terms

  • Base Sequence / genetics
  • Exome / genetics
  • Genetic Predisposition to Disease / genetics
  • Genetic Variation* / genetics
  • Humans
  • Phenotype
  • Protein Folding