MicroRNA (miRNA) expression is frequently deregulated in human disease; in contrast, disease-associated miRNA mutations are understudied. We developed the Annotative Database of miRNA Elements (ADmiRE), which combines multiple existing and new biological annotations to aid prioritization of causal miRNA variation. We annotated 10,206 mature miRNA variants (3,257 within the seed region) from multiple large sequencing datasets, including gnomAD (15,496 genomes; 123,136 exomes). The pattern of miRNA variation closely resembles that of protein-coding exonic regions, with no difference between intragenic and intergenic miRNAs (P = 0.56), and high-confidence miRNAs demonstrate higher sequence constraint (P < 0.001). Conservation analysis across 100 vertebrates identified 765 highly conserved miRNAs that also have limited genetic variation in gnomAD. We applied ADmiRE to the TCGA PanCancerAtlas WES dataset, which contains over 10,000 individuals across 33 adult cancers, and annotated 1,267 germline (rare in gnomAD) and 1,492 somatic miRNA variants. Several miRNA families with deregulated gene expression in cancer, e.g., let-7 and miR-10, have low levels of both somatic and germline variants. In addition to known somatic miR-142 mutations in hematologic cancers, we describe novel somatic miR-21 mutations in esophageal cancers that impact downstream miRNA targets. Through the development of ADmiRE, we present a framework for annotation and prioritization of miRNA variation in disease datasets.
Keywords: cancer; conservation; genomics; microRNA; variant annotation.
© 2018 Wiley Periodicals, Inc.