Integrating single cell expression quantitative trait loci summary statistics to understand complex trait risk genes

Nat Commun. 2024 May 20;15(1):4260. doi: 10.1038/s41467-024-48143-1.

Abstract

Transcriptome-wide association studies (TWAS) are a popular approach to dissecting the functional consequences of disease-associated non-coding variants. Most existing TWAS use bulk tissues and may lack the resolution to reveal cell-type-specific target genes. Single-cell expression quantitative trait loci (sc-eQTL) datasets are emerging. The largest bulk- and sc-eQTL datasets are most conveniently available as summary statistics, but have not been broadly utilized in TWAS. Here, we present a new method, EXPRESSO (EXpression PREdiction with Summary Statistics Only), to analyze sc-eQTL summary statistics; it also integrates 3D genomic data and epigenomic annotation to prioritize causal variants. EXPRESSO substantially improves on existing methods. We apply EXPRESSO to analyze multi-ancestry GWAS datasets for 14 autoimmune diseases. EXPRESSO uniquely identifies 958 novel gene x trait associations, which is 26% more than the second-best method. Among them, 492 are unique to cell-type-level analysis and are missed by TWAS using whole blood. We also develop a cell-type-aware drug repurposing pipeline, which leverages EXPRESSO results to identify drug compounds that can reverse disease-associated gene expression in relevant cell types. Our results point to multiple drugs with therapeutic potential, including metformin for type 1 diabetes and vitamin K for ulcerative colitis.
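As an illustration only, the two computational ideas mentioned in the abstract can be sketched in a few lines of Python. This is not the EXPRESSO implementation, which the abstract does not specify. The first function uses the generic summary-statistic TWAS test, Z = w'z / sqrt(w'Rw), where w are eQTL-derived prediction weights, z are GWAS z-scores, and R is an LD correlation matrix. The second is a hypothetical "reversal score" that assumes a drug is promising when its expression signature is anti-correlated with disease gene z-scores. The function names, inputs, and the choice of Pearson correlation are all assumptions made for this sketch.

```python
import numpy as np


def twas_z(weights, gwas_z, ld):
    """Generic summary-statistic TWAS z-score: w'z / sqrt(w' R w).

    weights : per-variant expression prediction weights (assumed input)
    gwas_z  : per-variant GWAS z-scores for the same variants, same order
    ld      : variant-by-variant LD correlation matrix
    """
    w = np.asarray(weights, dtype=float)
    z = np.asarray(gwas_z, dtype=float)
    r = np.asarray(ld, dtype=float)
    if r.shape != (w.size, w.size) or z.size != w.size:
        raise ValueError("weights, gwas_z and ld must have matching dimensions")
    variance = w @ r @ w  # variance of predicted expression under LD
    if variance <= 0:
        raise ValueError("predicted expression has zero variance")
    return float(w @ z / np.sqrt(variance))


def reversal_score(disease_z, drug_signature):
    """Hypothetical reversal score: negative Pearson correlation between
    disease gene z-scores and drug-induced expression changes, computed
    over the genes present in both dictionaries."""
    genes = sorted(set(disease_z) & set(drug_signature))
    if len(genes) < 3:
        raise ValueError("need at least 3 shared genes")
    d = np.array([disease_z[g] for g in genes], dtype=float)
    s = np.array([drug_signature[g] for g in genes], dtype=float)
    # A positive score means the drug moves expression against the disease direction.
    return float(-np.corrcoef(d, s)[0, 1])


if __name__ == "__main__":
    # Toy inputs, invented for illustration; not data from the study.
    ld = [[1.0, 0.5], [0.5, 1.0]]
    print(twas_z([1.0, 1.0], [2.0, 2.0], ld))
    print(reversal_score({"A": 2.0, "B": -1.0, "C": 0.5},
                         {"A": -2.0, "B": 1.0, "C": -0.5}))
```

In a cell-type-aware setting, one would presumably run such a test once per cell type with that cell type's weights, and compare the resulting gene z-scores with drug signatures measured in a matching cell context.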

MeSH terms

  • Autoimmune Diseases / genetics
  • Gene Expression Profiling / methods
  • Genetic Predisposition to Disease / genetics
  • Genome-Wide Association Study* / methods
  • Humans
  • Multifactorial Inheritance / genetics
  • Polymorphism, Single Nucleotide
  • Quantitative Trait Loci*
  • Single-Cell Analysis* / methods
  • Transcriptome / genetics