The correspondence of cell state changes in diseased organs to peripheral protein signatures is currently unknown. Here, we generated and integrated single-cell transcriptomic and proteomic data from multiple large pulmonary fibrosis patient cohorts. Integration of 233,638 single-cell transcriptomes (n = 61) across three independent cohorts enabled us to derive shifts in cell type proportions and a robust core set of genes altered in lung fibrosis for 45 cell types. Mass spectrometry analysis of lung lavage fluid (n = 124) and plasma (n = 141) proteomes identified distinct protein signatures correlated with diagnosis, lung function, and injury status. A novel SSTR2+ pericyte state correlated with disease severity and was reflected in lavage fluid by increased levels of the complement regulatory factor CFHR1. We further discovered CRTAC1 as a biomarker of alveolar type-2 epithelial cell health status in lavage fluid and plasma. Using cross-modal analysis and machine learning, we identified the cellular source of biomarkers and demonstrated that information transfer between modalities correctly predicts disease status, suggesting the feasibility of clinical cell state monitoring through longitudinal sampling of body fluid proteomes.
Keywords: biomarker; data integration; fibrosis; proteomics; single-cell RNA-seq.
© 2021 The Authors. Published under the terms of the CC BY 4.0 license.