Background: Perturbed posttranslational modification (PTM) landscapes commonly cause pathological phenotypes. The Cancer Genome Atlas (TCGA) project profiles thousands of tumors, allowing the identification of spontaneous cancer-driving mutations, while UniProt and dbSNP curate genetic disease-associated variants in the human population. PhosphoSitePlus (PSP) is the most comprehensive resource for studying experimentally observed PTM sites and the only repository with daily updates on functional annotations for many of these sites. To elucidate altered PTM landscapes on a large scale, we integrated disease-associated mutations from TCGA, UniProt, and dbSNP with PTM sites from PhosphoSitePlus. We characterized each dataset individually, compared somatic with germline mutations, and analyzed PTM sites that intersect directly with disease variants. To assess the impact of mutations in the flanking regions of phosphosites, we developed DeltaScansite, a pipeline that compares Scansite predictions on wild-type versus mutated sequences. Disease mutations are also visualized in PhosphoSitePlus.
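The wild-type versus mutant comparison described above can be illustrated with a minimal Python sketch. It is not the DeltaScansite implementation and does not call Scansite: the names `Missense`, `apply_variant`, `site_window`, `classify`, and `delta_score` are hypothetical, the flank width of 7 residues is an assumption, and the scoring function is left as a caller-supplied placeholder for whatever motif predictor is used.

```python
from dataclasses import dataclass
from typing import Callable, Tuple

FLANK = 7  # assumption: residues considered on each side of the modified residue


@dataclass(frozen=True)
class Missense:
    """A single amino acid substitution (hypothetical container)."""
    position: int  # 1-based position in the protein sequence
    ref: str       # wild-type residue
    alt: str       # mutant residue


def apply_variant(seq: str, var: Missense) -> str:
    """Return the mutated sequence, checking that the reference residue matches."""
    idx = var.position - 1
    if not 0 <= idx < len(seq):
        raise ValueError("variant position lies outside the sequence")
    if seq[idx] != var.ref:
        raise ValueError("reference residue does not match the sequence")
    return seq[:idx] + var.alt + seq[idx + 1:]


def site_window(seq: str, site_pos: int, flank: int = FLANK) -> str:
    """Return the peptide around a 1-based PTM site, truncated at the termini."""
    start = max(0, site_pos - 1 - flank)
    end = min(len(seq), site_pos + flank)
    return seq[start:end]


def classify(var: Missense, site_pos: int, flank: int = FLANK) -> str:
    """Label a variant as hitting the site itself, its flanking region, or neither."""
    distance = abs(var.position - site_pos)
    if distance == 0:
        return "direct"
    if distance <= flank:
        return "flanking"
    return "distal"


def delta_score(seq: str, var: Missense, site_pos: int,
                score_fn: Callable[[str], float],
                flank: int = FLANK) -> Tuple[str, str, float]:
    """Score wild-type and mutant windows with score_fn and return the difference."""
    wt = site_window(seq, site_pos, flank)
    mut = site_window(apply_variant(seq, var), site_pos, flank)
    return wt, mut, score_fn(mut) - score_fn(wt)


if __name__ == "__main__":
    # Made-up sequence and variant, for illustration only.
    toy_seq = "MKRRASLPEDG"
    toy_var = Missense(position=3, ref="R", alt="G")
    # Placeholder scorer: counts arginines; a real predictor would go here.
    print(classify(toy_var, site_pos=6, flank=4))
    print(delta_score(toy_seq, toy_var, 6, lambda w: float(w.count("R")), flank=4))
```

Keeping the scorer as a parameter separates the window bookkeeping from the prediction step, so the same comparison can be run with any motif-scoring function.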
Results: Characterization of somatic variants revealed oncoprotein-like mutation profiles for U2AF1, PGM5, and several other proteins, whose alteration patterns resemble those of germline mutations. The union of all datasets uncovered previously unknown disease-associated losses and gains of PTM events, distributed unevenly across PTM types. Focusing on phosphorylation, our DeltaScansite workflow predicted perturbed signaling networks, consistent with calculations by the machine learning method MIMP.
Conclusions: We discovered oncoprotein-like mutation profiles in TCGA, as well as mutations that presumably modify protein function by affecting PTM sites directly or by rewiring upstream regulation. The resulting datasets are enriched with functional annotations from PhosphoSitePlus and represent a unique resource for identifying potential biomarkers or disease drivers.
Keywords: Cancer; Disease; PhosphoSitePlus; Posttranslational modification; Signal transduction; TCGA; dbSNP.