Protein S-palmitoylation is a reversible lipophilic posttranslational modification that regulates a diverse range of signaling pathways. Within transmembrane proteins (TMPs), S-palmitoylation is implicated in conditions ranging from inflammatory disorders to respiratory viral infections. Many small-scale experiments have observed S-palmitoylation at juxtamembrane Cys residues. However, most large-scale S-palmitoyl discovery efforts rely on trypsin-based proteomics, in which hydrophobic juxtamembrane regions are likely underrepresented. Machine learning, by virtue of its freedom from experimental constraints, is particularly well suited to address this discovery gap surrounding TMP S-palmitoylation. Using a UniProt-derived feature set, a gradient-boosted machine learning tool (TopoPalmTree) was constructed and applied to a holdout dataset of viral S-palmitoylated proteins. Upon application to the mouse TMP proteome, 1591 putative S-palmitoyl sites (i.e., sites not listed in SwissPalm or UniProt) were identified. Two lung-expressed S-palmitoyl candidates (synaptobrevin Vamp5 and water channel Aquaporin-5) were experimentally assessed. Finally, TopoPalmTree was used for rational design of an S-palmitoyl site on KDEL-Receptor 2. This readily interpretable model consolidates the many small-scale experiments observing juxtamembrane S-palmitoylation into a proteomic tool for TMP S-palmitoyl discovery and design, thus facilitating future investigations of this important modification.
Keywords: S-acylation; S-palmitoylation; gradient boosting; machine learning; transmembrane protein.