Cysteine cathepsins, in contrast to caspases and trypsin-like proteases, lack a strict specificity-determining P1 pocket, so addressing their elusive specificity calls for innovative approaches. Proteomic analysis of cell lysates with human cathepsins K, V, B, L, S, and F identified 30,000 cleavage sites, which we analyzed with the software platform SAPS-ESI (Statistical Approach to Peptidyl Substrate-Enzyme Specific Interactions). SAPS-ESI generates clusters and training sets for support vector machine learning. Cleavage site predictions on the SARS-CoV-2 S protein, confirmed experimentally, expose the most probable first cut under physiological conditions and suggest furin-like behavior of cathepsins. Crystal structure analysis of representative peptides in complex with cathepsin V reveals rigid and flexible sites that correspond to positions with heterogeneous and homogeneous distributions of residues, consistent with the SAPS-ESI analysis of the proteomics data. These results support the design of selectively cleavable linkers for drug conjugates and drug discovery studies.
© 2023. The Author(s).