Protein adsorption on solid surfaces is a process relevant to biological, medical, industrial, and environmental applications. Despite this wide interest and advances in measurement techniques, the complexity of protein adsorption has frustrated its accurate prediction. To address this challenge, protein adsorption data reported over the last four decades were collected, checked for completeness and correctness, organized, and archived in an upgraded, freely accessible Biomolecular Adsorption Database, which is equivalent to a large-scale, ad hoc, crowd-sourced multifactorial experiment. The shape and physicochemical properties of the proteins in the database were quantified on their molecular surfaces using an in-house program (ProMS) operating as an add-on to the PyMOL software. Machine learning-based analysis indicated that protein adsorption on hydrophobic and hydrophilic surfaces is modulated by different sets of operational, structural, and molecular surface-based physicochemical parameters. Separately, the adsorption data for four "benchmark" proteins, i.e., lysozyme, albumin, IgG, and fibrinogen, were processed by piecewise linear regression with the protein monolayer acting as the breakpoint, using the linearized form of the Langmuir isotherm, resulting in semiempirical relationships that predict protein adsorption. These relationships, derived separately for hydrophilic and hydrophobic surfaces, described well the protein concentration on the surface as a function of the protein concentration in solution, the contact angle of the adsorbing surface, the ionic strength, pH, and temperature of the carrying fluid, and the difference between the pH and the isoelectric point of the protein.
When the semiempirical relationships derived for the benchmark proteins were applied to two other "test" proteins with known PDB structures, i.e., β-lactoglobulin and α-lactalbumin, the extrapolation errors were found to be linearly related to the dissimilarity between the benchmark and the test proteins. The work presented here can be used to estimate the operational parameters modulating protein adsorption in applications such as diagnostic devices, pharmaceuticals, biomaterials, and the food industry.
Keywords: Langmuir isotherm; atomic hydrophobicity; database; molecular surface; multilinear regression with breakpoint; protein adsorption.
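
The linearization of the Langmuir isotherm mentioned in the abstract can be illustrated with a minimal sketch. It assumes the common form c/Γ = c/Γmax + 1/(K·Γmax), which may differ from the exact linearization used in this work, and it fits a single linear segment only; the monolayer breakpoint and the additional covariates (contact angle, ionic strength, pH, temperature) are not modeled. The function names, parameter values, and data below are hypothetical and serve only as an illustration.

```python
import numpy as np

def langmuir(c, gamma_max, k):
    """Langmuir isotherm: surface concentration vs. solution concentration."""
    return gamma_max * k * c / (1.0 + k * c)

def fit_linearized_langmuir(c, gamma):
    """Least-squares fit of c/gamma = c/gamma_max + 1/(k * gamma_max).

    The slope of c/gamma against c is 1/gamma_max and the intercept is
    1/(k * gamma_max), so k = slope / intercept.
    """
    slope, intercept = np.polyfit(c, c / gamma, 1)
    gamma_max = 1.0 / slope
    k = slope / intercept
    return gamma_max, k

# Hypothetical, noise-free data (arbitrary units), not taken from the database.
c = np.linspace(0.05, 2.0, 20)
gamma = langmuir(c, gamma_max=3.0, k=5.0)

gamma_max_fit, k_fit = fit_linearized_langmuir(c, gamma)
print(gamma_max_fit, k_fit)
```

With noise-free input the fit recovers the parameters used to generate the data; with real measurements, this linearization weights low-concentration points differently from a direct nonlinear fit, so the two approaches need not agree.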