Evaluation of MHC-II peptide binding prediction servers: applications for vaccine research

Hong Huang Lin; Guang Lan Zhang; Songsak Tongchusak; Ellis L Reinherz; Vladimir Brusic

doi:10.1186/1471-2105-9-S12-S22

BMC Bioinformatics. 2008 Dec 12;9 Suppl 12(Suppl 12):S22. doi: 10.1186/1471-2105-9-S12-S22.

Authors

Hong Huang Lin¹, Guang Lan Zhang, Songsak Tongchusak, Ellis L Reinherz, Vladimir Brusic

Affiliation

¹ Cancer Vaccine Center, Dana-Farber Cancer Institute, Boston, MA 02215, USA. Honghuang_Lin@dfci.harvard.edu

Abstract

Background: Initiation and regulation of immune responses in humans involve recognition of peptides presented by human leukocyte antigen class II (HLA-II) molecules. These peptides (HLA-II T-cell epitopes) are increasingly important as research targets for the development of vaccines and immunotherapies. HLA-II peptide binding studies involve multiple overlapping peptides spanning individual antigens, as well as complete viral proteomes. Antigen variation in pathogens and tumor antigens, together with the extensive polymorphism of HLA molecules, increases the number of targets for screening studies. Experimental screening methods are expensive and time-consuming, and reagents are not readily available for many of the HLA class II molecules. Computational prediction methods complement experimental studies, minimize the number of validation experiments, and significantly speed up the epitope mapping process. We collected test data from four independent studies that involved 721 peptide binding assays. Fully overlapping peptide studies of four antigens measured the binding affinity of 103 peptides to seven common HLA-DR molecules (DRB1*0101, 0301, 0401, 0701, 1101, 1301, and 1501). We used these data to analyze the performance of 21 HLA-II binding prediction servers accessible through the WWW.

Results: Because not all servers have predictors for all tested HLA-II molecules, we assessed a total of 113 predictors. The length of test peptides ranged from 15 to 19 amino acids. We tried three prediction strategies: the best 9-mer within the longer peptide, the average of the best three 9-mer predictions, and the average of all 9-mer predictions within the longer peptide. The best strategy was the identification of a single best 9-mer within the longer peptide. Overall, measured by the area under the receiver operating characteristic curve (AROC), 17 predictors showed good (AROC > 0.8), 41 showed marginal (0.7 < AROC ≤ 0.8), and 55 showed poor performance (AROC < 0.7). The good predictors were those for HLA-DRB1*0101 (seven), 1101 (six), 0401 (three), and 0701 (one). The best individual predictor was NETMHCIIPAN, closely followed by PROPRED, IEDB (Consensus), and MULTIPRED (SVM). None of the individual predictors was shown to be suitable for prediction of promiscuous peptides. With practical thresholds, current predictive capabilities allow prediction of only 50% of actual T-cell epitopes.
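The three scoring strategies and the AROC categories described in the Results can be sketched as follows. This is only an illustration under stated assumptions, not the authors' evaluation code: the function names, the placeholder 9-mer scorer `toy_score`, and the handling of an AROC of exactly 0.7 are all hypothetical. AROC is computed here with the standard pairwise (Mann-Whitney) formulation, which may differ from the procedure used in the study.

```python
from typing import Callable, Sequence


def nine_mers(peptide: str, k: int = 9) -> list[str]:
    """Return every overlapping k-mer (9-mer by default) in the peptide."""
    return [peptide[i:i + k] for i in range(len(peptide) - k + 1)]


def peptide_score(peptide: str,
                  score_9mer: Callable[[str], float],
                  strategy: str = "best") -> float:
    """Combine per-9-mer scores into one score for a longer peptide.

    strategy: "best" (single best 9-mer), "top3" (mean of the best three),
    or "all" (mean of all 9-mers). Higher scores are assumed to mean
    stronger predicted binding.
    """
    scores = sorted((score_9mer(m) for m in nine_mers(peptide)), reverse=True)
    if not scores:
        raise ValueError("peptide is shorter than 9 residues")
    if strategy == "best":
        return scores[0]
    if strategy == "top3":
        top = scores[:3]
        return sum(top) / len(top)
    if strategy == "all":
        return sum(scores) / len(scores)
    raise ValueError(f"unknown strategy: {strategy}")


def aroc(scores: Sequence[float], is_binder: Sequence[bool]) -> float:
    """Area under the ROC curve: the fraction of (binder, non-binder) pairs
    in which the binder has the higher score; ties count as half."""
    pos = [s for s, y in zip(scores, is_binder) if y]
    neg = [s for s, y in zip(scores, is_binder) if not y]
    if not pos or not neg:
        raise ValueError("need at least one binder and one non-binder")
    wins = sum(1.0 if p > n else 0.5 if p == n else 0.0
               for p in pos for n in neg)
    return wins / (len(pos) * len(neg))


def performance_class(value: float) -> str:
    """Map an AROC value to the categories used in the abstract.
    Assumption: a value of exactly 0.7 is treated as poor."""
    if value > 0.8:
        return "good"
    if value > 0.7:
        return "marginal"
    return "poor"


def toy_score(nine_mer: str) -> float:
    """Hypothetical placeholder scorer (counts phenylalanines).
    It is NOT a real binding predictor; substitute a server's output."""
    return float(nine_mer.count("F"))
```

A 15-mer contains seven overlapping 9-mers, so for the made-up peptide `"AAAAAAAAAFFFFFF"` the toy scorer yields per-9-mer scores 0 through 6, and the three strategies give 6.0, 5.0, and 3.0 respectively.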

Conclusion: The available HLA-II servers do not match the prediction capabilities of HLA-I predictors. Currently available HLA-II prediction servers offer only limited prediction accuracy, and improved predictors are needed for large-scale studies such as proteome-wide epitope mapping. The requirements for accuracy of HLA-II binding predictions are stringent because of the substantial effect of false positives.

Publication types

Research Support, N.I.H., Extramural
Research Support, Non-U.S. Gov't

MeSH terms

Algorithms
Antigens / chemistry
Binding Sites
Computational Biology / methods*
Epitope Mapping
Epitopes / chemistry
Epitopes, T-Lymphocyte / chemistry
False Positive Reactions
Humans
Markov Chains
Models, Theoretical
Peptides / chemistry*
Protein Binding
ROC Curve
Vaccines / chemistry*

Substances

Antigens
Epitopes
Epitopes, T-Lymphocyte
Peptides
Vaccines

Grants and funding

U19 AI57330/AI/NIAID NIH HHS/United States