Improving tandem mass spectrum identification using peptide retention time prediction across diverse chromatography conditions

Anal Chem. 2007 Aug 15;79(16):6111-8. doi: 10.1021/ac070262k. Epub 2007 Jul 11.

Abstract

Most algorithms for identifying peptides from tandem mass spectra use information only from the final spectrum, ignoring non-mass-based information acquired routinely in liquid chromatography tandem mass spectrometry analyses. One physicochemical property that is always obtained but rarely exploited is peptide chromatographic retention time. Efforts to use chromatographic retention time to improve peptide identification are complicated by the variability of retention time across experimental conditions, which makes retention time calculations nongeneralizable. We show that peptide retention time can be reliably predicted by training and testing a support vector regressor on a small collection of data from a single liquid chromatography run. This model can be used to filter out peptide identifications whose observed retention time deviates from the predicted retention time. After filtering, positive peptide identifications increase by as much as 50% at a false discovery rate of 3%. We demonstrate that our dynamically trained model generalizes well across diverse chromatography conditions and methods for generating peptides, in particular improving peptide identification using nonspecific proteases.
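
The filtering step described in the abstract can be illustrated with a minimal sketch. It is not the authors' implementation: the abstract gives neither the feature representation nor the deviation threshold, so the names `Identification`, `filter_by_retention_time`, `toy_predict_rt`, the coefficient table, and the `max_deviation` value are all hypothetical. The additive per-residue predictor below is only a stand-in for the trained support vector regressor, and any callable mapping a peptide sequence to a predicted retention time could take its place.

```python
from dataclasses import dataclass
from typing import Callable, Dict, List


@dataclass
class Identification:
    """One candidate peptide-spectrum match (hypothetical structure)."""
    peptide: str        # amino acid sequence
    observed_rt: float  # observed retention time, in minutes


def filter_by_retention_time(
    ids: List[Identification],
    predict_rt: Callable[[str], float],
    max_deviation: float,
) -> List[Identification]:
    """Keep identifications whose observed retention time lies within
    max_deviation of the predicted retention time."""
    return [
        ident
        for ident in ids
        if abs(ident.observed_rt - predict_rt(ident.peptide)) <= max_deviation
    ]


# Assumption: made-up per-residue coefficients, used only so the example runs.
# In the paper, the predictor is a support vector regressor trained on data
# from the same chromatography run.
TOY_COEFFS: Dict[str, float] = {"L": 2.0, "F": 2.5, "A": 0.5, "K": -0.5, "G": 0.0}


def toy_predict_rt(peptide: str) -> float:
    """Stand-in predictor: sum of per-residue contributions."""
    return sum(TOY_COEFFS.get(residue, 0.0) for residue in peptide)


candidates = [
    Identification("LLFAK", 6.4),  # predicted 6.5, small deviation
    Identification("GGAK", 9.0),   # predicted 0.0, large deviation
]
kept = filter_by_retention_time(candidates, toy_predict_rt, max_deviation=1.0)
print([ident.peptide for ident in kept])
```

Passing the predictor as a callable keeps the filter independent of the model, which fits the abstract's point that the model is retrained for each run rather than fixed in advance.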

Publication types

  • Research Support, N.I.H., Extramural

MeSH terms

  • Artificial Intelligence*
  • Chromatography, Liquid / methods*
  • False Positive Reactions
  • Peptide Hydrolases
  • Peptides / analysis*
  • Tandem Mass Spectrometry / methods*
  • Tandem Mass Spectrometry / standards

Substances

  • Peptides
  • Peptide Hydrolases