Improving tandem mass spectrum identification using peptide retention time prediction across diverse chromatography conditions

Aaron A. Klammer, Xinhua Yi, Michael J. MacCoss and William Stafford Noble

Analytical Chemistry. 79(16):6111-6118, 2007.


Most algorithms for identifying peptides from tandem mass spectra use information only from the final spectrum, ignoring non-mass-based information acquired routinely in liquid chromatography tandem mass spectrometry analyses. One physicochemical property that is always obtained but rarely exploited is peptide chromatographic retention time. Efforts to use chromatographic retention time to improve peptide identification are complicated by the variability of retention time across experimental conditions, which makes retention time calculations nongeneralizable. We show that peptide retention time can be reliably predicted by training and testing a support vector regressor on a small collection of data from a single liquid chromatography run. This model can be used to filter out peptide identifications whose observed retention time deviates from the predicted retention time. After filtering, positive peptide identifications increase by as much as 50% at a false discovery rate of 3%. We demonstrate that our dynamically trained model generalizes well across diverse chromatography conditions and methods for generating peptides, in particular improving peptide identification using nonspecific proteases.
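
The filtering step described in the abstract can be sketched roughly as follows. This is a minimal illustration, not the authors' implementation: the `Identification` record, the `filter_by_retention_time` function, the deviation threshold, and the toy predictor are all hypothetical names and values introduced here. In the paper the predictor is a support vector regressor trained on data from the same liquid chromatography run; the sketch below swaps in a simple hydrophobic-residue count so that it runs with the standard library alone.

```python
from dataclasses import dataclass
from typing import Callable, List


@dataclass
class Identification:
    peptide: str        # candidate peptide sequence
    observed_rt: float  # observed retention time, in minutes
    score: float        # search-engine score (higher is better)


def filter_by_retention_time(
    ids: List[Identification],
    predict_rt: Callable[[str], float],
    max_deviation: float,
) -> List[Identification]:
    """Keep identifications whose observed retention time lies within
    max_deviation minutes of the predicted retention time."""
    return [
        ident
        for ident in ids
        if abs(ident.observed_rt - predict_rt(ident.peptide)) <= max_deviation
    ]


# Toy stand-in predictor (an assumption, NOT the paper's model):
# retention time grows with the number of hydrophobic residues.
HYDROPHOBIC = set("AVILMFWY")


def toy_predict_rt(peptide: str) -> float:
    return 10.0 + 5.0 * sum(1 for aa in peptide if aa in HYDROPHOBIC)


ids = [
    Identification("LVFFAK", 35.0, 3.1),  # 5 hydrophobic residues, predicted 35.0
    Identification("GSGSGK", 42.0, 2.8),  # 0 hydrophobic residues, predicted 10.0
    Identification("AWYK", 27.0, 2.5),    # 3 hydrophobic residues, predicted 25.0
]

# The 5-minute threshold is an arbitrary illustrative value.
kept = filter_by_retention_time(ids, toy_predict_rt, max_deviation=5.0)
print([ident.peptide for ident in kept])
```

Passing the predictor in as a callable keeps the filter independent of the model, so a regressor retrained on each run could be substituted without changing the filtering logic.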
