Improving Tandem Mass Spectrum Identification Using Peptide Retention Time Prediction across Diverse Chromatography Conditions
Aaron Klammer, Xianhua Yi, Michael J. MacCoss and William Stafford Noble
Proceedings of the International Conference on Research in Computational Molecular Biology (RECOMB), 2007
Analytical Chemistry. 79(16):6111-6118, 2007.
Abstract
Most tandem mass spectrum identification algorithms use information only from the final spectrum, ignoring precursor information such as peptide retention time (RT). Efforts to exploit peptide RT for peptide identification can be frustrated by its variability across liquid chromatography analyses. We show that peptide RT can be reliably predicted by training a support vector regressor on a single chromatography run. This dynamically trained model outperforms a published statically trained model of peptide RT across diverse chromatography conditions. In addition, the model can be used to filter peptide identifications that produce large discrepancies between observed and predicted RT. After filtering, estimated true positive peptide identifications increase by as much as 50% at a false discovery rate of 3%, with the largest increase for non-specific cleavage with elastase.
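The filtering step described in the abstract can be sketched roughly as follows. This is a minimal illustration, not the authors' implementation: the `Identification` record, the `filter_by_rt` helper, the `toy_predict_rt` stand-in and the 5.0 threshold are all hypothetical. In the paper, the predictor is a support vector regressor trained on a single chromatography run; here a trivial length-based function takes its place so that the example runs without external libraries.

```python
from dataclasses import dataclass
from typing import Callable, List


@dataclass
class Identification:
    peptide: str        # candidate peptide sequence
    observed_rt: float  # observed retention time (units assumed: minutes)


def filter_by_rt(
    ids: List[Identification],
    predict_rt: Callable[[str], float],
    max_abs_error: float,
) -> List[Identification]:
    """Keep identifications whose |observed - predicted| RT is at most max_abs_error."""
    return [
        ident
        for ident in ids
        if abs(ident.observed_rt - predict_rt(ident.peptide)) <= max_abs_error
    ]


def toy_predict_rt(peptide: str) -> float:
    """Toy stand-in for a trained regressor (assumption): RT grows with length."""
    return 2.0 * len(peptide)


ids = [
    Identification("PEPTIDEK", 17.0),  # predicted 16.0, error 1.0, kept
    Identification("ELVISK", 30.0),    # predicted 12.0, error 18.0, dropped
    Identification("SAMPLER", 12.5),   # predicted 14.0, error 1.5, kept
]

kept = filter_by_rt(ids, toy_predict_rt, max_abs_error=5.0)
print([ident.peptide for ident in kept])  # ['PEPTIDEK', 'SAMPLER']
```

Passing the predictor as a callable keeps the filter independent of how the model was trained, so a regressor fitted on the current run could be substituted for the toy function without changing the filtering logic.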
Supplementary figures.
Supplementary data. SEQUEST results were produced by searching spectra using this parameter file against one of these sequence databases: target, decoy1, decoy2 or decoy3. The ms2 and sqt file formats are described in McDonald et al. (2004). The raw weights for the trained SVM are here.
The software used to generate the results in the paper.