Direct maximization of protein identifications from tandem mass spectra

Marina Spivak, Jason Weston, Michael J. MacCoss and William Stafford Noble

Submitted for publication.


Abstract

The goal of many shotgun proteomics experiments is to identify with high confidence as many proteins as possible from a complex biological mixture. Existing solutions to this problem typically subdivide the task into two stages, first identifying a collection of peptides with a low false discovery rate, and then inferring from the peptides a corresponding set of proteins. In contrast, we formulate the protein identification problem as a single optimization problem, which we solve using machine learning methods. The resulting algorithm directly controls the relevant error rate, can incorporate a wide variety of evidence and, for complex samples, provides 24–74% more protein identifications than the current state of the art.


