Efficient marginalization to compute protein posterior probabilities from shotgun mass spectrometry data
Journal article (peer reviewed)

Oliver Serang, Michael J. MacCoss and William Stafford Noble
Journal of Proteome Research, Vol. 9(10), pp. 5346-5357
1 October 2010
PMID: 20712337

Abstract

The problem of identifying proteins from a shotgun proteomics experiment has not been definitively solved. Identifying the proteins in a sample requires ranking them, ideally with interpretable scores. In particular, “degenerate” peptides, which map to multiple proteins, have made such a ranking difficult to compute. The problem of computing posterior probabilities for the proteins, which can be interpreted as confidence in a protein’s presence, has been especially daunting. Previous approaches have either ignored the peptide degeneracy problem completely, addressed it by computing a heuristic set of proteins or heuristic posterior probabilities, or estimated the posterior probabilities with sampling methods. We present a probabilistic model for protein identification in tandem mass spectrometry that recognizes peptide degeneracy. We then introduce graph-transforming algorithms that facilitate efficient computation of protein probabilities, even for large data sets. We evaluate our identification procedure on five different well-characterized data sets and demonstrate our ability to efficiently compute high-quality protein posteriors.
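
To make the idea of marginalizing over protein sets concrete, the following toy sketch computes protein posteriors by brute-force enumeration. It is an illustration only, not the model or the graph-transforming algorithms of the article: the noisy-OR likelihood, the binary "detected" evidence, the function name protein_posteriors, and the parameters alpha (chance a present protein yields a peptide), beta (chance of a spurious peptide), and gamma (prior that a protein is present) are all assumptions made for this example. Enumeration is exponential in the number of proteins, which is the cost that efficient marginalization is meant to avoid.

```python
from itertools import product


def protein_posteriors(peptide_to_proteins, detected,
                       alpha=0.5, beta=0.05, gamma=0.5):
    """Brute-force posteriors under an assumed noisy-OR toy model.

    peptide_to_proteins: dict mapping peptide -> list of parent proteins
    detected: dict mapping peptide -> bool (hypothetical binary evidence)
    alpha, beta, gamma: illustrative parameters, not taken from the article
    """
    proteins = sorted({p for parents in peptide_to_proteins.values()
                       for p in parents})
    total = 0.0
    marginal = {p: 0.0 for p in proteins}

    # Enumerate every present/absent assignment of the proteins.
    for states in product([0, 1], repeat=len(proteins)):
        present = dict(zip(proteins, states))

        # Independent prior on each protein.
        weight = 1.0
        for s in states:
            weight *= gamma if s else 1.0 - gamma

        # Noisy-OR likelihood of each peptide observation.
        for peptide, parents in peptide_to_proteins.items():
            n_present = sum(present[p] for p in parents)
            p_emit = 1.0 - (1.0 - beta) * (1.0 - alpha) ** n_present
            weight *= p_emit if detected[peptide] else 1.0 - p_emit

        total += weight
        for p in proteins:
            if present[p]:
                marginal[p] += weight

    # Normalize the summed joint weights into posterior probabilities.
    return {p: marginal[p] / total for p in proteins}


# pep2 is degenerate: it maps to both protein A and protein B.
peptide_to_proteins = {"pep1": ["A"], "pep2": ["A", "B"], "pep3": ["B"]}
detected = {"pep1": True, "pep2": True, "pep3": False}
posteriors = protein_posteriors(peptide_to_proteins, detected)
print(posteriors)
```

In this toy input, protein A has a detected unique peptide while protein B's unique peptide was not detected, so the shared peptide pep2 is largely explained by A and the posterior for A comes out higher than the posterior for B.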
