INFERYS rescoring: Boosting peptide identifications and scoring confidence of database search results
Published in: | Rapid Communications in Mass Spectrometry, p. e9128 |
---|---|
Format: | Journal Article |
Language: | English |
Published: | England, 28-06-2021 |
Summary: | Database search engines for bottom-up proteomics largely ignore peptide fragment ion intensities during the automated scoring of tandem mass spectra against protein databases. Recent advances in deep learning allow the accurate prediction of peptide fragment ion intensities. Using these predictions to calculate additional intensity-based scores helps to overcome this drawback. Here, we describe a processing workflow termed INFERYS™ rescoring for the intensity-based rescoring of Sequest HT search engine results in Thermo Scientific™ Proteome Discoverer™ 2.5 software. The workflow is based on the deep learning platform INFERYS capable of predicting fragment ion intensities, which runs on personal computers without the need for graphics processing units. This workflow calculates intensity-based scores comparing peptide spectrum matches from Sequest HT and predicted spectra. Resulting scores are combined with classical search engine scores for input to the false discovery rate estimation tool Percolator. We demonstrate the merits of this approach by analyzing a classical HeLa standard sample and exemplify how this workflow leads to a better separation of target and decoy identifications, in turn resulting in increased peptide spectrum match, peptide and protein identification numbers. On an immunopeptidome dataset, this workflow leads to a 50% increase in identified peptides, emphasizing the advantage of intensity-based scores when analyzing low-intensity spectra or analytes with very similar physicochemical properties that require vast search spaces. Overall, the end-to-end integration of INFERYS rescoring enables simple and easy access to a powerful enhancement to classical database search engines, promising a deeper, more confident and more comprehensive analysis of proteomic data from any organism by unlocking the intensity dimension of tandem mass spectra for identification and more confident scoring. |
---|---|
ISSN: | 0951-4198 1097-0231 |
DOI: | 10.1002/rcm.9128 |