High FDRs for methylated peptides III
The MCP paper "Large Scale Mass Spectrometry-based Identifications of Enzyme-mediated Protein Methylation Are Subject to High False Discovery Rates" raises some important questions concerning the accuracy and interpretation of database search results. In this third article, we look at the difference between using counts of matches (PSMs) and counts of distinct sequences to calculate the false discovery rate (FDR).
| files | target PSMs | decoy PSMs | target seqs | decoy seqs |
We searched the data from nostainbands_orbi_1.raw through 28, as described in the first of these articles. This table shows counts at 1% FDR for the merged peak lists from all 28 files, together with searches of one half of the files (1-14) and the other half (15-28). Search conditions were identical in all three cases. When we count PSMs, the counts for the two halves can be summed to get the count for the whole. This is not true when we count distinct sequences, because some of the matched sequences are common to both sets of results. If you want to combine sets of search results, and you are working with FDRs based on counts of distinct sequences, the threshold for 1% FDR has to be re-determined on the merged results. Even that may not be meaningful if the search conditions are not identical across all of them.
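The additivity point can be sketched in a few lines of Python. This is a minimal illustration, not the procedure used for the searches: the peptide strings, the `(sequence, is_decoy)` representation and the function names `count_psms` and `count_distinct` are all made up for the example.

```python
from typing import List, Tuple

# Hypothetical representation: each PSM is (peptide_sequence, is_decoy).
PSM = Tuple[str, bool]

def count_psms(psms: List[PSM]) -> Tuple[int, int]:
    """Return (target PSM count, decoy PSM count)."""
    target = sum(1 for _, is_decoy in psms if not is_decoy)
    decoy = sum(1 for _, is_decoy in psms if is_decoy)
    return target, decoy

def count_distinct(psms: List[PSM]) -> Tuple[int, int]:
    """Return (distinct target sequences, distinct decoy sequences)."""
    target = {seq for seq, is_decoy in psms if not is_decoy}
    decoy = {seq for seq, is_decoy in psms if is_decoy}
    return len(target), len(decoy)

# Invented results for two halves of a data set, already thresholded.
first_half = [("PEPTIDEK", False), ("PEPTIDEK", False),
              ("SAMPLER", False), ("KEDITPEP", True)]
second_half = [("PEPTIDEK", False), ("ANOTHERK", False),
               ("KEDITPEP", True)]
merged = first_half + second_half
```

With this toy data, the PSM counts for the halves are (3, 1) and (2, 1), which sum to the merged count of (5, 2). The distinct sequence counts are (2, 1) for each half but only (3, 1) for the merged set, because PEPTIDEK and KEDITPEP occur in both halves.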
The number of candidate false sequences is limited by the size of the database, but this limit will usually be greater than the number of candidate true sequences. Hypothetically, suppose we re-analyse a sample repeatedly, adding the search results from each technical replicate until no more true sequences can be identified. Does this mean the sensitivity at a given FDR can only get worse as additional results are included, because any novel sequences must be false? No, because the score or expect value that represents each sequence is that for the best match. As more data are added, the score (or expect value) distributions for true matches move to higher scores (or lower expect values) ‘faster’ than those for false matches. A good analogy is averaging spectra to improve signal/noise. Each new spectrum adds counts to both peaks and baseline, but the heights of the peaks increase faster than the height of the baseline.
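A toy simulation can illustrate why the best score per sequence improves faster for true sequences. Everything here is an assumption chosen for illustration: 50 true sequences that are each matched once per replicate, 50 chance matches per replicate spread over 100,000 candidate false sequences, and Gaussian scores with arbitrary means and widths. It is a sketch of the argument, not a model of real score distributions.

```python
import random
from statistics import mean

def mean_best_scores(n_replicates: int, seed: int = 0):
    """Return (mean best score of true seqs, mean best score of false seqs)."""
    rng = random.Random(seed)
    best_true = {}
    best_false = {}
    for _ in range(n_replicates):
        # Assumption: each true sequence is matched again in every replicate.
        for seq_id in range(50):
            score = rng.gauss(30.0, 10.0)
            best_true[seq_id] = max(score, best_true.get(seq_id, float("-inf")))
        # Assumption: chance matches land on random candidate false sequences,
        # so the same false sequence is rarely matched twice.
        for _ in range(50):
            seq_id = rng.randrange(100_000)
            score = rng.gauss(15.0, 5.0)
            best_false[seq_id] = max(score, best_false.get(seq_id, float("-inf")))
    return mean(best_true.values()), mean(best_false.values())

true_1, false_1 = mean_best_scores(1)
true_10, false_10 = mean_best_scores(10)
print(f"true:  {true_1:.1f} -> {true_10:.1f}")
print(f"false: {false_1:.1f} -> {false_10:.1f}")
```

Under these assumptions, each true sequence gets ten chances to score well, so its best score rises, while most false sequences are matched only once and their best scores stay roughly where they were. The number of false sequences grows, but the separation between the two distributions also grows.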
The threshold for 1% FDR based on counts of distinct sequences will generally be more stringent than the threshold for 1% FDR based on counts of PSMs. That is, if you threshold matches to get 1% FDR for PSMs, this doesn’t mean you also have 1% FDR for distinct sequences. (When we say distinct sequences, this could refer to just the primary sequence, or it could extend to modification state or charge state or both. The same considerations would apply.)
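As a minimal sketch of why the two FDRs differ at the same threshold, the following Python compares them on invented data. The assumptions are: FDR is estimated simply as decoy count divided by target count, a distinct sequence is represented by its best-scoring match, and the peptide strings, scores and function names are hypothetical.

```python
from typing import List, Tuple

# Hypothetical representation: (peptide_sequence, score, is_decoy).
PSM = Tuple[str, float, bool]

def psm_fdr(psms: List[PSM], threshold: float) -> float:
    """Estimate FDR from counts of PSMs at or above the threshold."""
    kept = [p for p in psms if p[1] >= threshold]
    targets = sum(1 for p in kept if not p[2])
    decoys = sum(1 for p in kept if p[2])
    return decoys / targets if targets else 0.0

def sequence_fdr(psms: List[PSM], threshold: float) -> float:
    """Estimate FDR from counts of distinct sequences, using the best score."""
    best = {}
    for seq, score, is_decoy in psms:
        key = (seq, is_decoy)
        best[key] = max(score, best.get(key, float("-inf")))
    targets = sum(1 for (_, d), s in best.items() if not d and s >= threshold)
    decoys = sum(1 for (_, d), s in best.items() if d and s >= threshold)
    return decoys / targets if targets else 0.0

# Invented matches: true sequences tend to be matched several times,
# chance matches to decoys tend to occur once each.
psms = [
    ("AAAK", 50.0, False), ("AAAK", 45.0, False), ("AAAK", 40.0, False),
    ("CCCR", 35.0, False), ("CCCR", 30.0, False), ("DDDK", 25.0, False),
    ("KAAA", 28.0, True), ("RCCC", 22.0, True),
]

print(psm_fdr(psms, 20.0))       # 2 decoys / 6 targets
print(sequence_fdr(psms, 20.0))  # 2 decoys / 3 targets
```

The same score threshold gives a PSM-based FDR of 2/6 but a sequence-based FDR of 2/3, because repeated matches to the same true sequence inflate the target PSM count without adding distinct sequences. To reach the same FDR for distinct sequences, the threshold would have to be raised.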
In the UNSW paper, global FDRs were based on Percolator q-value < 0.01, which should give 1% FDR for counts of PSMs. For methylated peptides, it sounds as if matches were also accepted if they had a Mascot expect value < 0.05. Some aspects of this double thresholding are a little unclear, but the thresholding was definitely applied to PSMs, not distinct sequences. On the other hand, the FDRs reported for methylated peptides were based on counts of ‘non redundant PSMs’, the term used in the paper for a distinct sequence + modification state combination. Thresholding on PSMs but reporting FDRs based on distinct sequences is not correct. It is important to work with counts of PSMs or distinct sequences consistently, particularly for a large data set where the number of true sequences is very small, just 59 in total (Tables SII and SIII).
In summary, this is an important and well executed study that illustrates some of the limitations of database search:
- We cannot assume that the global FDR applies to a subgroup of matches, such as modified peptides
- Target/decoy estimates the fraction of chance matches to unrelated sequences. It doesn’t model matches to homologous peptides or peptides with alternative arrangements of modifications.
- We depend on competition to exclude certain types of false match
- Database search cannot tell you whether a modification has the correct elemental composition but the wrong structure, or whether a modification is artefactual rather than post-translational
- FDRs can be based on counts of PSMs or counts of distinct sequences but don’t mix them; use one or the other consistently
- Combining sets of search results where the FDR is based on counts of distinct sequences is not straightforward
Finally, we strongly agree with the authors’ view that the burden of proof for a novel PTM should be very heavy, and database search in isolation will often be insufficient evidence.