Posted by Patrick Emery (June 20, 2018)

Mascot Distiller 2.7: Farewell to re-gridding

We recently released Mascot Distiller 2.7. The main new feature of this release is a change to how peak detection works on raw profile datasets that have been saved as sparse (compressed) data, with runs of zero values dropped. Common examples are Thermo Orbitrap and Sciex Analyst datasets saved as profile data. For these types of data, Mascot Distiller peak detection is now significantly faster. In addition, we’ve improved 13C peak and MS/MS fragment charge state detection.

Re-gridding

Peak detection in Mascot Distiller works by attempting to fit the ideal isotope distribution to the experimental data. In previous versions, peak picking required that the profile data was on a linear mass scale, with evenly distributed data points over the m/z axis. However, some raw profile data formats are compressed by dropping runs of zero intensity data points. Prior to Mascot Distiller 2.7, it was therefore necessary to re-grid these data into evenly distributed m/z values. Re-gridding was also required before spectra with non-linear mass scales could be summed.
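
To make the idea of re-gridding concrete, here is a minimal Python sketch. It is an illustration only, not Mascot Distiller's code: the function name `regrid`, the `max_gap` parameter, the use of linear interpolation and the example spectrum are all assumptions made for this sketch.

```python
from bisect import bisect_left


def regrid(mz, intensity, points_per_da, max_gap):
    """Resample sparse (m/z, intensity) pairs onto an evenly spaced m/z grid.

    Gaps wider than max_gap are assumed to be dropped runs of zeros.
    """
    step = 1.0 / points_per_da
    n_points = int(round((mz[-1] - mz[0]) * points_per_da)) + 1
    grid_mz = [mz[0] + i * step for i in range(n_points)]
    grid_int = []
    for x in grid_mz:
        j = bisect_left(mz, x)  # index of the first stored point >= x
        if j < len(mz) and mz[j] == x:
            grid_int.append(intensity[j])      # grid point coincides with a stored point
        elif j == 0 or j == len(mz):
            grid_int.append(0.0)               # outside the stored range
        elif mz[j] - mz[j - 1] > max_gap:
            grid_int.append(0.0)               # inside a dropped run of zeros
        else:
            # Linear interpolation between the two neighbouring stored points
            t = (x - mz[j - 1]) / (mz[j] - mz[j - 1])
            grid_int.append(intensity[j - 1] + t * (intensity[j] - intensity[j - 1]))
    return grid_mz, grid_int


# Made-up sparse spectrum: two narrow peaks 2 Da apart, zeros in between dropped
mz = [500.00, 500.01, 500.02, 502.00, 502.01, 502.02]
intensity = [10.0, 50.0, 10.0, 5.0, 20.0, 5.0]

grid_mz, grid_int = regrid(mz, intensity, points_per_da=600, max_gap=0.05)
print(len(mz), "stored points ->", len(grid_mz), "grid points")
```

In this toy case, six stored points become more than a thousand grid points at 600 points per Da, and most of them are zeros that were deliberately left out of the raw file.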

In Mascot Distiller 2.7, peak detection works directly on the raw profile data, without the need to re-grid. You can find a more in-depth look at the issues surrounding re-gridding in this presentation from our ASMS 2018 breakfast meetings.

Comparison between Mascot Distiller 2.6 and 2.7

Because of these changes, we’d expect to see a dramatic improvement in processing time when peak picking sparse profile data. To test this, we took a SILAC quantitation dataset from the PRIDE repository (PXD004607) that was generated using a Thermo QExactive. The dataset consists of 10 files, with a total of 1.2 million MS and MS/MS spectra. We then processed it with Distiller 2.6 and 2.7 according to the following protocols:

  • Distiller 2.6:
    • Peak picking with re-gridding value of 600 points per Da
    • Decharge MS/MS peaklists to MH+
    • Search with Mascot 2.6
    • Quantify SILAC results using Mascot Distiller
  • Distiller 2.7:
    • Peak picking (no re-gridding required)
    • Decharge MS/MS peaklists to MH+
    • Search with Mascot 2.6
    • Quantify SILAC results using Mascot Distiller
During the quantitation phase, Distiller has to carry out some additional peak detection on MS scans in the XIC regions, so we’d expect this step to be faster in Mascot Distiller 2.7 as well. Results are presented in Table 1 below:

                   | Mascot Distiller 2.6             | Mascot Distiller 2.7
Peak picking       | 8 days, 3 hours and 36 minutes   | 2 hours and 44 minutes
SILAC quantitation | 2 days, 14 hours and 14 minutes  | 15 hours and 45 minutes
Total              | 10 days, 17 hours and 50 minutes | 18 hours and 29 minutes
Table 1: Comparison of the time taken to carry out peak picking and SILAC quantitation on the PXD004607 dataset in Mascot Distiller releases 2.6 and 2.7
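
The speed-up factors implied by Table 1 can be worked out with a few lines of Python. This is just arithmetic on the published timings; the variable names are ours.

```python
from datetime import timedelta

# Timings copied from Table 1: (Distiller 2.6, Distiller 2.7)
timings = {
    "Peak picking": (timedelta(days=8, hours=3, minutes=36),
                     timedelta(hours=2, minutes=44)),
    "SILAC quantitation": (timedelta(days=2, hours=14, minutes=14),
                           timedelta(hours=15, minutes=45)),
    "Total": (timedelta(days=10, hours=17, minutes=50),
              timedelta(hours=18, minutes=29)),
}

# Dividing one timedelta by another gives a plain float ratio
speedups = {step: old / new for step, (old, new) in timings.items()}

for step, factor in speedups.items():
    print(f"{step}: {factor:.1f}x faster")
```

That works out to roughly a 72-fold speed-up for peak picking, about 4-fold for SILAC quantitation and about 14-fold overall.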

As you can see, for this dataset where re-gridding was previously required, both peak picking and SILAC quantitation are dramatically faster with Mascot Distiller 2.7. You’ll see similar improvements for any dataset where the raw data are saved as compressed profile data. If your raw data are saved as true profile data, as is common from Bruker instruments, or as centroids, then you won’t see this dramatic improvement.

In addition to significantly speeding up the processing time, we found approximately 20% more significant peptide matches at a 1% FDR:

                     | PSMs above homology
Mascot Distiller 2.6 | 193981
Mascot Distiller 2.7 | 236043
Table 2: Counts of PSMs with scores above the homology threshold at a 1% FDR for search results from the peaklists generated by Mascot Distiller 2.6 and 2.7
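
For reference, the relative increase can be checked directly from the two counts in Table 2 (variable names are ours):

```python
# PSM counts copied from Table 2
psms_26 = 193981  # Mascot Distiller 2.6
psms_27 = 236043  # Mascot Distiller 2.7

extra = psms_27 - psms_26
increase_pct = 100 * extra / psms_26

print(f"{extra} additional PSMs, {increase_pct:.1f}% more than Distiller 2.6")
```

This gives an increase of about 21.7%, in line with the approximate figure of 20% quoted above.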

There are three reasons for this increase. In Mascot Distiller 2.7 we’ve improved the accuracy of both 13C peak detection and fragment charge state determination. We’re also seeing better ions scores from many of the peaklists compared with the Distiller 2.6 peaklists. This is due to Distiller 2.7 generating ‘cleaner’ peaklists with fewer noise peaks included, giving us better signal to noise and resulting in slightly better ions scores for equivalent matches, which in some cases raises the match score above the significance threshold. Examples of all three types of improvement are shown in this presentation.

If you already have a licence for Mascot Distiller, then 2.7 is a free update. If not, and you’d like to evaluate it, we offer a 30-day trial of Mascot Distiller. For details, please see http://www.matrixscience.com/distiller_download.html
