Posted by Richard Jacob (March 15, 2019)

Back to basics 4: Mascot Daemon

Mascot Daemon is a client to Mascot Server that can automate the processing of raw data to peak lists and submit multiple searches to a central Mascot Server. It is included with the Mascot Server licence and can be installed on as many computers in the lab as you like. Processing raw data files will use CPU resources, so you may not want to install it on personal systems.

The Mascot Daemon application is made up of two parts: the user interface, through which you set up tasks, and the daemon engine, which runs in the background and actually carries out the work. You can see the background engine’s cogwheel icon in the notification area of the taskbar. The two halves communicate through the task database (TaskDB), which stores all the information about the tasks, including file names and paths.

Different ways of processing data

The simplest way to use Mascot Server is to search peak lists generated by mass spectrometer vendor software or other third-party software. The most common format is MGF (Mascot Generic Format), although other formats are also accepted. In Daemon, peak lists can be combined into a single search by checking the “Merge MS/MS files into a single search” box.
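To make the idea of merging peak lists concrete, here is a minimal Python sketch that concatenates the spectra from several MGF documents. It is an illustration only, not how Daemon implements the merge option. It assumes each spectrum is delimited by BEGIN IONS and END IONS lines, and the function names iter_spectra and merge_mgf are hypothetical. Any header lines outside the spectrum blocks are dropped.

```python
from __future__ import annotations

from typing import Iterable, Iterator


def iter_spectra(mgf_text: str) -> Iterator[list[str]]:
    """Yield each BEGIN IONS ... END IONS block as a list of lines."""
    block = None  # None means we are currently outside a spectrum block
    for line in mgf_text.splitlines():
        stripped = line.strip()
        if stripped == "BEGIN IONS":
            block = [stripped]
        elif stripped == "END IONS" and block is not None:
            block.append(stripped)
            yield block
            block = None
        elif block is not None and stripped:
            # Keep TITLE=, PEPMASS=, CHARGE= and the m/z intensity pairs.
            block.append(stripped)


def merge_mgf(texts: Iterable[str]) -> str:
    """Concatenate the spectra of several MGF documents into one document.

    Lines outside BEGIN IONS / END IONS (file-level headers) are ignored.
    """
    merged = []
    for text in texts:
        for block in iter_spectra(text):
            merged.append("\n".join(block))
    return "\n\n".join(merged) + "\n"
```

In practice the Daemon check box is the simpler route; a script like this is only of interest if you need to combine files before they reach Daemon.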

Most users quickly graduate to using raw MS data and a Data import filter to convert that data to a peak list prior to searching. Mascot Distiller can be used as a data import filter and can handle data from any of the major instrument vendors. Using a Distiller-compatible quantitation method allows Daemon to automatically trigger the quantitation once the search is complete. Additionally, a number of third-party peak-picking applications are supported, with ProteoWizard’s msConvert being the most popular of these.

The HUPO community data format mzML is also supported. The mzML format can be used for both raw data and peak lists, so make sure that a file actually contains a peak list before searching it directly. Some post-search data analysis tools, like the Trans-Proteomic Pipeline, require that you search Mascot with an mzML file, as other parts of the pipeline need to use the same peak list for their analysis.
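One way to check which kind of mzML file you have is to look for the PSI-MS controlled vocabulary terms that mark spectra as centroided (MS:1000127, "centroid spectrum") or profile (MS:1000128, "profile spectrum"). The sketch below is a rough text scan rather than a full XML parse, and the function name classify_mzml and its return labels are hypothetical.

```python
import re

CENTROID = "MS:1000127"  # PSI-MS term "centroid spectrum"
PROFILE = "MS:1000128"   # PSI-MS term "profile spectrum"


def classify_mzml(text: str) -> str:
    """Guess whether mzML content holds peak lists or profile data.

    Returns 'peak list', 'raw (profile)', 'mixed', or 'unknown'.
    """
    # Collect every controlled-vocabulary accession mentioned in the file.
    accessions = set(re.findall(r'accession="(MS:\d+)"', text))
    has_centroid = CENTROID in accessions
    has_profile = PROFILE in accessions
    if has_centroid and has_profile:
        return "mixed"
    if has_centroid:
        return "peak list"
    if has_profile:
        return "raw (profile)"
    return "unknown"
```

For large files you would read and scan in chunks rather than loading the whole document, but the terms to look for are the same.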

Complex search strategies

A Mascot Daemon task is a Mascot search with a single set of search parameters. It is possible to chain multiple tasks together with Follow-up tasks. A Follow-up task takes the data used in one task and re-searches it with a different set of conditions. It can search all the queries in a peak list or just the ones that did not obtain a sufficiently good match. Multiple Follow-up tasks can be chained together, creating a sieve approach to the analysis. This technique was put to good use in analysing a Histone dataset.
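The sieve idea can be sketched in a few lines of Python. This is a conceptual illustration only, not Daemon's implementation: each "pass" stands in for one task in the chain, the scoring functions and the single score threshold are hypothetical, and only queries that failed to reach the threshold are handed to the next pass.

```python
from typing import Callable, Dict, List, Sequence, Tuple

Scorer = Callable[[str], float]  # hypothetical: returns a match score for a query


def run_sieve(
    queries: Sequence[str],
    passes: Sequence[Tuple[str, Scorer]],
    threshold: float,
) -> Tuple[Dict[str, str], List[str]]:
    """Run each pass only on the queries left unmatched by earlier passes.

    Returns (matched, remaining): matched maps a query to the name of the
    pass that matched it, remaining lists queries no pass could match.
    """
    matched: Dict[str, str] = {}
    remaining = list(queries)
    for name, scorer in passes:
        still_unmatched = []
        for query in remaining:
            if scorer(query) >= threshold:
                matched[query] = name
            else:
                still_unmatched.append(query)
        remaining = still_unmatched  # only these go on to the next pass
    return matched, remaining
```

Each later pass sees a smaller set of queries, which is why a broad or slow search is usually placed towards the end of the chain.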

Other uses for Follow-up tasks include filtering, where matches from a search against a contaminants database are identified and removed prior to searching the main database, and building a strategy that searches both spectral libraries and protein and/or DNA sequence databases.


Figure 1: Histone analysis. Starting from task 28, the data file progresses upwards through the chain of Follow-up tasks.

The key to automation

Mascot Daemon allows you to easily automate a batch of searches, but you can go beyond just automating the peak picking and searching. The Auto-export button takes you to the configuration options for exporting results in any of the supported file formats. If you are using Mascot Distiller for the peak picking, you can also configure Daemon to automatically calculate the quantitation results and export them as XML files. Turn this feature on in the Daemon preferences, general tab. Daemon saves all of the exported files, along with any peak list file generated by the data import filter, into the “MGF directory”. The default location can be edited in the Daemon preferences, Data import filters dialog box.

The key to more advanced automation is the External processes dialog. From here you can call programs or scripts before or after a task or search. Daemon uses tokens to pass values such as file names and paths to the external application. This way you can trigger a program after every search is complete, for example to send an email or pass the results to another program for further analysis. We have helped customers build scripts for preprocessing peak lists prior to searching, and for copying and renaming results files after a search, ready for import into a lab database.
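As an illustration of the kind of script an external process might call, here is a minimal Python sketch that copies a results file into an archive directory under a new name. It assumes the results file path, the archive directory and the new label arrive as command line arguments, filled in by whichever Daemon tokens you configure; the script name, argument order and function names are hypothetical and are not part of Daemon.

```python
import shutil
import sys
from pathlib import Path


def archive_result(result_file: Path, archive_dir: Path, label: str) -> Path:
    """Copy result_file into archive_dir, renamed to label plus the original suffix."""
    archive_dir.mkdir(parents=True, exist_ok=True)
    target = archive_dir / f"{label}{result_file.suffix}"
    shutil.copy2(result_file, target)  # copy2 also preserves timestamps
    return target


def main(argv: list) -> int:
    # Expected (hypothetical) arguments: <result file> <archive dir> <label>
    if len(argv) != 3:
        print("usage: archive_result.py RESULT_FILE ARCHIVE_DIR LABEL", file=sys.stderr)
        return 2
    target = archive_result(Path(argv[0]), Path(argv[1]), argv[2])
    print(f"copied to {target}")
    return 0


if __name__ == "__main__" and len(sys.argv) > 1:
    sys.exit(main(sys.argv[1:]))
```

The original file is left in place, so the links to results stored in the task database should continue to work.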


Figure 2: An external task has been set up to run before each search; it calls a script to clean up the header information in a peak list file. The Auto-export feature has also been configured, so that button’s title is in bold too.

Tips for core labs and other multi-user environments

Daemon has a number of features that make it easier to use within a core laboratory. For example, it can be configured to allow running searches in the name of another user. Enable Mascot Server security and make the Daemon user a member of the PowerUsers group or a group with the security task “CLIENT: For Mascot Daemon, allow spoofing of another user”. The “Owner” field in the task editor tab then becomes a drop-down field. This allows a core lab member to run searches on behalf of customers or collaborators, who can then view their results but not rerun the search. Spoofing users and sharing results with collaborators have been covered in a previous blog article.


Figure 3: List of usernames that can be selected from to “own” the search.

Within the core lab there may be multiple computers running Mascot Daemon clients. These can be configured to share a common TaskDB database, which makes it easy to track which tasks are running and sending searches to Mascot Server, and also centralizes links to results. The Mascot Server search parameters used with a Daemon task are normally saved as a text file. You could save these files to a network share so that they are accessible from all the Daemon computers. Alternatively, Daemon allows you to save the search parameters directly in the TaskDB. Activate this feature in Daemon preferences, general tab.

Although Daemon was not originally designed to be used as the main interface to the search results, it is often used as such. If you are using a shared task database or storing the search parameters in it, it makes sense to back up the database regularly. You can do this through the ODBC connection dialog in Daemon preferences.

Anything else I should know?

If Mascot Server is running on Linux, there is no restriction on the size of the peak list. On Windows, the IIS web server restricts uploaded files to 2 GB. When both Daemon and Server are installed on the same Windows computer, Daemon can submit searches on the command line, bypassing the file size limit. This feature is activated by default and can be changed in the Daemon preferences, general tab.

One final feature that can save you a bit of time is the “Clone” button. Cloning a task is a quick way to set up a new task that has the same or very similar settings to an existing task. It is particularly useful if you frequently use a combination of Data import filters, auto exports and external processes.
