Mascot: The trusted reference standard for protein identification by mass spectrometry for 25 years

Mascot Cluster

Mascot Server has been designed to be embarrassingly parallel, which means that each search can be divided into parts that run independently, in separate threads. Support for parallel execution on a networked cluster of PCs is built into Mascot, and does not require any special operating system or grid engine.

Hardware

Mascot Server is licensed by the CPU. Each additional CPU in the licence enables searches to run on an additional 4 physical cores or 8 threads (sometimes called logical cores). Only the processors used for searching require a Mascot licence. It is a good idea to have a few spare cores to run the web server, handle database updates, generate reports, and so on. This keeps the server responsive even when several searches are running and using all the processor time on the licensed cores.

In single-thread benchmarks, the fastest processors tend to have at most 24 cores. With more than 24 cores, performance per core degrades because the processor has to limit the per-core clock speed: it simply cannot dissipate heat fast enough. This is sometimes called thermal throttling.

Single processors are available with 64+ cores, and some PCs have two processor sockets, but such systems are very expensive. The speed of their individual cores may not be as high as that of relatively inexpensive ‘consumer grade’ processors. Another issue is I/O throughput. Even with a very fast RAID10 array, either disk or RAM may become a bottleneck, which starves the processor of useful work.

For licences of 5 CPUs or more, we recommend running Mascot in cluster mode. A cluster of single- or dual-processor boxes will usually offer the most cost-effective solution, and cluster search speed is not subject to thermal throttling or I/O bottlenecks.

As an example, a 12 CPU licence is good for 48 cores or 96 threads. A cluster of 7 PCs, each with a single 8-core processor, has 56 cores in total: 48 for searching and 8 left over for other purposes. In practice, having a non-searching master node is ideal because it also gives you a spare node in case one of the search nodes has a hardware failure.
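The arithmetic in this example can be sketched in a few lines of Python. This is only an illustration of the sizing logic described above, not a Mascot tool: the function name `plan_cluster` and its parameters are hypothetical, and the only figures taken from the text are 4 physical cores and 8 threads per licensed CPU.

```python
import math

# Figures from the text: each licensed CPU covers 4 physical cores or 8 threads.
CORES_PER_LICENSED_CPU = 4
THREADS_PER_LICENSED_CPU = 8


def plan_cluster(licensed_cpus, cores_per_node, dedicated_master=True):
    """Estimate node counts for a licence size (hypothetical helper, illustrative only)."""
    if licensed_cpus < 1 or cores_per_node < 1:
        raise ValueError("licensed_cpus and cores_per_node must be positive")

    # Cores and threads that the licence allows searches to use.
    search_cores = licensed_cpus * CORES_PER_LICENSED_CPU
    search_threads = licensed_cpus * THREADS_PER_LICENSED_CPU

    # Nodes needed to supply the licensed cores, rounded up to whole machines.
    search_nodes = math.ceil(search_cores / cores_per_node)

    # Optionally add one non-searching master node.
    total_nodes = search_nodes + (1 if dedicated_master else 0)

    # Cores not covered by the licence, free for the web server, updates, reports.
    spare_cores = total_nodes * cores_per_node - search_cores

    return {
        "search_cores": search_cores,
        "search_threads": search_threads,
        "search_nodes": search_nodes,
        "total_nodes": total_nodes,
        "spare_cores": spare_cores,
    }


if __name__ == "__main__":
    # The 12 CPU example from the text, using 8-core machines.
    print(plan_cluster(licensed_cpus=12, cores_per_node=8))
```

For a 12 CPU licence and 8-core machines, the sketch gives 6 search nodes plus one master, matching the 7 PCs, 48 search cores, and 8 spare cores in the example. The model assumes identical nodes and that all spare cores sit on the master node.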

Result files are stored on the master node, so it needs access to plenty of disk storage. Search nodes only need local storage for program files and the compressed sequence database files.

Administration

Mascot Server is regularly updated as we add new functionality. Mascot updates need only be installed on the master node. Distribution of the program and database files to the search nodes is fully automatic, whether because of an update or because a node has been exchanged due to a hardware problem.

Mascot administration tools provide browser-based system status reports. These are continuously updated and show at a glance important parameters, such as processor usage and free disk space, for each of the cluster nodes. As an option, critical alerts can also be sent to the system administrator by email.

Turn-key Systems

We don’t supply turn-key systems. One reason is that they become very expensive, because we have to cover the cost of configuration, soak testing, shipping, on-site installation, and warranty. Another reason is that installing the software is the best way for a system administrator to become familiar with the system. If you really don’t want to deal with hardware, the simplest option is to run Mascot in the cloud.