Hosting Mascot Server on Amazon Web Services

We provide a turn-key Amazon Machine Image (AMI) for provisioning Mascot Server on Amazon Web Services (AWS). AWS is used as the example on this help page, but numerous other companies offer infrastructure as a service, such as Microsoft Azure and Google Cloud Platform. The basic requirements and Mascot setup are the same in each case, while details such as cost and the technical steps of setting up a VM vary by cloud platform.

For a general overview and discussion on hosting Mascot Server “in the cloud”, see Mascot Server in the cloud. If you are setting up Mascot Server on a different cloud platform, then after provisioning a suitable VM, you can simply install Mascot Server following the steps in the Installation & Setup manual.

Overview

It is very easy to host your Mascot Server on Amazon Web Services (AWS). This can be as secure and private as having a physical server in-house, possibly more so. The benefits of using the AWS cloud include:

  • Turn-key solution – Mascot Server is pre-installed together with several popular sequence databases.
  • Reduced up-front costs – Effectively, you rent the computer and only pay for the time you use.
  • High performance – Connectivity to the Internet for downloading databases is excellent and the AWS hardware uses fast processors and solid state disks.
  • Snapshots, cloud storage – Easy and secure data backup.
  • Convenience – Hardware maintenance and upgrading is outsourced to a third party.
  • Free data transfer to other virtual servers in the same AWS region – details in Overview of Data Transfer Costs for Common Architectures.

The instructions on this page are a much simplified, step-by-step guide to help someone without prior experience or specialist knowledge to get up and running. The procedures do not cover all eventualities and are no substitute for Amazon’s excellent documentation.

AWS pricing

AWS virtual servers are referred to as ‘instances’. Take a look at the pricing for the different types of instance. You will notice that Windows instances cost substantially more than Linux instances. Mascot is pre-installed and configured, and routine interaction is all through a web browser, making most aspects operating system independent. But, once in a while, you will need to do something at the ‘command line’. If there is no-one in your group familiar with basic Linux commands, it’s best to choose a Windows instance, even though the cost is higher.

AWS is divided into Regions, and there are small differences in pricing between the regions. One reason to choose a particular region is if you anticipate transferring large amounts of data to another AWS virtual server. There is no Data Transfer charge between two servers within the same region, but data transferred between AWS servers in different regions is charged as Internet Data Transfer on both sides of the transfer.

Mascot Server licence

If you already have a Mascot Server licence, and it is under warranty or support, there is no charge for moving it to Amazon Web Services cloud. If you wish to buy a new licence, the cost for a perpetual licence is the same whether it is in-house or in the cloud. Contact sales@matrixscience.com for full details.

A Mascot Server licence is locked to the primary MAC address of the computer. If you want to move to new hardware, you have to request a new product key and go through the registration procedure. For a ‘physical’ PC, this happens rarely. For a virtual PC, you have to be more careful, because each time you stop an instance on AWS EC2, it effectively ceases to exist. The procedure described on this page creates a persistent Network Interface and attaches this to the instance. This provides a fixed MAC address, which means you can stop and start your instance as you wish.

An Amazon virtual CPU (vCPU) is equivalent to a single logical processor. That is, 2 vCPUs are equivalent to a single physical core of a hyperthreaded Intel processor. Each CPU in your Mascot licence will use 8 vCPUs (4 physical cores) for searches. The most cost-effective instances for Mascot Servers are M5:

  • A 1-CPU Mascot licence will fully utilise an m5.2xlarge instance.
  • A 2-CPU Mascot licence will fully utilise an m5.4xlarge instance.
  • A 6-CPU Mascot licence will fully utilise an m5.12xlarge instance.

If your licence is 8 CPU or more, it is best to configure Mascot in cluster mode.
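
The sizing rule above can be sketched in a few lines of Python. This is an illustration only: the function name is hypothetical, the vCPU counts are the published figures for the M5 family, and the sketch ignores the recommendation to move to cluster mode at 8 CPU or more.

```python
# vCPU counts for the M5 sizes (published AWS instance specifications).
M5_VCPUS = {
    "m5.large": 2,
    "m5.xlarge": 4,
    "m5.2xlarge": 8,
    "m5.4xlarge": 16,
    "m5.8xlarge": 32,
    "m5.12xlarge": 48,
    "m5.16xlarge": 64,
    "m5.24xlarge": 96,
}

# From the text: each CPU in the Mascot licence uses 8 vCPUs for searches.
VCPUS_PER_LICENCE_CPU = 8


def smallest_m5_for_licence(licence_cpus):
    """Return the smallest M5 size whose vCPU count covers the licence.

    Hypothetical helper for illustration; not part of Mascot or AWS tooling.
    """
    if licence_cpus < 1:
        raise ValueError("licence_cpus must be at least 1")
    needed = licence_cpus * VCPUS_PER_LICENCE_CPU
    # Walk the sizes from smallest to largest and stop at the first fit.
    for name, vcpus in sorted(M5_VCPUS.items(), key=lambda item: item[1]):
        if vcpus >= needed:
            return name
    raise ValueError(f"{needed} vCPUs exceeds the largest M5 size")


print(smallest_m5_for_licence(1))  # m5.2xlarge
print(smallest_m5_for_licence(6))  # m5.12xlarge
```

A licence size that falls between two instance sizes, such as 3 CPU (24 vCPUs), is rounded up to the next size, leaving some vCPUs free for reports and other processes.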

Preliminaries

Unless you already have an account, sign up to Amazon Web Services. Note that account activation can take up to two hours and you cannot create an instance during this time.

Amazon EC2 uses public key cryptography to secure command line access. Unless you are an existing user of AWS, and already have a suitable key pair, you need to create one now.

  • From the AWS Management Console, choose EC2 (Elastic Compute Cloud).
  • Choose US East (N. Virginia) from the drop-down list in the top menu bar, next to your user name.
  • In the Navigation pane, under Network and Security, click Key Pairs.
  • The Key Pairs page displays your Amazon EC2 key pairs. If you haven’t created any yet, the list is empty, and instead shows the Create Key Pair button.
  • Choose Create Key Pair.
  • Type a key pair name, and click Create. It doesn’t matter what you name it, but make it something you can easily remember.
  • The key pair is created, and the download of your private key begins. It will be called name.pem (Linux-OpenSSH) or name.ppk (Windows-PuTTY), where name represents the name you gave to your key pair. PEM format can be easily converted to PPK and vice-versa.
  • Keep this key pair safe. You will need it for command line access to your Mascot Server via SSH (Linux) or Remote Desktop (Windows).

Create your Mascot Cloud

In the AWS Management Console, search for CloudFormation.

  • Choose US East (N. Virginia) from the drop-down list in the top menu bar, next to your user name.
  • Choose Create Stack.
  • Under Specify template, go to ‘Amazon S3 URL’ and paste one of these URLs:
    • Linux: https://matrix-science-templates.s3.amazonaws.com/Mascot-2.8-Linux.template
    • Windows: https://matrix-science-templates.s3.amazonaws.com/Mascot-2.8-Windows.template
  • Choose Next.
  • Stack name: Enter a short name for the server to be displayed in AWS, e.g. Mascot-Server.
  • CPU: Enter the number of CPU in your Mascot licence (The maximum size for a single instance is 8 CPU. Refer to the cluster section for 8 CPU or more.)
  • KeyName: Select the key pair you created earlier.
  • SSHLocation: Enter the external IP address(es) from which you will access the new Mascot Server in CIDR format. For example, a single IP address would be entered as ‘83.217.111.202/32’. If your PC is on a LAN, it is likely that the external IP address will be different from the internal IP address. The easiest way to discover your external IP address is to visit a web page such as What Is My IP?
  • Storage: Specify storage in GB, with a minimum of 300GB.
  • Choose Next and then Next again.
  • If you wish, you can follow the Estimate cost link to see your estimated monthly bill.
  • Choose Create Stack.
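
Before submitting the form, it can help to check the values against the constraints listed above. The following is a minimal sketch, assuming that the field labels shown in the console (CPU, KeyName, SSHLocation, Storage) are also the parameter keys of the template; the function name is hypothetical. The returned list has the key/value shape that CloudFormation uses for stack parameters.

```python
import ipaddress


def build_stack_parameters(cpu, key_name, ssh_location, storage_gb):
    """Validate the form values and return them as a parameter list.

    Hypothetical helper; the parameter keys are assumed to match the
    field labels shown on the CloudFormation page.
    """
    if not 1 <= cpu <= 8:
        raise ValueError("CPU must be between 1 and 8 for a single instance")
    if not key_name:
        raise ValueError("KeyName must not be empty")
    if storage_gb < 300:
        raise ValueError("Storage must be at least 300 GB")
    # ip_network raises ValueError for malformed CIDR, including an
    # address with host bits set such as 83.217.111.202/24.
    network = ipaddress.ip_network(ssh_location, strict=True)
    if network.prefixlen == 0:
        raise ValueError("SSHLocation must not be open to every address")
    values = {
        "CPU": str(cpu),
        "KeyName": key_name,
        "SSHLocation": str(network),
        "Storage": str(storage_gb),
    }
    return [{"ParameterKey": k, "ParameterValue": v} for k, v in values.items()]


for parameter in build_stack_parameters(2, "my-key", "83.217.111.202/32", 300):
    print(parameter)
```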

Progress messages appear on the CloudFormation Events tab. Once you see the green message ‘CREATE_COMPLETE’, you can select the Outputs tab and make a browser bookmark for the link to your new Mascot Server home page.

When it first boots up, the server has to complete some initial configuration steps. You can check the status by following the link for the instance on the Resources tab. It’s best to wait until the Status checks show ‘2/2 checks passed’ before proceeding further. For a Windows instance, it may be several minutes longer before the server responds.

  • To register your product key, follow the Register product key link in the Outputs tab.
  • Choose Register online now and fill out the registration form.
  • On the confirmation page, choose save now to save the licence file to the server. A copy of the licence file will be sent by email.
  • Choose View Database Status. Initially, the page will display ‘Starting up’. After a short time, the databases that are currently configured will be displayed. When these show Status as ‘In use’, the system is ready for use.

If you are new to Mascot Server, the HTML help pages provide user documentation, dealing with searches and reports.

  • The Mascot Server Installation and Setup manual, linked from the Mascot home page, covers configuration topics, and is mainly for system administrators.
  • Read the Security section, below, and decide how you wish to control access to your Mascot Server

Stopping and starting the server

If you want to stop your Mascot Server, to avoid paying the hourly charges when it is not in use:

  • From the AWS Management Console, choose EC2 (Elastic Compute Cloud)
  • Under Instances, select Instances
  • Select the Mascot Server instance by means of a checkbox
  • From the Actions menu, choose Instance State, Stop and confirm

To re-start your Mascot Server:

  • From the AWS Management Console, choose EC2 (Elastic Compute Cloud)
  • Under Instances, select Instances
  • Select the Mascot Server instance by means of a checkbox
  • From the Actions menu, choose Instance State, Start and confirm
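
The same stop and start actions can be scripted. The sketch below assumes you pass in an EC2 client object from the AWS SDK for Python (for example, the object returned by boto3.client("ec2", region_name="us-east-1")); the function name and the instance ID in the comment are hypothetical.

```python
def set_instance_running(ec2_client, instance_id, running):
    """Start (running=True) or stop (running=False) one EC2 instance.

    ec2_client is expected to provide the start_instances and
    stop_instances calls of the AWS SDK EC2 client. Returns the name of
    the state the instance is moving into, e.g. 'pending' or 'stopping'.
    """
    if running:
        response = ec2_client.start_instances(InstanceIds=[instance_id])
        changes = response["StartingInstances"]
    else:
        response = ec2_client.stop_instances(InstanceIds=[instance_id])
        changes = response["StoppingInstances"]
    return changes[0]["CurrentState"]["Name"]


# Example call, with a made-up instance ID:
#   set_instance_running(client, "i-0123456789abcdef0", running=False)
```

Stopping only works as described on this page because the network interface, and therefore the MAC address, persists while the instance is stopped.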

AWS security (firewall) configuration

If you followed the advice on configuring the security group of your Mascot Server instance, HTTP access is only possible from your current IP address. If you are on an organisation LAN, this is likely to use network address translation, so that all communication with the Internet uses this same IP address, or a narrow range of IP addresses, making access to the server possible from other computers on the same LAN, but not from other locations.

You now need to decide how to control access to your Mascot Server. If the number of users is small, using IP address filtering may be the right solution. You can open up access to other specific IP addresses by adding them to the security group rules, but you must never open it up to ‘the world’. Most aspects of Mascot Server configuration are controlled through web pages, and you don’t want a malicious stranger making changes or trying to hack your server. In any case, this would be in breach of your Mascot Server End-User Licence Agreement, which is specific to your immediate organisation.

If you intend to submit searches from arbitrary locations, such as public WiFi hotspots, then you must enable Mascot security before opening up IP address filtering for HTTP to ‘the world’. You will also need to enable Mascot security if there are multiple users of the system who need to be allocated different levels of access, such as administrator or power user or guest.

Command line or remote desktop access to the server is rarely required, and if a malicious stranger gains such access, you have lost control of your server. Even if you enable Mascot security and open up HTTP access to ‘the world’, you should never do this for SSH/Windows remote desktop access. For more information on AWS security, see Amazon’s documentation.

The source addresses in security group rules are specified in CIDR format. For example, a single IP address would be entered as ‘83.217.111.202/32’, while ‘83.217.111.0/24’ would represent all addresses in the range ‘83.217.111.0’ to ‘83.217.111.255’. The easiest way to discover your external IP address is to visit a web page such as What Is My IP?. If you suspect that your LAN uses a range of IP addresses, you’ll need to ask your IT support group for details.
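
If CIDR notation is unfamiliar, Python's standard ipaddress module is a convenient way to see what a given block covers. This short example uses the two blocks mentioned above.

```python
import ipaddress

single = ipaddress.ip_network("83.217.111.202/32")
block = ipaddress.ip_network("83.217.111.0/24")

# A /32 block contains exactly one address.
print(single.num_addresses)  # 1

# A /24 block runs from .0 to .255.
print(block[0], block[-1])  # 83.217.111.0 83.217.111.255

# Membership test: is a particular address covered by the block?
print(ipaddress.ip_address("83.217.111.202") in block)  # True
```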

To modify the firewall settings:

  • From the AWS Management Console, choose EC2 (Elastic Compute Cloud).
  • Under Network & security, select Security groups.
  • Select the security group for your instance.
  • Usually, outbound rules can be unrestricted. That is, all ports to all destinations allowed.
  • Only selected inbound ports should be open.
    • TCP Port 80 needs to be open to access the Mascot Server web pages.
    • TCP Port 443 needs to be open if you want to use HTTPS. The AMI does not have HTTPS enabled, so this is something you would need to configure separately in the web server settings.
    • TCP Port 22 needs to be open on a Linux server for SSH, but the source should be very restricted (system administrators only).
    • TCP Port 3389 needs to be open on a Windows server for Windows remote desktop, but the source should be very restricted (system administrators only).
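
As a sanity check on a planned set of inbound rules, the sketch below flags any rule that exposes SSH or Remote Desktop to every address. The representation of a rule as a (port, CIDR) pair and the function name are assumptions made for illustration; they are not an AWS data format.

```python
import ipaddress

# Administrative ports that should never be open to 'the world'.
ADMIN_PORTS = {22: "SSH", 3389: "Remote Desktop"}


def find_risky_rules(inbound_rules):
    """Return a message for each admin port that is open to every address.

    inbound_rules is a list of (tcp_port, source_cidr) pairs.
    """
    problems = []
    for port, cidr in inbound_rules:
        network = ipaddress.ip_network(cidr)
        # A prefix length of 0 (0.0.0.0/0 or ::/0) matches every address.
        if port in ADMIN_PORTS and network.prefixlen == 0:
            problems.append(f"{ADMIN_PORTS[port]} (TCP {port}) is open to {cidr}")
    return problems


rules = [(80, "83.217.111.0/24"), (22, "83.217.111.202/32"), (3389, "0.0.0.0/0")]
print(find_risky_rules(rules))  # ['Remote Desktop (TCP 3389) is open to 0.0.0.0/0']
```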

SSH connection to a Linux server

A Linux Mascot Server instance is created with password-based remote login and logging in as root both disabled. You can change this, but initially you must log in using your Amazon EC2 key pair as ec2-user. No password is set for root, so you can use sudo as required.

From a Linux client, you will probably connect from a shell prompt using SSH. From a Windows client, PuTTY is free, convenient and well documented.

You can also connect from some web browsers using a Java client.

  • From the AWS Management Console, choose EC2 (Elastic Compute Cloud).
  • Under Instances, select Instances.
  • Select the Mascot Server instance by means of a checkbox.
  • From the Actions menu, choose Connect.

Remote desktop connection to a Windows server

  • From the AWS Management Console, choose EC2 (Elastic Compute Cloud).
  • Under Instances, select Instances.
  • Select the Mascot Server instance by means of a checkbox.
  • From the Actions menu, choose Connect.
  • Choose Get password.
  • Choose the key pair specified when the instance was created.
  • Choose Decrypt password to get the plain text Administrator password.
  • There is also an option to download a shortcut file for the Remote Desktop connection.
  • On the first connection, when asked whether you want the PC to be discoverable, choose No unless this is a Mascot cluster search node on a private subnet.

Terminating the server

If you no longer require your Mascot Server, and wish to delete all components:

  • From the AWS Management Console (the orange cube at the top left), choose CloudFormation (under Deployment and Management).
  • Choose US East (N. Virginia) from the drop-down list in the top menu bar, next to your user name.
  • Select the Mascot Server stack.
  • Choose Delete Stack and confirm.
  • When deletion is complete, from the AWS Management Console, choose EC2 (Elastic Compute Cloud).
  • Under Elastic Block Store, choose Volumes.
  • Select the volume by means of a checkbox.
  • From the Actions menu, choose Delete Volumes and confirm.

AWS Mascot Cluster configuration

Mascot cluster mode is supported for both Linux and Windows, but you cannot have a mixture of the two operating systems in a single cluster.

The most cost-effective instances for Mascot Servers are M5, both for standalone and cluster mode. In theory, an m5.24xlarge instance with 96 vCPUs would be sufficient for a 12-CPU Mascot Server licence, but this would leave no cores free for reports and other processes. So, it’s best to switch to cluster mode for 8 CPU or more. There are many possible combinations. For example: a 16-CPU licence could comprise 2 x m5.16xlarge, which are configured as 8-CPU search nodes, and one m5.8xlarge, acting as the master or head node.
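
The arithmetic behind such a combination can be sketched as follows. The function name is hypothetical, and the requirement that the licence divides evenly across identical search nodes is a simplification made for this example, not a Mascot rule.

```python
# From the licence section: each licence CPU uses 8 vCPUs for searches.
VCPUS_PER_LICENCE_CPU = 8


def search_node_plan(licence_cpus, cpus_per_node):
    """Return (node_count, vcpus_per_node) for identical search nodes."""
    if licence_cpus < 1 or cpus_per_node < 1:
        raise ValueError("CPU counts must be at least 1")
    if licence_cpus % cpus_per_node:
        raise ValueError("licence CPUs must divide evenly across the nodes")
    node_count = licence_cpus // cpus_per_node
    return node_count, cpus_per_node * VCPUS_PER_LICENCE_CPU


# 16-CPU licence split into 8-CPU search nodes: two nodes of 64 vCPUs,
# which matches the 2 x m5.16xlarge example.
print(search_node_plan(16, 8))  # (2, 64)
```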

The CloudFormation template creates a Mascot Server instance within a Placement group. You should create the search nodes within the same placement group so as to ensure the best network latency and throughput for communication between the nodes. Note that communication between nodes should use private IP addresses. Traffic on public IP addresses is limited to 5 Gbps at the time of writing.

The VPC created by the CloudFormation template is 10.20.28.0/22. Within this VPC, the address range 10.20.28.0/24 is a ‘public’ subnet with an Internet gateway. The cluster nodes can go into the same subnet, which would allow them to be accessed directly from the Internet, or they can be placed in a private subnet, as illustrated here. This is mainly a security matter, and is discussed in the Amazon VPC User Guide.

We recommend creating a private subnet for the cluster nodes that is not accessible from the public Internet. To create a private subnet:

  • From the AWS Management Console (the orange cube at the top left), choose VPC (under Networking & Content delivery).
  • Choose US East (N. Virginia) from the drop-down list in the top menu bar, next to your user name.
  • Choose Subnets, Create Subnet.
  • Name tag: Best to assign a name, e.g. ‘Search nodes’.
  • VPC: Select the Mascot VPC created by the CloudFormation template.
  • Availability zone: Select the same zone as existing Mascot Server public subnet (shown in the subnet listing behind the pop-up dialog).
  • IPv4 CIDR block: Can be any address range within the VPC that is outside the public subnet, e.g. 10.20.29.0/24.
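
A candidate CIDR block for the private subnet can be checked against the two constraints above (inside the VPC, outside the public subnet) before you enter it. This is a small sketch using Python's standard ipaddress module; the function name is hypothetical.

```python
import ipaddress

# Address ranges created by the CloudFormation template (from the text).
VPC = ipaddress.ip_network("10.20.28.0/22")
PUBLIC_SUBNET = ipaddress.ip_network("10.20.28.0/24")


def is_valid_private_subnet(cidr):
    """True if cidr lies inside the VPC and does not touch the public subnet."""
    candidate = ipaddress.ip_network(cidr)
    return candidate.subnet_of(VPC) and not candidate.overlaps(PUBLIC_SUBNET)


# The /22 VPC divides into four /24 blocks; the first is the public subnet.
print([str(net) for net in VPC.subnets(new_prefix=24)])
# ['10.20.28.0/24', '10.20.29.0/24', '10.20.30.0/24', '10.20.31.0/24']

print(is_valid_private_subnet("10.20.29.0/24"))  # True
print(is_valid_private_subnet("10.20.28.0/25"))  # False, overlaps the public subnet
```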

AWS Mascot Cluster using Linux

1. Create the Mascot Server master node using the CloudFormation procedure.

2. Create a new key pair for communication between the cluster nodes so that your private key for communication with the master doesn’t have to be copied to the master. We’ll refer to this new key pair as the ‘cluster key pair’.

3. Create search node instance(s) using an Amazon Linux AMI.

  • From the AWS Management Console, choose EC2 (Elastic Compute Cloud).
  • Choose US East (N. Virginia) from the drop-down list in the top menu bar, next to your user name.
  • Choose Launch instance.
  • Select Amazon Linux 2 (HVM), SSD Volume Type.
  • Select an M5 instance type with the required number of vCPUs and choose Next: Configure instance details.
  • Number of instances: If you will be creating multiple, identical search nodes, you can create them all at once.
  • Network: Select the Mascot VPC.
  • Subnet: If you created a private subnet, select this.
  • Placement group: Select the one created for the Mascot Server instance.
  • Choose Next: Add storage.
  • Increase the root volume to something suitable for the full set of sequence databases you intend to configure. In most cases, 500 GB will be sufficient.
  • Choose Next: Add tags and assign a name and any other tags you require.
  • Choose Next: Configure security group.
  • Create a new security group allowing All TCP inbound from 10.20.28.0/22, but nothing else. For an instance within a private subnet of a VPC and without a public IP address, these settings are ‘belt and braces’.
  • Choose Review and launch and, if everything looks OK, Launch.
  • In the Select key pair dialog, select the cluster key pair.

4. For each search node, enable SSH from the master as root.

  • Connect to the master using SSH.
  • Copy the private cluster key to /root/.ssh/id_rsa, chown to root:root and chmod 600.
  • Copy the private cluster key to /home/ec2-user/.ssh/id_rsa, chown to ec2-user:ec2-user and chmod 600.
  • SSH from the master to the search node using the private IP address (e.g. 10.20.29.160).
  • On search node:
    • sudo vi /etc/ssh/sshd_config
    • Change PermitRootLogin forced-commands-only to PermitRootLogin without-password and save.
    • sudo service sshd reload
    • sudo vi /root/.ssh/authorized_keys
    • Remove text from the beginning of the file so that it starts with ssh-rsa and save.
  • Exit back to the master.
  • ssh root@10.20.29.160 (or whatever) to verify that you can log into the search node as root.
  • Exit the SSH session.
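
If you have several search nodes, the two file edits in this step can be expressed as text transformations rather than typed by hand in vi. The sketch below works on the file contents as strings; reading and writing the files with root privileges, and reloading sshd, are left to you. The function names are hypothetical.

```python
def allow_root_key_login(sshd_config_text):
    """Switch PermitRootLogin from forced-commands-only to without-password."""
    lines = []
    for line in sshd_config_text.splitlines():
        # Match the active directive only; commented-out lines are left alone.
        if line.split() == ["PermitRootLogin", "forced-commands-only"]:
            line = "PermitRootLogin without-password"
        lines.append(line)
    return "\n".join(lines) + "\n"


def strip_key_prefix(authorized_keys_text):
    """Remove any text placed before 'ssh-rsa' on each key line."""
    lines = []
    for line in authorized_keys_text.splitlines():
        start = line.find("ssh-rsa ")
        # start > 0 means something precedes the key type on this line.
        lines.append(line[start:] if start > 0 else line)
    return "\n".join(lines) + "\n"


# Made-up file contents for illustration.
print(allow_root_key_login("Port 22\nPermitRootLogin forced-commands-only\n"))
print(strip_key_prefix('command="echo Please login as ec2-user" ssh-rsa AAAAexample cluster-key\n'))
```

Keep a backup copy of both files before overwriting them, since a mistake in sshd_config can lock you out of the node.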

5. If the master node is also a search node, you need to enable the master to SSH to itself. Otherwise, skip this step.

  • sudo vi /root/.ssh/authorized_keys.
  • Remove text from the beginning of the file so that it starts with ssh-rsa.
  • Copy the public cluster key from authorized_keys on a search node, paste it on a new line after the last key, and save.
  • ssh root@10.20.28.195 (or whatever) to verify that you can log into the master node as root.
  • Exit the SSH session.

6. Re-configure Mascot from single server mode to cluster mode. Refer to Chapter 11 of the Mascot Installation & Setup manual for details; the following are just the essentials.

  • Kill ms-monitor.exe.
  • Edit the cluster section of mascot.dat.
    • change Enabled to 1.
    • leave MasterComputerName set to the private IP address of the master.
  • Create a nodelist.txt file using the private IP addresses of the search nodes in both the IP address and host name fields.
  • If the master node is also a search node, you can use a link to avoid having to copy the sequence database files to the search node directory structure. Something like sudo ln -s /usr/local/mascot/sequence /usr/local/mascot/searchnode.
  • Start ms-monitor.exe (must use sudo and be in the Mascot bin directory).

AWS Mascot Cluster using Windows

1. Create the Mascot Server master node using the CloudFormation procedure.

2. Create search node instance(s) using an Amazon Windows AMI.

  • From the AWS Management Console, choose EC2 (Elastic Compute Cloud).
  • Choose US East (N. Virginia) from the drop-down list in the top menu bar, next to your user name.
  • Choose Launch instance.
  • Select Microsoft Windows Server 2022 Base.
  • Select an M5 instance type with the required number of vCPUs and choose Next: Configure instance details.
  • Number of instances: If you will be creating multiple, identical search nodes, you can create them all at once.
  • Network: Select the Mascot VPC.
  • Subnet: Select the private subnet.
  • Placement group: Select the one created for the Mascot Server instance.
  • Choose Next: Add storage.
  • Increase the root volume to something suitable for the full set of sequence databases you intend to configure. In most cases, 500 GB will be sufficient.
  • Choose Next: Add tags and assign a name and any other tags you require.
  • Choose Next: Configure security group.
  • Create a new security group allowing All TCP inbound from 10.20.28.0/22 and All ICMP-IPv4 inbound from 10.20.28.0/22.
  • Choose Review and launch and, if everything looks OK, Launch.
  • In the Select key pair dialog, select the same key pair as used for the master node.

3. Update the computer name in the Windows registry.

  • Open a Remote Desktop connection to the master node.
  • Open a command window on the desktop.
  • In the command window, type regedit then press enter.
    • Locate and select the following registry subkey: HKEY_LOCAL_MACHINE\SOFTWARE\MatrixScience\Mascot\Installer\Properties.
    • Right-click MASCOT_WEB_URL, and then click Modify.
    • In the Value data box, change the hostname from EC2AMAZ-818TUQJ to the public IP address.
    • Exit Registry Editor.

4. Add the public IP address to the security group.

  • From the AWS Management Console, choose EC2 (Elastic Compute Cloud).
  • Choose US East (N. Virginia) from the drop-down list in the top menu bar, next to your user name.
  • Under Network & security, select Security groups.
  • Select the security group for your instance.
  • Add a rule to allow inbound HTTP (TCP port 80) from the public IP address of the master.

5. All dedicated search nodes require the following changes before Mascot can be switched to cluster mode. This does not include the master, even if this is also a search node. Refer to Chapter 11 of the Mascot Installation & Setup manual for additional details.

  • Open a Remote Desktop connection to the master node.
  • Open Remote Desktop connections from the master to each search node in turn. Initially, you will have to decrypt and use a different Administrator password for each node. Important: on the first connection to a dedicated search node, when asked whether you want the PC to be discoverable, choose Yes.
  • Turn off Windows firewall or configure it as described in Chapter 2 of the manual.
  • Open a command prompt (cmd.exe) on the desktop.
    • Enter net user Administrator "new_password" where new_password is the Administrator password for the master.
    • Type regedit then press enter.
    • Locate and select the following registry subkey: HKEY_LOCAL_MACHINE\SOFTWARE\Microsoft\Windows\CurrentVersion\Policies\System.
    • On the Edit menu, point to New, and then click DWORD Value.
    • Type LocalAccountTokenFilterPolicy, and then press ENTER.
    • Right-click LocalAccountTokenFilterPolicy, and then click Modify.
    • In the Value data box, type 1, and then click OK.
    • Exit Registry Editor.

6. On the master node, switch Mascot from single server mode to cluster mode.

  • Open Control Panel and choose Programs & Features.
  • Select Mascot Server and choose Change.
  • In the wizard, choose Next then Change.
  • Choose Next to display the Cluster configuration tab.
  • Check Enable cluster mode.
  • Configure the cluster as described in the manual, Chapter 11.
  • The mascotnode directory goes on drive C. Enter a UNC path like \\10.20.29.37\c$\mascotnode.
  • Once the installation is complete, change the Matrix Science Mascot Service to run under the Administrator account as described in the manual.
  • Start the Mascot Monitor Service.