UniMod
 
 
Unimod Help

Cross-link entries

Cross-links are named with the prefix Xlink:, e.g. Xlink:DST. Most database search engines cannot handle intact cross-links, unless they happen to be 'intra' to a single peptide. Many Unimod entries are for monolinks, also known as dead-ends, which can be treated as conventional modifications.

Cleavable links, especially CID cleavable, are becoming more widely used. Software designed to handle these requires information for both the intact link and its related cleavage products. To support new functionality without breaking existing applications, all the information for a given cross-link is contained in a single 'master' entry for the intact link. For example, look at the entry for Xlink:DSSO, a CID cleavable cross-link.

[Figure: Unimod entry for Xlink:DSSO]

The modification composition is the delta of the intact cross-link, H(6) C(6) O(3) S. There are two specificities, K and Protein N-term, although only K is shown above. An empty neutral loss, with mass 0, corresponds to the intact cross-link. There are three neutral losses for monolinks, in which the free end is quenched by water, ammonia, or tris. The water-quenched monolink (the free acid) has a neutral loss of H(-2) O(-1). Because the counts are negative, subtracting this loss from the modification composition adds water, giving the monolink composition H(8) C(6) O(4) S and a mass of 176.0143.
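The subtraction described above can be checked with a few lines of Python. This is a minimal sketch, not part of Unimod: the function names and the dictionary representation of a composition are assumptions made for illustration, and the monoisotopic element masses are standard reference values rather than values taken from this page.

```python
# Monoisotopic masses of the most abundant isotopes (assumed reference values).
MONO_MASS = {"H": 1.00782503207, "C": 12.0, "O": 15.99491462, "S": 31.97207100}


def subtract(composition, neutral_loss):
    """Return composition minus neutral_loss, dropping elements that cancel out."""
    result = dict(composition)
    for element, count in neutral_loss.items():
        result[element] = result.get(element, 0) - count
    return {element: n for element, n in result.items() if n != 0}


def mono_mass(composition):
    """Sum the monoisotopic masses of all atoms in the composition."""
    return sum(MONO_MASS[element] * n for element, n in composition.items())


intact = {"H": 6, "C": 6, "O": 3, "S": 1}   # modification composition, H(6) C(6) O(3) S
water_quenched_loss = {"H": -2, "O": -1}     # neutral loss H(-2) O(-1)

monolink = subtract(intact, water_quenched_loss)
print(monolink)                        # {'H': 8, 'C': 6, 'O': 4, 'S': 1}
print(round(mono_mass(monolink), 4))   # 176.0143
```

Subtracting the empty neutral loss leaves the composition unchanged, which is why it stands for the intact link.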

Conventionally, a Unimod neutral loss applies only to the MS/MS fragments, and multiple neutral losses mean that peaks for any or all of them may be observed. Under the new behaviour, when the classification is a cross-link, each neutral loss is subtracted from the modification composition to create a new modification representing the intact link, a link fragment, or a monolink. That is, the neutral losses represent different moieties and apply to both precursor and fragments.

In the master entry for DSSO, there are three further neutral losses that correspond to CID cleavage of the link. The link cleaves asymmetrically into alkene and sulfenic acid fragments. The sulfenic acid fragment can then lose water to give a thiol fragment. For cross-link aware software, the compositions of these fragments can be obtained by subtracting the neutral loss compositions from the modification composition.

The master entry also encodes how these fragments can be paired. An arbitrary single-letter code is assigned to each fragment. In this case, A for alkene, S for sulfenic acid, and T for thiol. The 'Pairs with' field contains the partner codes, e.g. A can pair with both S and T but S and T can only pair with A. This is important because an application may screen MS2 spectra looking for these so-called signature mass pairs.
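As an illustration of how an application might use the pairing information, the sketch below turns the DSSO codes into the set of fragment pairs to screen for. The dictionary is a hand-written representation of the 'Pairs with' field, not the format stored in the file, and the check that every pairing is listed in both directions is an assumption suggested by the DSSO example rather than a documented rule.

```python
# Fragment codes and partners as described for the DSSO master entry:
# A = alkene, S = sulfenic acid, T = thiol.
PAIRS_WITH = {"A": {"S", "T"}, "S": {"A"}, "T": {"A"}}


def signature_pairs(pairs_with):
    """Return the unordered fragment pairs, insisting that each pairing is mutual."""
    pairs = set()
    for code, partners in pairs_with.items():
        for partner in partners:
            if code not in pairs_with.get(partner, set()):
                raise ValueError(f"{code} lists {partner}, but not the reverse")
            pairs.add(frozenset((code, partner)))
    return pairs


pairs = signature_pairs(PAIRS_WITH)
print(sorted("".join(sorted(pair)) for pair in pairs))   # ['AS', 'AT']
```

Each resulting pair corresponds to one mass difference that could be looked for between peaks in an MS2 spectrum.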

To support applications that are not aware of this new behaviour, there are separate, simple Unimod entries for each modification composition. For example, Xlink:DSSO[158] for the intact link and Xlink:DSSO[54] for the alkene fragment. The number in square brackets is the mass of the delta, rounded to the nearest integer.
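The naming rule can be expressed in one line. The helper below is hypothetical and only shows how the bracketed number follows from the delta mass; the two masses used are the intact DSSO delta, calculated from H(6) C(6) O(3) S, and the water-quenched monolink mass quoted earlier.

```python
def simple_entry_name(master_name, delta_mass):
    """Append the delta mass, rounded to the nearest integer, in square brackets."""
    return f"{master_name}[{round(delta_mass)}]"


print(simple_entry_name("Xlink:DSSO", 158.0038))   # Xlink:DSSO[158]
print(simple_entry_name("Xlink:DSSO", 176.0143))   # Xlink:DSSO[176]
```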

The same approach applies to other types of cleavable linker. For example, Xlink:DTBP is the master entry for a chemically cleavable linker containing a disulfide bridge, and contains neutral losses corresponding to the intact and cleaved links. In addition, there are separate entries for standard applications: Xlink:DTBP[172] and Xlink:DTBP[87].

For software developers

For most cross-links, there is a master entry in Unimod that contains all the information needed by a cross-link aware application, e.g. Xlink:DSSO. If a specificity classification is one of the four cross-link classifications listed below, each neutral loss composition is subtracted from the modification composition to create the composition of the intact link or a link fragment or a monolink. For all other specificity classifications, the existing behaviour still applies, and a NeutralLoss element defines a shift in the MS/MS fragment peaks.

There are separate Unimod entries to enable standard software to calculate the masses of intact links, link fragments, and monolinks, e.g. Xlink:DSSO[176]. Cross-link aware software can simply ignore these entries, which are easily identified by a name that begins with Xlink: and ends with a closing square bracket.
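A cross-link aware application could skip the simple entries with a name test like the one below. This is a sketch only: the function name and the list of titles are made up for the example, and how the titles are read from the file is left out.

```python
def is_simple_xlink_entry(title):
    """True for names such as Xlink:DSSO[176], i.e. Xlink: prefix and a trailing ']'."""
    return title.startswith("Xlink:") and title.endswith("]")


titles = ["Xlink:DSSO", "Xlink:DSSO[158]", "Xlink:DSSO[176]", "Xlink:DTBP[87]"]
kept = [title for title in titles if not is_simple_xlink_entry(title)]
print(kept)   # ['Xlink:DSSO']
```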

The schema for the Unimod XML file, unimod_2.xsd, required a minor version update from 2.0 to 2.1 to add three new attributes to the NeutralLoss element. All three are optional and of type xs:string:

  • description
  • code
  • pairs_with
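Because the three attributes are optional, a reader should expect any of them to be absent. The sketch below parses a hypothetical, heavily simplified XML fragment with Python's standard library. Only the element name NeutralLoss and the three attribute names come from the text above; the enclosing element, the absence of namespaces, and the attribute values (including the way partners are written in pairs_with) are placeholders, so the real file layout should be checked against the schema.

```python
import xml.etree.ElementTree as ET

# Hypothetical fragment for illustration; not copied from unimod_xl.xml.
SNIPPET = """
<specificity>
  <NeutralLoss description="alkene fragment" code="A" pairs_with="placeholder"/>
  <NeutralLoss description="intact link"/>
  <NeutralLoss/>
</specificity>
"""

NEW_ATTRIBUTES = ("description", "code", "pairs_with")

root = ET.fromstring(SNIPPET)
losses = []
for loss in root.iter("NeutralLoss"):
    # Element.get returns None when an optional attribute is missing.
    losses.append({name: loss.get(name) for name in NEW_ATTRIBUTES})

for info in losses:
    print(info)
```

Entries written for the 2.0 schema carry none of these attributes, so all three values would come back as None for them.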

Four new classifications were added:

  • Cross-link
  • CID cleavable cross-link
  • Photo cleavable cross-link
  • Other cleavable cross-link
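The rule for interpreting a neutral loss can be sketched as a simple lookup on the specificity classification. The function name and the two return labels are invented for this example, and it assumes the classification strings appear in the file exactly as written in the list above.

```python
CROSS_LINK_CLASSIFICATIONS = {
    "Cross-link",
    "CID cleavable cross-link",
    "Photo cleavable cross-link",
    "Other cleavable cross-link",
}


def neutral_loss_role(classification):
    """Say how a neutral loss should be interpreted for a given classification."""
    if classification in CROSS_LINK_CLASSIFICATIONS:
        # Subtract from the modification composition to get a new moiety.
        return "moiety"
    # Existing behaviour: a shift in the MS/MS fragment peaks.
    return "fragment_shift"


print(neutral_loss_role("CID cleavable cross-link"))   # moiety
print(neutral_loss_role("Post-translational"))         # fragment_shift
```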

In theory, a minor version update should not break existing applications. To be on the safe side, however, the unimod.xml file available on the download page does not contain the new 'master' cross-link entries and will validate against the earlier 2.0 schema. If you are developing cross-link aware software, a second file, unimod_xl.xml, which includes the master entries, is also available for download.

The forms on this web site don't provide a user interface to all elements of the schema. For example, there is a PepNeutralLoss element as well as a NeutralLoss element. The NeutralLoss element is for losses from MS/MS fragments, while the PepNeutralLoss element is for losses from the precursor. Either could have been used for the cross-link information, which applies to both MS/MS fragments and precursor. If you create a master entry by editing unimod.xml in a text editor, or by using an interface that allows a choice between PepNeutralLoss and NeutralLoss, make sure you use NeutralLoss, the element to which the new attributes were added.