Posted by Ville Koskinen (January 18, 2019)

O-fucosylated CID spectra

O-linked fucose is easily lost in CID. A recent paper by Swearingen et al. in the Journal of Proteome Research discusses this in the context of identifying O-fucosylated thrombospondin type 1 repeats (TSRs) in Plasmodium parasites using database searching. The main problem is that the O-glycosidic bond is weaker than the peptide backbone. Collision energies typical for peptide fragmentation cause it [...]

Posted by John Cottrell (January 12, 2018)

Results round-up for the ‘dark matter’ challenge

In June, we tried to harness the power of crowd-sourcing to explain some of the unidentified modifications found in open database searches. We selected 20 abundant and unassigned mass deltas from Supplementary Table 3 of the recent MSFragger paper from Alexey Nesvizhskii’s group at U. Michigan and offered prizes for the first credible explanations. There were 35 unannotated deltas in [...]

Posted by John Cottrell (June 26, 2017)

Trying to illuminate proteomics ‘dark matter’

The May 2017 issue of Nature Methods has a paper from Alexey Nesvizhskii’s group at U. Michigan describing a new open database search program called MSFragger. Strikingly, they also observed the two highly abundant but unidentified mass deltas reported in Steven Gygi’s 2015 mass-tolerant paper: 301.9864 and 249.9803. The challenges of open searching were discussed in an earlier blog [...]

Posted by John Cottrell (February 16, 2016)


David Fenyö and Ron Beavis have a short paper in J. Proteome Research that draws attention to a potential problem with peptides containing selenocysteine (1-letter code U, 3-letter code Sec). Samples are frequently alkylated, yet modified U is unlikely to be considered in the search. This need not be an issue for Mascot searches, but you may have no modifications [...]

Posted by John Cottrell (September 22, 2015)

Mass-tolerant vs Error tolerant

"A mass-tolerant database search identifies a large proportion of unassigned spectra in shotgun proteomics as modified peptides" in Nature Biotechnology is from Steven Gygi’s lab at Harvard Medical School. It describes the use of a very wide precursor mass tolerance, +/- 500 Da, to identify modified peptides in a Sequest search. How does this approach, which the authors call an [...]

Posted by Ville Koskinen (August 15, 2015)

PSI file formats, part 3: repositories

We’ve talked about mzIdentML validity only in terms of file structure. Proteomics repositories, such as PRIDE or ProteoRed, of course require files to be valid in that sense, but they impose additional requirements. If you need to upload your search results to a repository, it is worth looking at this more extended idea of validity. For simplicity, I’ll only consider [...]

Posted by John Cottrell (October 14, 2013)

Modifications round-up, part 2

This is the second of two articles dealing with topics relating to modifications. The first can be found here. Note that Site analysis was covered in an earlier article. Why aren’t amino acid substitutions listed in the search form? Amino acid substitutions are rare and there are lots of them, so the only practical way to use them is in [...]

Posted by John Cottrell (September 19, 2013)

Modifications round-up, part 1

Much of the complexity in Mascot is associated with modifications. It can be hard to find information about some aspects of handling modifications unless you already know what you are looking for. In this blog article, the first of two, I’ll collect together some of the topics that come up frequently in support emails. Note that Site analysis was [...]

Posted by John Cottrell (April 9, 2013)

Non-standard amino acid residues

Mascot supports only the 26 letters of the Latin alphabet as one-letter codes in sequence database entries. It is also case-insensitive, so you cannot use (say) R and r for different residues. This is quite a limitation if you want to create a custom database that encodes non-standard or modified residues. It isn’t a concern if you search only public [...]

