Exercise PMF4: answers

What are the effects of database size?

A search of SwissProt, with or without a taxonomy filter of Saccharomyces cerevisiae, reports a single two-component mixture.

However, if you try the search against NCBInr without a taxonomy filter, the mixture will disappear. It will also vanish if you make the search less specific by opening up the mass tolerance and including a variable modification or two.

The criterion for reporting a mixture is that the increase in score from an additional component is greater than the significance threshold score. The larger the database, the higher the threshold, making it more difficult for any mixture to be reported. The effect of using a taxonomy filter is identical to using a smaller database containing just the selected entries.
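
The dependence on database size can be illustrated with a rough sketch. It assumes a probability-based score of the form -10*log10(P) and a threshold set so that fewer than alpha random matches are expected across the N entries searched. The function names, the alpha default of 0.05, the entry counts, and the scores are all hypothetical values chosen for illustration, not figures from this exercise or the search engine's exact calculation.

```python
import math


def significance_threshold(num_entries: int, alpha: float = 0.05) -> float:
    """Threshold score, ASSUMING score = -10*log10(P) and that a match is
    significant when its expected number of random occurrences across
    num_entries database entries is below alpha."""
    return -10.0 * math.log10(alpha / num_entries)


def mixture_reported(score_single: float, score_mixture: float,
                     num_entries: int, alpha: float = 0.05) -> bool:
    """Criterion from the text: the score gain from the additional
    component must exceed the significance threshold."""
    gain = score_mixture - score_single
    return gain > significance_threshold(num_entries, alpha)


# Hypothetical entry counts: a taxonomy-filtered search vs. a large database.
small_db = 5_000
large_db = 5_000_000

# Hypothetical scores: the second component adds 60 to the score.
print(significance_threshold(small_db))         # threshold for the small search space
print(significance_threshold(large_db))         # higher threshold for the large one
print(mixture_reported(70.0, 130.0, small_db))  # gain of 60 clears the lower threshold
print(mixture_reported(70.0, 130.0, large_db))  # same gain fails the higher threshold
```

In this simplified model, each tenfold increase in the number of entries raises the threshold by 10, so the same score gain that qualifies as a mixture in a filtered search can fall short in an unfiltered search of a large database.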

Using a mass tolerance that is too wide, or including unnecessary variable modifications, has the same result: in the first case because the score decreases, in the second because the threshold increases.

How many components are you confident are present?

The question is a difficult one. Neither protein is a very strong match. One thing that might give confidence is that the mixture shows up in SwissProt without a taxonomy filter and both proteins are from yeast. On the other hand, one protein is 65 kDa and the other is 92 kDa, so is it likely that we would see both in the same gel spot?

If this were an important sample, it would be advisable to confirm the result by acquiring MS/MS spectra for some of the peptides.

Does the search engine report a choice of mixtures? If so, what does this mean?

Probably not in this particular case, but getting a choice of mixtures is not unusual. The databases contain many homologous proteins, so there may be different ways of pairing them off that give similar scores.


Matrix Science
