
Error tolerant searches
[Mascot results file module]

There are two types of error tolerant search, both described on the Matrix Science website. The Mascot Parser documentation refers to them as the Error tolerant search and the Integrated error tolerant search. This section assumes familiarity with both types.

Original error tolerant search

In Mascot Server 1.8 and later, an error tolerant search may be run as a repeat search. In this case, one or more ACCESSIONs will have been specified, and the results file contains only the error tolerant search results. The accessions used in the search may be retrieved using ms_searchparams::getACCESSION().

In the peptide summary report, peptide matches are ignored if

The significance level is taken from the minProbability parameter given to the ms_peptidesummary constructor. If this value is zero, then a default threshold of 1 in 20 is used.

The filename for the 'parent' non-error-tolerant search is saved in the parameters section as _errortolerantsearchparent, and can be accessed with ms_searchparams::getErrTolParentFilename(). It is not possible to use getINTERMEDIATE() because an error tolerant search can be performed as a repeat search from another error tolerant search. If MSRES_SHOW_ALL_FROM_ERR_TOL is not specified, and the file specified by _errortolerantsearchparent cannot be found, then the error ERR_NO_ERR_TOL_PARENT will be set, and the results will be shown as if MSRES_SHOW_ALL_FROM_ERR_TOL had been specified.
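The fallback described above can be summarised as a small decision function. This is only a sketch of the documented behaviour, not Mascot Parser code: the struct, the function name and the boolean inputs are hypothetical, and the case where the flag is specified is an assumption drawn from the wording above (the text only describes the error for the case where the flag is not specified).

```cpp
// Hypothetical illustration only; none of these names exist in Mascot Parser.
struct ParentResolution {
    bool behavesAsShowAll;  // results shown as if MSRES_SHOW_ALL_FROM_ERR_TOL were specified
    bool noParentErrorSet;  // stands in for ERR_NO_ERR_TOL_PARENT being set
};

ParentResolution resolveParent(bool showAllFlagSpecified, bool parentFileFound) {
    if (showAllFlagSpecified) {
        // Assumption: with the flag specified, a missing parent file is not
        // reported, since the text only describes the error for the other case.
        return {true, false};
    }
    if (!parentFileFound) {
        // Documented fallback: set the error and show everything.
        return {true, true};
    }
    // Parent file available: results can be filtered against it.
    return {false, false};
}
```

In a real application the two inputs would come from the flags passed to the ms_peptidesummary constructor and from checking the file named by ms_searchparams::getErrTolParentFilename().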

Always use standard protein grouping (MSRES_GROUP_PROTEINS) with manual error tolerant searches. If you enable protein clustering, the parent file is ignored and the results reported may be incorrect.

Integrated error tolerant search

In Mascot Server 2.2 and later, a single search can be performed which contains both the standard search results and the error tolerant search results. This is known as the integrated error tolerant search.

For an integrated error tolerant search, if MSRES_ERR_TOL is specified, then the results are taken only from the error tolerant sections of the results file, and these results are handled in exactly the same way as for the Original error tolerant search. If MSRES_INTEGRATED_ERR_TOL is specified, then the results will contain matches from both the standard and error tolerant sections. If neither of these flags is specified, then the results are derived only from the standard peptides section.
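The three cases can be pictured as a mapping from flags to the sections that feed the summary. The sketch below is illustrative only: the bit values are placeholders rather than the real Mascot Parser constants, the struct and function are hypothetical, and giving the integrated flag precedence when both flags are set is an assumption, since the text does not cover that combination.

```cpp
#include <cstdint>

// Placeholder bit values for illustration; real code should use the
// constants defined by Mascot Parser, whose values may differ.
constexpr std::uint32_t kErrTol           = 1u << 0;  // stands in for MSRES_ERR_TOL
constexpr std::uint32_t kIntegratedErrTol = 1u << 1;  // stands in for MSRES_INTEGRATED_ERR_TOL

// Which sections of an integrated error tolerant results file are used.
struct SectionsUsed {
    bool standard;
    bool errorTolerant;
};

// Hypothetical helper mirroring the three documented cases.
SectionsUsed sectionsForFlags(std::uint32_t flags) {
    if (flags & kIntegratedErrTol) {
        return {true, true};    // both standard and error tolerant sections
    }
    if (flags & kErrTol) {
        return {false, true};   // error tolerant sections only
    }
    return {true, false};       // standard peptides section only
}
```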

When MSRES_INTEGRATED_ERR_TOL is specified, the results are first combined at the query level, so there will be up to 20 matches for each query. The methods ms_peptide::getRank() and ms_protein::getPeptideP() will therefore return a number in the range 1 to 20. This rank value can be used for any ms_peptidesummary method that requires a 'p' (rank) value.

An error tolerant match will be discarded if it has a score below the average identity threshold (getAvePeptideIdentityThreshold()) or below the maximum standard result for the query. In practice, it is therefore rare to get 20 matches for a particular query: all 10 error tolerant matches would need to be the top 10 scores, and all would need to be above the average peptide identity threshold. To show all matches, including those that would be discarded by default, specify the flag MSRES_SHOW_ALL_FROM_ERR_TOL when constructing the ms_peptidesummary object.
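As a rough model of the discard rule, the following sketch filters the matches for one query. It is not the library's implementation: QueryMatch and keepReportedMatches are hypothetical names, the threshold is passed in as a plain number rather than obtained from getAvePeptideIdentityThreshold(), and "below" is read as a strict less-than comparison.

```cpp
#include <algorithm>
#include <limits>
#include <vector>

// Hypothetical stand-in for one peptide match of a query; not a Mascot Parser type.
struct QueryMatch {
    double ionsScore;
    bool fromErrorTolerant;
};

// Returns the matches that would still be reported for one query.
std::vector<QueryMatch> keepReportedMatches(const std::vector<QueryMatch>& matches,
                                            double avePeptideIdentityThreshold,
                                            bool showAllFromErrTol) {
    // Highest score among the standard (non-error-tolerant) matches.
    double maxStandard = std::numeric_limits<double>::lowest();
    for (const QueryMatch& m : matches) {
        if (!m.fromErrorTolerant) {
            maxStandard = std::max(maxStandard, m.ionsScore);
        }
    }

    std::vector<QueryMatch> kept;
    for (const QueryMatch& m : matches) {
        // Only error tolerant matches are ever discarded, and only when the
        // show-all behaviour has not been requested.
        const bool discard = m.fromErrorTolerant && !showAllFromErrTol &&
                             (m.ionsScore < avePeptideIdentityThreshold ||
                              m.ionsScore < maxStandard);
        if (!discard) {
            kept.push_back(m);
        }
    }
    return kept;
}
```

For example, with one standard match scoring 50, error tolerant matches scoring 60, 40 and 55, and a threshold of 58, only the standard match and the error tolerant match scoring 60 remain.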

The average identity threshold is calculated using the minProbability value passed to the ms_peptidesummary constructor. If a value of <= 0 or >= 0.1 is passed to the constructor, then a default of 1 in 20 is assumed for calculating the threshold. The flag MSRES_MAXHITS_OVERRIDES_MINPROB should normally be used so that the maxHitsToReport value is not overridden by the minProbability value. Use the method ms_peptide::getIsFromErrorTolerant() to find out if the results are from the standard results section or the error tolerant section.
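The defaulting rule for minProbability can be written out as a one-line check. The function name is hypothetical and the bounds are taken directly from the paragraph above; this only shows which probability would be used, not how the threshold itself is computed from it.

```cpp
// Hypothetical helper: returns the probability assumed when calculating the
// average identity threshold, following the bounds stated in the text.
double effectiveMinProbability(double minProbability) {
    constexpr double kDefaultProbability = 1.0 / 20.0;  // "1 in 20"
    if (minProbability <= 0.0 || minProbability >= 0.1) {
        return kDefaultProbability;
    }
    return minProbability;
}
```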

Protein score is derived from the highest scoring non-error-tolerant match for each query, and this value can be found by calling ms_protein::getPeptideIonsScore().

See also ms_mascotresfile::getNumEtSeqsSearched().

Useful functions for both types of search

To determine if a search is an error tolerant search, use ms_mascotresfile::isErrorTolerant().

The following functions can be used from both peptide and protein summary to get information about error tolerant modifications or residue substitutions.

Copyright © 2016 Matrix Science Ltd.  All Rights Reserved. Generated on Fri Jun 2 2017 01:44:51