This class encapsulates the input data in the Mascot results file.
#include <ms_inputquery.hpp>
Public Member Functions | |
ms_inputquery (const ms_mascotresfile &resfile, const int q) | |
Use this constructor to create an object to get the input data. | |
std::string | getCharge () const |
Returns a string that represents the supplied charge for the query. | |
std::string | getComp (int comp_no) const |
Returns a string that represents one of the sequence queries. | |
int | getIndex () const |
Returns the zero based index of the peak list in an MGF, PKL or DTA file. | |
std::string | getINSTRUMENT (const bool unescaped=true) const |
Returns an INSTRUMENT string if this has been specified at the query level. | |
double | getIntMax () const |
Returns the maximum intensity of any ion. | |
double | getIntMin () const |
Returns the minimum intensity of any ion. | |
double | getIonMobility () const |
Returns the ion mobility of any ion. | |
std::string | getIT_MODS (const bool unescaped=true) const |
Returns the names of variable modifications if such have been specified at the query level. | |
std::string | getLocalVarModName (const int num) const |
Returns the query-level modification name by 1-based index. | |
std::string | getLocus () const |
Returns the locus specified in the MGF file using the LOCUS= line. | |
double | getMassMax () const |
Returns the maximum mass of any ion. | |
double | getMassMin () const |
Returns the minimum mass of any ion. | |
double | getMaxInternalMass () const |
Returns the maximum mass to be considered for internal fragments if an INSTRUMENT has been specified at the query level. | |
double | getMinInternalMass () const |
Returns the minimum mass to be considered for internal fragments if an INSTRUMENT has been specified at the query level. | |
int | getNumberOfLocalVarMods () const |
Returns the number of query-level variable modifications. | |
int | getNumberOfPeaks (const int ions) |
Returns the number of ions peaks in an ms-ms spectrum. | |
int | getNumUsed () const |
Returns the number of ions used for matching - value no longer available. | |
int | getNumVals () const |
Returns the number of ions. | |
double | getPeakIntensity (const int ions, const int peakNo) |
Returns the intensity of a particular ions peak. | |
std::vector< std::pair< double, double > > | getPeakList (const int ions) |
Returns a list of ions peaks for a query. | |
double | getPeakMass (const int ions, const int peakNo) |
Returns the mass of a particular ions peak. | |
double | getPepTol () const |
Returns the peptide tolerance for this query. | |
std::string | getPepTolString () const |
Returns the peptol string. | |
std::string | getPepTolUnits () const |
Returns the peptide tolerance units for this query. | |
std::string | getRawfile () const |
Returns the rawfile specified in the MGF file using the RAWFILE= line. | |
void | getRawScans (std::vector< int > &index, std::vector< std::string > &rawscans) const |
Returns all rawscan numbers that were used to generate this peak list. | |
std::string | getRawScans (const int rawFileIdx=-1) const |
Returns the rawscan number(s) that were used to generate this peak list. | |
std::string | getRetentionTimes (const int rawFileIdx=-1) const |
Returns the retention time(s) in seconds of the scans that were used to generate this peak list. | |
void | getRetentionTimes (std::vector< int > &index, std::vector< std::string > &times) const |
Returns all retention times in seconds of the scans that were used to generate this peak list. | |
std::string | getRULES () const |
Returns the instrument rules if this has been specified at the query level. | |
void | getScanNumbers (std::vector< int > &index, std::vector< std::string > &scans) const |
Returns all scan numbers that were used to generate this peak list. | |
std::string | getScanNumbers (const int rawFileIdx=-1) const |
Returns the scan number(s) that were used to generate this peak list. | |
std::string | getSeq (int seq_no) const |
Returns a string that represents one of the sequence queries. | |
std::string | getStringIons1 (bool includeCharges=false) |
Returns the peak list for the ions as a string. | |
std::string | getStringIons2 (bool includeCharges=false) |
Returns the peak list for the ions as a string. | |
std::string | getStringIons3 (bool includeCharges=false) |
Returns the peak list for the ions as a string. | |
std::string | getStringTitle (bool unescaped) const |
Returns the title for the ms-ms data (if any). | |
std::string | getTag (int tag_no) const |
Returns a string that represents one of the sequence tag queries. | |
double | getTotalIonsIntensity () |
Returns the sum of all the ions intensities. |
This class encapsulates the input data in the mascot results file.
Although all these parameters could be obtained using lower-level functions such as ms_mascotresfile::getQuerySectionValue, it is generally more convenient to use this object.
ms_inputquery | ( | const ms_mascotresfile & | resfile, |
const int | q | ||
) |
Use this constructor to create an object to get the input data.
Constructor for ms_inputquery.
See Maintaining object references: two rules of thumb.
resfile | Must be a valid ms_mascotresfile object. |
q | is the query number and must be in the range 1 to ms_mascotresfile::getNumQueries(). |
std::string getCharge | ( | ) | const |
Returns a string that represents the supplied charge for the query.
This value will only be present if a charge was specified at the query level in the input (MGF, PKL, mzData, DTA) file. It will normally be one of the following strings:
The charge can also be any comma separated list of charge states.
The default charge states are available from ms_searchparams::getCHARGE. Use ms_peptide::getCharge to return the actual charge state that was used for a peptide match.
std::string getComp | ( | int | comp_no ) | const |
Returns a string that represents one of the sequence queries.
Up to 20 sets of composition data can be entered for each peptide mass.
comp_no | is an index for the comp() command. The input data is available as COMP1, COMP2, COMP3, etc. To get all composition data, call this function with numbers from 1 to 20. |
Returns the comp() command as a string.
int getIndex | ( | ) | const |
Returns the zero based index of the peak list in an MGF, PKL or DTA file.
In Mascot 2.3 and later, the index of each spectrum is saved as an index= value in each query section. This can be useful because the query numbers are not the same as the index into the text file, and if there are non-unique titles, it is hard to get back from the query to the original spectrum in the MGF, PKL or DTA file.
The index is stored in mzIdentML files using the CV term:
[Term]
id: MS:1000774
name: multiple peak list nativeID format
def: "index=xsd:nonNegativeInteger" [PSI:MS]
comment: Used for conversion of peak list files with multiple spectra, i.e. MGF, PKL, merged DTA files. Index is the spectrum number in the file, starting from 0.
is_a: MS:1000767 ! native spectrum identifier format
std::string getINSTRUMENT | ( | const bool | unescaped = true ) |
const |
Returns an INSTRUMENT string if this has been specified at the query level.
This string will always be empty for versions of Mascot prior to 2.2.
A value is only returned when INSTRUMENT is specified in the MGF file. For example:
BEGIN IONS
PEPMASS=1234
INSTRUMENT=ETD_TRAP
50 1
100 3
400 5
END IONS
The value is obtained from the INSTRUMENT= line in the relevant query section of the results file. If there is no instrument specified at the query level, then the INSTRUMENT specified in the search form is used. This can be retrieved using ms_searchparams::getINSTRUMENT().
See also ms_inputquery::getRULES(). It is preferable to use the rules rather than this name to determine which ions series were searched since the user can change the definition of the instrument name.
unescaped | must be TRUE for a human-readable instrument name and FALSE when generating input for a Mascot search. |
Returns the INSTRUMENT string, or an empty string if no instrument was specified at the query level.
double getIntMax | ( | ) | const |
Returns the maximum intensity of any ion.
Even if ions are entered as ions(b- ...), ions(y-...) and ions(...), this function returns the maximum intensity of any ion.
double getIntMin | ( | ) | const |
Returns the minimum intensity of any ion.
Even if ions are entered as ions(b- ...), ions(y-...) and ions(...), this function returns the minimum intensity of any ion.
double getIonMobility | ( | ) | const |
Returns the ion mobility of any ion.
std::string getIT_MODS | ( | const bool | unescaped = true ) |
const |
Returns the names of variable modifications if such have been specified at the query level.
This string will always be empty for versions of Mascot prior to 2.2.
A value is only returned when variable modifications are specified in the MGF file. For example:
BEGIN IONS
PEPMASS=1234
IT_MODS=Phospho (STY)
50 1
100 3
400 5
END IONS
The value is obtained from the IT_MODS= line in the relevant query section of the results file. If there are no modifications specified at the query level, then the IT_MODS specified in the search parameters are used. These can be retrieved using ms_searchparams::getIT_MODS().
unescaped | has to be TRUE for human readable modification names and FALSE for generating input for a mascot search. |
std::string getLocalVarModName | ( | const int | num ) | const |
Returns the query-level modification name by 1-based index.
This method returns unescaped modification names as returned by getIT_MODS(), but accessible with a 1-based numerical index. You can produce the same vector of modification titles by splitting the return value of getIT_MODS(true) on comma ",".
num | Index of the query-level modification between 1 and getNumberOfLocalVarMods(). |
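The equivalence with splitting the getIT_MODS(true) value on commas can be illustrated with a small standalone helper. This is only a sketch: `splitModNames` is a hypothetical name, not part of Mascot Parser, and it assumes that modification names themselves never contain a comma.

```cpp
#include <string>
#include <vector>

// Hypothetical helper (not part of Mascot Parser): splits a comma-separated
// modification string, such as the value described for getIT_MODS(true),
// into individual names. An empty input yields an empty list.
std::vector<std::string> splitModNames(const std::string& itMods) {
    std::vector<std::string> names;
    if (itMods.empty()) {
        return names;
    }
    std::string::size_type start = 0;
    for (;;) {
        const std::string::size_type comma = itMods.find(',', start);
        if (comma == std::string::npos) {
            names.push_back(itMods.substr(start));  // last (or only) name
            break;
        }
        names.push_back(itMods.substr(start, comma - start));
        start = comma + 1;
    }
    return names;
}
```

Under that assumption, element `names[num - 1]` of the result corresponds to the 1-based index `num` accepted by getLocalVarModName().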
std::string getLocus | ( | ) | const |
Returns the locus specified in the MGF file using the LOCUS= line.
LOCUS is an optional attribute for an MS-MS peak list in an MGF file.
LOCUS was introduced in Mascot 2.4.
Returns the locus= value in the query section.
double getMassMax | ( | ) | const |
Returns the maximum mass of any ion.
Even if ions are entered as ions(b- ...), ions(y-...) and ions(...), this function returns the maximum mass of any ion.
double getMassMin | ( | ) | const |
Returns the minimum mass of any ion.
Even if ions are entered as ions(b- ...), ions(y-...) and ions(...), this function returns the minimum mass of any ion.
double getMaxInternalMass | ( | ) | const |
Returns the maximum mass to be considered for internal fragments if an INSTRUMENT has been specified at the query level.
Introduced in Mascot 2.2, the values for minimum and maximum internal masses are specified in the fragmentation_rules file as
minInternalMass 0.0
maxInternalMass 700.0
The value specified for the instrument is saved in the parameters section of the results file as INTERNALS=min,max. If there is no INTERNALS= line, the default value of 700.0 is returned. A different instrument can be specified for each MS-MS spectrum, in which case the values are returned using this function and ms_inputquery::getMinInternalMass.
double getMinInternalMass | ( | ) | const |
Returns the minimum mass to be considered for internal fragments if an INSTRUMENT has been specified at the query level.
Introduced in Mascot 2.2, the values for minimum and maximum internal masses are specified in the fragmentation_rules file as
minInternalMass 0.0
maxInternalMass 700.0
The value specified for the instrument is saved in the parameters section of the results file as INTERNALS=min,max. A different instrument can be specified for each MS-MS spectrum, in which case the values are returned using this function and ms_inputquery::getMaxInternalMass.
int getNumberOfLocalVarMods | ( | ) | const |
Returns the number of query-level variable modifications.
int getNumberOfPeaks | ( | const int | ions ) |
Returns the number of ions peaks in an ms-ms spectrum.
The 'ions' value will normally be '1'. For ms-ms data entered via the sequence query, it is possible to specify that values are part of a b-series, y-series or either series -- see http://www.matrixscience.com/help/sq_help.html#IONS.
If all three types of ions are entered, then the separate values will be saved as ions1, ions2 and ions3 in the results file. For data uploaded from a file, only ions1 values will be found.
The ions1= value is read from file 'on demand', so execution of this function can take a while. The data is cached on the first call of one of the following functions:
ions | will be a value of 1..3. |
int getNumUsed | ( | ) | const |
Returns the number of ions used for matching - value no longer available.
This value is always saved as '-1' in the results files, so this function always returns -1. To find the number of ions peaks used for scoring, see ms_peptide::getPeaksUsedFromIons1, ms_peptide::getPeaksUsedFromIons2 and ms_peptide::getPeaksUsedFromIons3.
int getNumVals | ( | ) | const |
Returns the number of ions.
This is retrieved from the num_vals= line of the query section.
Even if ions are entered as ions(b- ...), ions(y-...) and ions(...), this function returns the number of all ions.
double getPeakIntensity | ( | const int | ions, |
const int | peakNo | ||
) |
Returns the intensity of a particular ions peak.
The 'ions' value will normally be '1'. For ms-ms data entered via the sequence query, it is possible to specify that values are part of a b-series, y-series or either series -- see http://www.matrixscience.com/help/sq_help.html#IONS.
If all three types of ions are entered, then the separate values will be saved as ions1, ions2 and ions3 in the results file. For data uploaded from a file, only ions1 values will be found.
The ions1= value is read from file 'on demand', so execution of this function can take a while. The data is cached on the first call of one of the following functions:
ions | will be a value of 1..3. |
peakNo | should be in the range 1.. getNumberOfPeaks(). |
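The 1-based peak numbering is easy to get wrong, so here is a minimal sketch of a loop over getNumberOfPeaks(), getPeakMass() and getPeakIntensity(). It is written as a template so that it compiles without the Mascot Parser headers. `collectPeaks` and `FakeQuery` are hypothetical names introduced for illustration; only the three accessor signatures come from this page.

```cpp
#include <utility>
#include <vector>

// Collects (mass, intensity) pairs through the index-based accessors.
// Query is any type offering getNumberOfPeaks(ions), getPeakMass(ions, n)
// and getPeakIntensity(ions, n) with the signatures listed on this page.
template <typename Query>
std::vector<std::pair<double, double> > collectPeaks(Query& query, const int ions) {
    std::vector<std::pair<double, double> > peaks;
    const int count = query.getNumberOfPeaks(ions);
    for (int peakNo = 1; peakNo <= count; ++peakNo) {  // peakNo is 1-based
        peaks.push_back(std::make_pair(query.getPeakMass(ions, peakNo),
                                       query.getPeakIntensity(ions, peakNo)));
    }
    return peaks;
}

// Hypothetical stand-in for ms_inputquery, used only so this sketch is
// self-contained; it is not part of the library.
struct FakeQuery {
    std::vector<std::pair<double, double> > data;
    int getNumberOfPeaks(const int) { return static_cast<int>(data.size()); }
    double getPeakMass(const int, const int peakNo) { return data[peakNo - 1].first; }
    double getPeakIntensity(const int, const int peakNo) { return data[peakNo - 1].second; }
};
```

With the real library, an ms_inputquery object would presumably be passed in place of `FakeQuery`, with `ions` set to 1 for data uploaded from a file.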
std::vector< std::pair< double, double > > getPeakList | ( | const int | ions ) |
Returns a list of ions peaks for a query.
This function only works from C++. For other languages, use getNumberOfPeaks(), getPeakMass() and getPeakIntensity() instead.
The 'ions' value will normally be '1'. For MS-MS data entered via the sequence query, it is possible to specify that values are part of a b-series, y-series or either series -- see http://www.matrixscience.com/help/sq_help.html#IONS.
If all three types of ions are entered, then the separate values will be saved as ions1, ions2 and ions3 in the results file. For data uploaded from a file, only ions1 values will be found.
The ions1= value is read from file 'on demand', so execution of this function can take a while. The data is cached on the first call of one of the following functions:
ions | ions value, can be 1, 2 or 3 |
double getPeakMass | ( | const int | ions, |
const int | peakNo | ||
) |
Returns the mass of a particular ions peak.
The 'ions' value will normally be '1'. For ms-ms data entered via the sequence query, it is possible to specify that values are part of a b-series, y-series or either series -- see http://www.matrixscience.com/help/sq_help.html#IONS.
If all three types of ions are entered, then the separate values will be saved as ions1, ions2 and ions3 in the results file. For data uploaded from a file, only ions1 values will be found.
The ions1= value is read from file 'on demand', so execution of this function can take a while. The data is cached on the first call of one of the following functions:
ions | will be a value of 1..3. |
peakNo | should be in the range 1.. getNumberOfPeaks(). |
double getPepTol | ( | ) | const |
Returns the peptide tolerance for this query.
peptol(tolerance,unit) may be used to specify a mass tolerance for an individual query, overriding the search form default. For example, peptol(10,%) or peptol(2,Da).
std::string getPepTolString | ( | ) | const |
Returns the peptol string.
Returns, for example, peptol(2.01,Da). The number is always returned in 'fixed' notation. Trailing zeros apart from the one (if any) after the decimal are removed.
This function is intended to be used to create a sequence query string rather than for use to display in a report.
std::string getPepTolUnits | ( | ) | const |
Returns the peptide tolerance units for this query.
In the search form, the command peptol(tolerance,unit) may be used to override the default mass tolerance for an individual query. For example, peptol(10,%) or peptol(2,Da).
The units will be one of:
%
mmu
Da
ppm
Da
std::string getRawfile | ( | ) | const |
Returns the rawfile specified in the MGF file using the RAWFILE= line.
RAWFILE is an optional attribute for an MS-MS peak list in an MGF file.
RAWFILE was introduced in Mascot 2.4.
Returns the rawfile= value in the query section.
std::string getRawScans | ( | const int | rawFileIdx = -1 ) |
const |
Returns the rawscan number(s) that were used to generate this peak list.
RAWSCANS is an optional attribute for an MS-MS peak list in an MGF file.
Separate values for multiple raw files were introduced in Mascot 2.3. The results file may contain "rawscans" with no brackets or zero or more "rawscans[i]" lines.
If rawFileIdx is -1, this method returns the unbracketed value. Otherwise, an individual bracketed line is retrieved.
To retrieve all unbracketed and bracketed values, use ms_inputquery::getRawScans(std::vector<int>&, std::vector<std::string>&).
rawFileIdx | is the raw file index for MGF files that were created from multiple raw files. |
a[[-b][,c[-d]]]
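A value in the a[[-b][,c[-d]]] shape can be broken into its individual ranges with a few lines of string handling. The following is a sketch under stated assumptions: `splitScanRanges` is a hypothetical helper, it treats ',' as the separator between entries and the first '-' inside an entry as the range separator, and it keeps the endpoints as strings because the examples on this page include non-numeric values such as sn13.

```cpp
#include <string>
#include <utility>
#include <vector>

// Hypothetical helper (not part of Mascot Parser): splits a value shaped
// like a[[-b][,c[-d]]] into (first, last) pairs. A lone entry such as "13"
// becomes ("13", "13"). Endpoints are kept as strings and not validated.
std::vector<std::pair<std::string, std::string> > splitScanRanges(const std::string& value) {
    std::vector<std::pair<std::string, std::string> > ranges;
    if (value.empty()) {
        return ranges;
    }
    std::string::size_type start = 0;
    for (;;) {
        const std::string::size_type comma = value.find(',', start);
        const std::string item = (comma == std::string::npos)
                                     ? value.substr(start)
                                     : value.substr(start, comma - start);
        const std::string::size_type dash = item.find('-');
        if (dash == std::string::npos) {
            ranges.push_back(std::make_pair(item, item));  // single scan
        } else {
            ranges.push_back(std::make_pair(item.substr(0, dash), item.substr(dash + 1)));
        }
        if (comma == std::string::npos) {
            break;
        }
        start = comma + 1;
    }
    return ranges;
}
```

The same splitting would apply to any other value on this page that uses the same bracketed range notation.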
void getRawScans | ( | std::vector< int > & | index, |
std::vector< std::string > & | rawscans | ||
) | const |
Returns all rawscan numbers that were used to generate this peak list.
RAWSCANS is an optional attribute for an MS-MS peak list in an MGF file.
Separate values for multiple raw files were introduced in Mascot 2.3. The results file may contain "rawscans" with no brackets or zero or more "rawscans[i]" lines.
This method returns all rawscans lines, bracketed or not. The elements returned in index are the indices, and the corresponding elements in rawscans are the values. For example, if the query section has
rawscans=sn13
then both index and rawscans will have one element. index[0] will be -1 (unbracketed) and rawscans[0] will be sn13. If the query section instead has
rawscans[3]=sn13 rawscans[5]=sn24
then index[0] == 3, index[1] == 5 and rawscans[0] == "sn13", rawscans[1] == "sn24".
If you already know which lines are present in the query section, you can use ms_inputquery::getRawScans(const int) to access the individual lines.
[out] | index | Vector of index numbers for corresponding "rawscans" lines. Unbracketed line has index -1, and the other indices start from 0. |
[out] | rawscans | Vector of values from the "rawscans" lines in the same order as the index vector. |
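Because the two output vectors are parallel, it can be convenient to combine them into a single lookup keyed by raw file index. The sketch below assumes nothing about the library beyond the vector types in the signature; `zipIndexedValues` is a hypothetical helper name.

```cpp
#include <cstddef>
#include <map>
#include <string>
#include <vector>

// Hypothetical helper (not part of Mascot Parser): pairs each entry of the
// index vector with the value at the same position. Key -1 denotes the
// unbracketed line, as described above. If the vectors differ in length,
// only the common prefix is used.
std::map<int, std::string> zipIndexedValues(const std::vector<int>& index,
                                            const std::vector<std::string>& values) {
    std::map<int, std::string> byIndex;
    const std::size_t count = index.size() < values.size() ? index.size() : values.size();
    for (std::size_t i = 0; i < count; ++i) {
        byIndex[index[i]] = values[i];
    }
    return byIndex;
}
```

For the bracketed example above, the resulting map would hold key 3 with value "sn13" and key 5 with value "sn24".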
void getRetentionTimes | ( | std::vector< int > & | index, |
std::vector< std::string > & | times | ||
) | const |
Returns all retention times in seconds of the scans that were used to generate this peak list.
RTINSECONDS is an optional attribute for an MS-MS peak list in an MGF file.
Separate values for multiple raw files were introduced in Mascot 2.3. The results file may contain "rtinseconds" with no brackets or zero or more "rtinseconds[i]" lines.
This method returns all rtinseconds lines, bracketed or not. The elements returned in index are the indices, and the corresponding elements in times are the values. For example, if the query section has
rtinseconds=1234.0
then both index and times will have one element. index[0] will be -1 (unbracketed) and times[0] will be 1234.0. If the query section instead has
rtinseconds[3]=1234.0 rtinseconds[5]=1244.0
then index[0] == 3, index[1] == 5 and times[0] == 1234.0, times[1] == 1244.0.
If you already know which lines are present in the query section, you can use ms_inputquery::getRetentionTimes(const int) to access the individual lines.
[out] | index | Vector of index numbers for corresponding "rtinseconds" lines. Unbracketed line has index -1, and the other indices start from 0. |
[out] | times | Vector of values from the "rtinseconds" lines in the same order as the index vector. |
std::string getRetentionTimes | ( | const int | rawFileIdx = -1 ) |
const |
Returns the retention time(s) in seconds of the scans that were used to generate this peak list.
RTINSECONDS is an optional attribute for an MS-MS peak list in an MGF file.
Separate values for multiple raw files were introduced in Mascot 2.3. The results file may contain "rtinseconds" with no brackets or zero or more "rtinseconds[i]" lines.
If rawFileIdx is -1, this method returns the unbracketed value. Otherwise, an individual bracketed line is retrieved.
To retrieve all unbracketed and bracketed values, use ms_inputquery::getRetentionTimes(std::vector<int>&, std::vector<std::string>&).
rawFileIdx | is the raw file index for MGF files that were created from multiple raw files. |
v[[-w][,x[-y]]].
std::string getRULES | ( | ) | const |
Returns the instrument rules if this has been specified at the query level.
This string will always be empty for versions of Mascot prior to 2.2.
A value is only returned when INSTRUMENT is specified in the MGF file. For example:
BEGIN IONS
PEPMASS=1234
INSTRUMENT=ETD_TRAP
50 1
100 3
400 5
END IONS
The value is obtained from the RULES= line in the relevant query section of the results file. If there is no instrument specified at the query level, then the rules for this query are the global rules specified in the search form, which can be retrieved using ms_searchparams::getRULES().
It is preferable to use the rules rather than the instrument name to determine which ions series were searched since the user can change the definition of the instrument name.
std::string getScanNumbers | ( | const int | rawFileIdx = -1 ) |
const |
Returns the scan number(s) that were used to generate this peak list.
SCANS is an optional attribute for an MS-MS peak list in an MGF file.
Separate values for multiple raw files were introduced in Mascot 2.3. The results file may contain "scans" with no brackets or zero or more "scans[i]" lines.
If rawFileIdx is -1, this method returns the unbracketed value. Otherwise, an individual bracketed line is retrieved.
To retrieve all unbracketed and bracketed values, use ms_inputquery::getScanNumbers(std::vector<int>&, std::vector<std::string>&).
rawFileIdx | is the raw file index for MGF files that were created from multiple raw files. |
a[[-b][,c[-d]]].
void getScanNumbers | ( | std::vector< int > & | index,
std::vector< std::string > & | scans | ||
) | const |
Returns all scan numbers that were used to generate this peak list.
SCANS is an optional attribute for an MS-MS peak list in an MGF file.
Separate values for multiple raw files were introduced in Mascot 2.3. The results file may contain "scans" with no brackets or zero or more "scans[i]" lines.
This method returns all scans lines, bracketed or not. The elements returned in index are the indices, and the corresponding elements in scans are the values. For example, if the query section has
scans=1
then both index and scans will have one element. index[0] will be -1 (unbracketed) and scans[0] will be 1. If the query section instead has
scans[3]=13 scans[5]=26
then index[0] == 3, index[1] == 5 and scans[0] == 13, scans[1] == 26.
If you already know which lines are present in the query section, you can use ms_inputquery::getScanNumbers(const int) to access the individual lines.
[out] | index | Vector of index numbers for corresponding "scans" lines. Unbracketed line has index -1, and the other indices start from 0. |
[out] | scans | Vector of values from the "scans" lines in the same order as the index vector. |
std::string getSeq | ( | int | seq_no ) | const |
Returns a string that represents one of the sequence queries.
Up to 20 sets of sequence information can be entered for each peptide mass.
seq_no | is an index for the seq() command. The input data is available as SEQ1, SEQ2, SEQ3, etc. To get all sequence data, call this function with numbers from 1 to 20. |
Returns the seq() command as a string.
std::string getStringIons1 | ( | bool | includeCharges = false ) |
Returns the peak list for the ions as a string.
The ions1= value is read from file 'on demand', so execution of this function can take a while. The data is cached on the first call of one of the following functions:
includeCharges | will add any supplied charges to the returned string, so the format will be: m/z[:intensity[:charge]], m/z[:intensity[:charge]]... if no intensities were applied initially, then the string returned will be the same as if this parameter was false |
Returns the ions1= value in the form: 136.070000:42.95,175.110000:133.1,299.060000:710.1,415.030000:144.6 . . .
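The m/z[:intensity[:charge]] string form can be turned back into numeric peaks with a small parser. This is a sketch only: `IonPeak` and `parseIonsString` are hypothetical names, the charge field is kept as text because its exact format is not described here, and a missing intensity is represented as 0.0 by assumption.

```cpp
#include <cstdlib>
#include <string>
#include <vector>

// Hypothetical types and helper (not part of Mascot Parser).
struct IonPeak {
    double mz;
    double intensity;    // 0.0 when the entry has no intensity field (assumption)
    std::string charge;  // empty when the entry has no charge field
};

// Parses a comma-separated list of m/z[:intensity[:charge]] entries.
std::vector<IonPeak> parseIonsString(const std::string& ions) {
    std::vector<IonPeak> peaks;
    if (ions.empty()) {
        return peaks;
    }
    std::string::size_type start = 0;
    for (;;) {
        const std::string::size_type comma = ions.find(',', start);
        const std::string entry = (comma == std::string::npos)
                                      ? ions.substr(start)
                                      : ions.substr(start, comma - start);
        IonPeak peak = {0.0, 0.0, ""};
        const std::string::size_type first = entry.find(':');
        // substr(0, npos) yields the whole entry when there is no ':'.
        peak.mz = std::strtod(entry.substr(0, first).c_str(), nullptr);
        if (first != std::string::npos) {
            const std::string::size_type second = entry.find(':', first + 1);
            const std::string intensityText =
                (second == std::string::npos) ? entry.substr(first + 1)
                                              : entry.substr(first + 1, second - first - 1);
            peak.intensity = std::strtod(intensityText.c_str(), nullptr);
            if (second != std::string::npos) {
                peak.charge = entry.substr(second + 1);
            }
        }
        peaks.push_back(peak);
        if (comma == std::string::npos) {
            break;
        }
        start = comma + 1;
    }
    return peaks;
}
```

In C++ the same data is available without string parsing through getPeakList(), so a parser like this is mainly of interest when the string has been stored or passed on elsewhere.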
std::string getStringIons2 | ( | bool | includeCharges = false ) |
Returns the peak list for the ions as a string.
The ions2= value is read from file 'on demand', so execution of this function can take a while. The data is cached on the first call of one of the following functions:
includeCharges | will add any supplied charges to the returned string, so the format will be: m/z[:intensity[:charge]], m/z[:intensity[:charge]]... if no intensities were applied initially, then the string returned will be the same as if this parameter was false |
Returns the ions2= value in the form: 136.070000:42.95,175.110000:133.1,299.060000:710.1,415.030000:144.6 . . .
std::string getStringIons3 | ( | bool | includeCharges = false ) |
Returns the peak list for the ions as a string.
The ions3= value is read from file 'on demand', so execution of this function can take a while. The data is cached on the first call of one of the following functions:
includeCharges | will add any supplied charges to the returned string, so the format will be: m/z[:intensity[:charge]], m/z[:intensity[:charge]]... if no intensities were applied initially, then the string returned will be the same as if this parameter was false |
Returns the ions3= value in the form: 136.070000:42.95,175.110000:133.1,299.060000:710.1,415.030000:144.6 . . .
std::string getStringTitle | ( | bool | unescaped ) | const |
Returns the title for the ms-ms data (if any).
The string will have a maximum length of 30k characters. It might for example be:
1: Scan 2443 (rt=1651.36) [P:\\service\\on_campus\\Nunari\\Laura\\ft20061020.RAW]
unescaped | - If this is false, then any non alpha-numeric characters in the string will be converted to %xx where the xx are the hex representation of the character. |
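The %xx escaping described for the unescaped parameter can be reversed with a short decoder. The sketch below is illustrative only: `unescapePercent` is a hypothetical helper, and it assumes each escape is exactly a '%' followed by two hexadecimal digits. Any '%' not followed by two hex digits is copied through unchanged.

```cpp
#include <cctype>
#include <cstdlib>
#include <string>

// Hypothetical helper (not part of Mascot Parser): converts each %xx
// sequence (two hexadecimal digits) back to the character it encodes.
std::string unescapePercent(const std::string& escaped) {
    std::string out;
    out.reserve(escaped.size());
    for (std::string::size_type i = 0; i < escaped.size(); ++i) {
        const bool hasTwoHexDigits =
            escaped[i] == '%' && i + 2 < escaped.size() &&
            std::isxdigit(static_cast<unsigned char>(escaped[i + 1])) &&
            std::isxdigit(static_cast<unsigned char>(escaped[i + 2]));
        if (hasTwoHexDigits) {
            const std::string hex = escaped.substr(i + 1, 2);
            out.push_back(static_cast<char>(std::strtol(hex.c_str(), nullptr, 16)));
            i += 2;  // skip the two digits just consumed
        } else {
            out.push_back(escaped[i]);
        }
    }
    return out;
}
```

Calling getStringTitle(true) should make this unnecessary; a decoder of this kind is only useful when the escaped form has already been stored elsewhere.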
std::string getTag | ( | int | tag_no ) | const |
Returns a string that represents one of the sequence tag queries.
Up to 20 tag/etag commands can be entered for each peptide mass.
tag_no | is an index for the tag() command. The input data is available as TAG1, TAG2, TAG3, etc. To get all tags, call this function with numbers from 1 to 20. |
Returns the tag() command as a string in the format e,977.4,[Q|K][Q|K][Q|K]EE,1619.7 where the first character will be 'e' for etag and 't' for tag.
double getTotalIonsIntensity | ( | ) |
Returns the sum of all the ions intensities.
See also ms_mascotresfile::getObservedIntensity. If the precursor intensity wasn't supplied in the input file, this function can be called instead. Processing this function for a large number of spectra may take some time since it will need to load the ions1= values from disk and sum all the intensities.
See also ms_peptide::getIonsIntensity which just calls this function.
The ions1= value is read from file 'on demand', so execution of this function can take a while. The data is cached on the first call of one of the following functions:
Copyright © 2022 Matrix Science Ltd. All Rights Reserved. Generated on Thu Mar 31 2022 01:12:33 |