
Quickstart: how to open a results file

Prelude: linking or importing Mascot Parser

Before Mascot Parser can be used in your program, it must be linked in or imported. Details depend on the programming language.

C++

The details of how to link Mascot Parser statically or dynamically to your program depend on the operating system, compiler and linker. Please refer to C++ toolkit installation on Windows and C++ toolkit installation on Unix.

You must include msparser.hpp before using Mascot Parser classes:

    #include "msparser.hpp"

All classes live in the matrix_science namespace. You will either need to import the namespace or explicitly state the namespace before each class name:

    {
        // Import the namespace into the current scope.
        using namespace matrix_science;
        ms_mascotresfile resfile(filename);
    }

    // Or explicitly:
    matrix_science::ms_mascotresfile resfile(filename);

Perl

Before you can call Parser functions, you need to import the msparser package with use.

    use msparser;

It is also a good idea to enable strict error checking with use strict;. Additionally, if you add use warnings; or use the -w command-line flag, the Perl interpreter will warn you if you try to use constants that are not defined anywhere. This is very useful in checking for typos in names of global constants and enumerated values.

If Mascot Parser Perl libraries are not in the current directory and are not installed in a system-wide path, you must also add the directory of msparser.pm and msparser.dll or msparser.so to the module search path with use lib. For example:

    # On Windows:
    use lib 'c:\path\to\msparser\perl_files';
    use msparser;
    # On Unix:
    use lib '/path/to/msparser/perl_files';
    use msparser;

The use lib statement must precede the use msparser statement.

For more details, see Perl toolkit installation.

All Mascot Parser classes are defined in the msparser package, which you must always prefix to the actual class name. For example, to create a new ms_mascotresfile object:

    my $resfile = msparser::ms_mascotresfile->new($filename);

    # Or C++ style syntax:
    my $resfile = new msparser::ms_mascotresfile($filename);

Java

You need to use the -classpath command-line parameter with javac and java when you compile and run your program. See Building and running the Java examples in Windows and Building and running the Java examples in Unix.

Before you can use Mascot Parser classes, you need to import them from the matrix_science.msparser namespace.

    // Import everything:
    import matrix_science.msparser.*;

    // Import selectively:
    import matrix_science.msparser.ms_mascotresfile;
    import matrix_science.msparser.ms_peptidesummary;

In addition to this, the msparserj library must be loaded into your Java class. The following should be added at the top of your Java code before any class method definitions:

    public class MyClass {
        static {
            try {
                System.loadLibrary("msparserj");
            } catch (UnsatisfiedLinkError e) {
                System.err.println("Native code library failed to load. "
                                   + "Is msparserj.dll on the path?\n" + e);
                // Exit with a non-zero status to signal the failure.
                System.exit(1);
            }
        }

        // ... class methods follow ...
    }

For more details, see Installation on Windows.

Python

Before you can use Parser classes, you need to import them from the msparser package:

    import msparser

Alternatively, you can selectively import a subset of classes:

    from msparser import ms_mascotresfile, ms_peptidesummary

If Mascot Parser Python libraries are not in the current directory and are not installed in a system-wide path, you must also add the directory of msparser.py and _msparser.dll or _msparser.so to the module search path by appending the directory to sys.path:

    # On Windows:
    import sys
    # Raw string, so that backslashes are not treated as escape characters.
    sys.path.append(r'C:\path\to\msparser\python_files')
    import msparser

    # On Unix:
    import sys
    sys.path.append('/path/to/msparser/python_files')
    import msparser

sys.path.append must precede the import msparser statement.

For more details, see Python toolkit installation.
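
Windows paths contain backslashes, which Python treats as escape characters in ordinary string literals (for example, `\t` is a tab), so a raw string is the safer way to write them. The following is a minimal sketch of the search-path step wrapped in a helper. The name import_from_dir is hypothetical and not part of Mascot Parser; the helper only uses the standard sys and importlib modules.

```python
import importlib
import sys


def import_from_dir(module_name, directory=None):
    """Import module_name, optionally adding directory to the search path.

    Returns the imported module, or None if the import fails.
    """
    if directory is not None and directory not in sys.path:
        # The directory must be on sys.path before the import runs.
        sys.path.append(directory)
    try:
        return importlib.import_module(module_name)
    except ImportError:
        return None


raw_path = r'C:\path\to\msparser\python_files'
tab_in_raw = '\t' in raw_path        # False: backslashes are kept
tab_in_plain = '\t' in 'C:\tools'    # True: "\t" was read as a tab

msparser = import_from_dir('msparser', raw_path)
if msparser is None:
    print('msparser could not be imported; check the directory '
          'and the native library')
```

Returning None instead of raising lets the caller decide how to report a missing installation.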

All Mascot Parser classes are defined in the msparser package. For example, to create a new ms_mascotresfile object:

    resfile = msparser.ms_mascotresfile(filename)

C#

You need to use the /r:matrix_science.msparser.dll command-line option with csc.exe when you compile your program. You also need to put msparsercs.dll and matrix_science.msparser.dll in the directory containing your executable when you run your program. See Compiling examples from the command line.

All Mascot Parser classes live in the matrix_science.msparser namespace. You will either need to import the namespace or explicitly state the namespace before each class name:

    // Import the namespace at the top of the source file.
    using matrix_science.msparser;

    // Later, inside a method:
    ms_mascotresfile resfile = new ms_mascotresfile(filename);

    // Or explicitly:
    matrix_science.msparser.ms_mascotresfile resfile = new matrix_science.msparser.ms_mascotresfile(filename);

For more details, see Installation on Windows.

Getting basic information from the results file

Opening a results file is very easy. Simply create an ms_mascotresfile object, passing the file name as the only parameter:

C++
    ms_mascotresfile resfile("F981123.dat");
Perl
    my $resfile = msparser::ms_mascotresfile->new("F981123.dat");
Java and C#
    ms_mascotresfile resfile = new ms_mascotresfile("F981123.dat");
Python
    resfile = msparser.ms_mascotresfile("F981123.dat")
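
The examples above do not show how to detect a file that failed to open. The following is a minimal sketch of such a check. It assumes that ms_mascotresfile provides isValid() and getLastErrorString(); neither is described in this section, so confirm them against the class reference. The names open_results and FakeResfile are hypothetical. FakeResfile is only a stand-in so that the sketch runs without Mascot Parser installed.

```python
def open_results(resfile_class, filename):
    """Open filename and return the results object, or raise on failure.

    resfile_class is expected to be msparser.ms_mascotresfile.
    """
    resfile = resfile_class(filename)
    if not resfile.isValid():
        # Assumed API: the error text is available from the object itself.
        raise IOError(resfile.getLastErrorString())
    return resfile


class FakeResfile:
    """Stand-in for ms_mascotresfile; its behaviour is invented."""

    def __init__(self, filename):
        self.filename = filename

    def isValid(self):
        return self.filename.endswith('.dat')

    def getLastErrorString(self):
        return 'cannot open ' + self.filename


resfile = open_results(FakeResfile, 'F981123.dat')
```

With the real library, the call would be open_results(msparser.ms_mascotresfile, "F981123.dat").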

All data in the results file is accessible using Mascot Parser classes and methods. For example, the search parameters can be found by using the params() method of the resfile object, which returns an ms_searchparams object. The following code can be used to find the MODS entry:

C++
    std::string mods = resfile.params().getMODS();
Perl
    my $mods = $resfile->params->getMODS;
Java
    String mods = resfile.params().getMODS(); 
Python
    mods = resfile.params().getMODS()
C#
    string mods = resfile._params().getMODS();

It is also possible to read the raw data from the results file. For example, you can read the MODS entry from the "parameters" section directly:

C++
    std::string mods = resfile.getSectionValueStr(ms_mascotresfile::SEC_PARAMETERS, "MODS");
Perl
    my $mods = $resfile->getSectionValueStr($msparser::ms_mascotresfile::SEC_PARAMETERS, "MODS"); 
Java
    String mods = resfile.getSectionValueStr(ms_mascotresfile.SEC_PARAMETERS,"MODS"); 
Python
    mods = resfile.getSectionValueStr(msparser.ms_mascotresfile.SEC_PARAMETERS, "MODS")
C#
    string mods = resfile.getSectionValueStr(ms_mascotresfile.section.SEC_PARAMETERS, "MODS");

The ms_mascotresfile class has many more methods for reading 'raw' information from the results file. In general, it is easier, faster, and less error-prone to use objects instead of directly reading the character strings from the file. Using objects will also guard your application from changes in the underlying data format.

Getting protein information

ms_mascotresfile provides only a low-level view of the results file. You need to create an ms_proteinsummary or ms_peptidesummary object to access protein hits and peptide matches. Which one is needed depends on the type of the search results: a Protein Summary should be used with a peptide mass fingerprint search, while Peptide Summary is more convenient for MS-MS searches.

The easiest and recommended way to choose is to use get_ms_mascotresults_params(). This is a helper function of the resfile object that returns default flags and parameters to use in order to re-create the same report shown by Mascot Server. The function is called with a single argument: an ms_mascotoptions object. If you are running the program on the Mascot Server, this should be fetched from the ms_datfile object; otherwise you can simply pass an empty ms_mascotoptions object.

The following example demonstrates how to fetch the default parameters, and how to select which type of report to open. The scriptName parameter can generally be ignored outside Mascot Server.

C++
    // If running on Mascot Server:

    ms_datfile datfile("../config/mascot.dat");

    bool         usePeptideSummary;
    unsigned int flags, flags2, minPepLenInPepSummary;
    int          maxHitsToReport;
    double       minProbability, ignoreIonsScoreBelow;

    std::string scriptName = resfile.get_ms_mascotresults_params(
        datfile.getMascotOptions(),
        &flags,
        &minProbability,
        &maxHitsToReport,
        &ignoreIonsScoreBelow,
        &minPepLenInPepSummary,
        &usePeptideSummary,
        &flags2
    );

    if (usePeptideSummary) {
        ms_peptidesummary pepsum(
            resfile, flags, minProbability, maxHitsToReport, "", 
            ignoreIonsScoreBelow, minPepLenInPepSummary, "", flags2
        );

        /* Call pepsum methods... */
    } else {
        ms_proteinsummary proteinsum(
            resfile, flags, minProbability, maxHitsToReport
        );

        /* Call proteinsum methods... */
    }
    // If running outside Mascot Server:

    std::string scriptName = resfile.get_ms_mascotresults_params(
        ms_mascotoptions(),
        &flags,
        &minProbability,
        &maxHitsToReport,
        &ignoreIonsScoreBelow,
        &minPepLenInPepSummary,
        &usePeptideSummary,
        &flags2
    );

Perl
    # If running on Mascot Server:

    my $datfile = msparser::ms_datfile->new('../config/mascot.dat');
    my ($scriptName, 
        $flags, 
        $minProbability, 
        $maxHitsToReport, 
        $ignoreIonsScoreBelow, 
        $minPepLenInPepSummary, 
        $usePeptideSummary, 
        $flags2
    ) = $resfile->get_ms_mascotresults_params($datfile->getMascotOptions);

    if ($usePeptideSummary) {
        my $pepsum = msparser::ms_peptidesummary->new(
            $resfile, $flags, $minProbability, $maxHitsToReport, '', 
            $ignoreIonsScoreBelow, $minPepLenInPepSummary, '', $flags2
        );

        # Call $pepsum methods...
    } else {
        my $proteinsum = msparser::ms_proteinsummary->new(
            $resfile, $flags, $minProbability, $maxHitsToReport
        );

        # Call $proteinsum methods...
    }
    # If running outside Mascot Server:

    my ($scriptName, 
        $flags, 
        $minProbability, 
        $maxHitsToReport, 
        $ignoreIonsScoreBelow, 
        $minPepLenInPepSummary, 
        $usePeptideSummary, 
        $flags2
    ) = $resfile->get_ms_mascotresults_params(new msparser::ms_mascotoptions);

Java
    // If running on Mascot Server:

    ms_datfile datfile = new ms_datfile("../config/mascot.dat");

    int[]    flags = {0};
    double[] minProbability = {0};
    int[]    maxHitsToReport = {0};
    double[] ignoreIonsScoreBelow = {0};
    int[]    minPepLenInPepSummary = {0};
    boolean[] usePeptideSummary = {false};
    int[]    flags2 = {0};

    String scriptName = resfile.get_ms_mascotresults_params(
        datfile.getMascotOptions(),
        flags,
        minProbability,
        maxHitsToReport,
        ignoreIonsScoreBelow,
        minPepLenInPepSummary,
        usePeptideSummary,
        flags2
    );

    if (usePeptideSummary[0]) {
        ms_peptidesummary pepsum = new ms_peptidesummary(
            resfile, flags[0], minProbability[0], maxHitsToReport[0], "",
            ignoreIonsScoreBelow[0], minPepLenInPepSummary[0], "", flags2[0]
        );

        /* Call pepsum methods... */
    } else {
        ms_proteinsummary proteinsum = new ms_proteinsummary(
            resfile, flags[0], minProbability[0], maxHitsToReport[0], "", ""
        );

        /* Call proteinsum methods... */
    }
    // If running outside Mascot Server:

    String scriptName = resfile.get_ms_mascotresults_params(
        new ms_mascotoptions(),
        flags,
        minProbability,
        maxHitsToReport,
        ignoreIonsScoreBelow,
        minPepLenInPepSummary,
        usePeptideSummary,
        flags2
    );
Python
    # If running on Mascot Server:

    datfile = msparser.ms_datfile('../config/mascot.dat')
    (scriptName, 
     flags, 
     minProbability, 
     maxHitsToReport, 
     ignoreIonsScoreBelow, 
     minPepLenInPepSummary, 
     usePeptideSummary, 
     flags2) = resfile.get_ms_mascotresults_params(datfile.getMascotOptions())

    if usePeptideSummary:
        pepsum = msparser.ms_peptidesummary(
            resfile, flags, minProbability, maxHitsToReport, '', 
            ignoreIonsScoreBelow, minPepLenInPepSummary, '', flags2
            )

        # Call pepsum methods...
    else:
        proteinsum = msparser.ms_proteinsummary(
            resfile, flags, minProbability, maxHitsToReport
            )

        # Call proteinsum methods...
    # If running outside Mascot Server:

    (scriptName, 
     flags, 
     minProbability, 
     maxHitsToReport, 
     ignoreIonsScoreBelow, 
     minPepLenInPepSummary, 
     usePeptideSummary, 
     flags2) = resfile.get_ms_mascotresults_params(msparser.ms_mascotoptions())
C#
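    // If running on Mascot Server:

    ms_datfile datfile = new ms_datfile("../config/mascot.dat");

    // Types mirror the C++ declarations; check them against the C# binding.
    uint   flags, flags2, minPepLenInPepSummary;
    int    maxHitsToReport;
    double minProbability, ignoreIonsScoreBelow;
    bool   usePeptideSummary;
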
    string scriptName = resfile.get_ms_mascotresults_params(
        datfile.getMascotOptions(),
        out flags,
        out minProbability,
        out maxHitsToReport,
        out ignoreIonsScoreBelow,
        out minPepLenInPepSummary,
        out usePeptideSummary,
        out flags2
    );

    if (usePeptideSummary)
    {
        ms_peptidesummary pepsum = new ms_peptidesummary(
            resfile, flags, minProbability, maxHitsToReport, "",
            ignoreIonsScoreBelow, (int) minPepLenInPepSummary, "", flags2
        );

        /* Call pepsum methods... */
    }
    else
    {
        ms_proteinsummary proteinsum = new ms_proteinsummary(
            resfile, flags, minProbability, maxHitsToReport, "", ""
        );

        /* Call proteinsum methods... */
    }
    // If running outside Mascot Server:

    string scriptName = resfile.get_ms_mascotresults_params(
        new ms_mascotoptions(),
        out flags,
        out minProbability,
        out maxHitsToReport,
        out ignoreIonsScoreBelow,
        out minPepLenInPepSummary,
        out usePeptideSummary,
        out flags2
    );
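
The selection logic in the examples above can be wrapped in a single helper. The following is a minimal sketch in Python that uses only the calls already shown in this section. The name open_summary is hypothetical, and the fake_parser and fake_resfile objects are stand-ins with invented values so that the sketch runs without Mascot Parser installed.

```python
from types import SimpleNamespace


def open_summary(parser, resfile, options=None):
    """Create the summary object matching the default Mascot Server report.

    parser is the imported msparser module. Returns a (kind, summary) pair,
    where kind is 'peptide' or 'protein'.
    """
    if options is None:
        # Outside Mascot Server an empty ms_mascotoptions object is enough.
        options = parser.ms_mascotoptions()

    (script_name, flags, min_probability, max_hits_to_report,
     ignore_ions_score_below, min_pep_len_in_pep_summary,
     use_peptide_summary, flags2) = resfile.get_ms_mascotresults_params(options)

    if use_peptide_summary:
        summary = parser.ms_peptidesummary(
            resfile, flags, min_probability, max_hits_to_report, '',
            ignore_ions_score_below, min_pep_len_in_pep_summary, '', flags2)
        return 'peptide', summary

    summary = parser.ms_proteinsummary(
        resfile, flags, min_probability, max_hits_to_report)
    return 'protein', summary


# Stand-ins so the sketch runs without Mascot Parser; all values are invented.
fake_parser = SimpleNamespace(
    ms_mascotoptions=lambda: 'options',
    ms_peptidesummary=lambda *args: ('pepsum', args),
    ms_proteinsummary=lambda *args: ('protsum', args),
)
fake_resfile = SimpleNamespace(
    get_ms_mascotresults_params=lambda options: (
        'script', 1, 0.05, 10, 0.0, 7, True, 2),
)

kind, summary = open_summary(fake_parser, fake_resfile)
```

With the real library, the call would be open_summary(msparser, resfile) outside Mascot Server, or open_summary(msparser, resfile, datfile.getMascotOptions()) on the server.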

Once you have created the summary object, you can access protein hits by calling getHit(). Here is an example of how to get the accession string of the top-scoring protein hit.

C++
    ms_protein *top_hit = pepsum.getHit(1);
    std::string acc = top_hit->getAccession();

    std::cout << acc << std::endl;

Perl
    my $top_hit = $pepsum->getHit(1);
    my $acc = $top_hit->getAccession();

    print $acc, "\n";

Java
    ms_protein top_hit = pepsum.getHit(1);
    String acc = top_hit.getAccession();

    System.out.println(acc);

Python
    top_hit = pepsum.getHit(1)
    acc = top_hit.getAccession()

    print(acc)

C#
    ms_protein top_hit = pepsum.getHit(1);
    string acc = top_hit.getAccession();
    
    Console.WriteLine(acc);
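
The same call can be repeated to walk through every hit. The following is a minimal sketch. It assumes that, in the Python binding, getHit() returns None once the hit number exceeds the last hit; this section does not state that, so verify it against the class reference. The names iter_hits, FakeProtein and FakeSummary are hypothetical, and the accession strings are invented.

```python
def iter_hits(summary):
    """Yield (hit_number, protein) pairs, starting at hit 1."""
    hit_number = 1
    protein = summary.getHit(hit_number)
    while protein is not None:
        yield hit_number, protein
        hit_number += 1
        protein = summary.getHit(hit_number)


class FakeProtein:
    """Stand-in for ms_protein."""

    def __init__(self, accession):
        self.accession = accession

    def getAccession(self):
        return self.accession


class FakeSummary:
    """Stand-in for a summary object; hits are numbered from 1."""

    def __init__(self, accessions):
        self.proteins = [FakeProtein(a) for a in accessions]

    def getHit(self, hit_number):
        if 1 <= hit_number <= len(self.proteins):
            return self.proteins[hit_number - 1]
        return None  # assumed behaviour past the last hit


accessions = [protein.getAccession()
              for _, protein in iter_hits(FakeSummary(['ACC_A', 'ACC_B']))]
```

With the real library, iter_hits(pepsum) or iter_hits(proteinsum) would be used in place of the stand-in.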


Copyright © 2022 Matrix Science Ltd.  All Rights Reserved. Generated on Thu Mar 31 2022 01:12:30