Using the toolkit from Perl, Java, Python and C#

The matrix_science namespace in Perl, Java, Python and C#

All Mascot Parser classes are defined in the matrix_science namespace. When you use Mascot Parser from Perl, Java, Python or C#, these classes are available in a package, module or namespace, as described below.

Perl

All classes are available in the msparser package. For example, the class matrix_science::ms_mascotresfile is accessible as msparser::ms_mascotresfile.

Java

All classes are available in the matrix_science.msparser package. For example, the class matrix_science::ms_mascotresfile is accessible as matrix_science.msparser.ms_mascotresfile.

Python

All classes are available in the msparser module. For example, the class matrix_science::ms_mascotresfile is accessible as msparser.ms_mascotresfile.

C#
All classes are available in the matrix_science.msparser namespace. For example, the class matrix_science::ms_mascotresfile is accessible as matrix_science.msparser.ms_mascotresfile.

Using enumerated values and static const ints in Perl, Java, Python and C#

Using enumerated values and static const ints in Perl, Java and Python

In the documentation for ms_mascotresfile, the following 'enum' definition is used:

    enum section { 
      SEC_PARAMETERS, 
      SEC_HEADER,
      ...
    };

These values are available in the following way in the class they are defined in:

Perl
    $msparser::ms_mascotresfile::SEC_PARAMETERS 

Java
    ms_mascotresfile.SEC_PARAMETERS 

Python
    msparser.ms_mascotresfile.SEC_PARAMETERS 
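
When debugging, it can help to translate a numeric section value back into its name. The following is a minimal sketch of one way to do that by introspection. FakeResfile is a hypothetical stand-in for msparser.ms_mascotresfile so that the snippet runs without Mascot Parser installed; it carries only the two values quoted in the enum excerpt above, numbered as a C++ enum without explicit initialisers would number them. The helper enum_names is likewise invented for this illustration.

```python
# Hypothetical stand-in for msparser.ms_mascotresfile, so the sketch runs
# without Mascot Parser. Only the two values quoted above are included.
class FakeResfile:
    SEC_PARAMETERS = 0
    SEC_HEADER = 1


def enum_names(cls, prefix):
    """Map each integer constant on cls whose name starts with prefix to its name."""
    return {
        value: name
        for name, value in vars(cls).items()
        if name.startswith(prefix) and isinstance(value, int)
    }


section_names = enum_names(FakeResfile, "SEC_")
print(section_names[FakeResfile.SEC_HEADER])
```

With the real module you would pass msparser.ms_mascotresfile instead, assuming the constants are exposed as plain integer class attributes, which the access syntax above suggests but does not guarantee.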

Using enumerated values in C#

In C#, enum values are wrapped as proper C# enums. The enum values are then available in the following way, qualified by the class and the enumeration in which they are defined:

C#
    ms_mascotresfile.section.SEC_PARAMETERS

To use enumeration values as function parameters in C#, you will need to cast the enum value to the required parameter type, usually an int or uint. For example, to use the ms_mascotresfile.willCreateCache method, the flags parameter needs to be cast to a uint:

    string cachefile;
    bool will_create_cache = ms_mascotresfile.willCreateCache(
        "F981123.dat",
        (uint) matrix_science.msparser.ms_mascotresfile.FLAGS.RESFILE_USE_CACHE,
        new matrix_science.msparser.ms_mascotoptions().getCacheDirectory(),
        out cachefile
    );

Type equivalence between C++ and Perl, Java, Python and C#

Some values are converted automatically between C++ and the calling code. If a function returns, say, std::string, this is equivalent to a native string object.

In short:

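| C++ | Perl | Java | Python | C# |
|-----|------|------|--------|----|
| std::string, const std::string &, char *, const char * | ordinary Perl string | String | non-Unicode string object | string |
| int, long, float, double | ordinary number | int, long, float, double | ordinary number object | int, long, float, double |
| bool | ordinary number (0 means false, any other number means true) | boolean | Bool | bool |
| std::string & | usually a return value; see Multiple return values in Perl, Java, Python and C# | StringBuffer | usually a return value; see Multiple return values in Perl, Java, Python and C# | out string |
| int & | usually a return value; see Multiple return values in Perl, Java, Python and C# | int[] | usually a return value; see Multiple return values in Perl, Java, Python and C# | out int |
| unsigned int & | usually a return value; see Multiple return values in Perl, Java, Python and C# | long[] | usually a return value; see Multiple return values in Perl, Java, Python and C# | out uint |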

Some keywords have a different interpretation:

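| C++ | Perl | Java | Python | C# |
|-----|------|------|--------|----|
| const | can be ignored | roughly equivalent to final, but see special case const std::string & | can be ignored | roughly equivalent to C# const, but see special case const std::string & |
| inline, virtual | can be ignored | can be ignored | can be ignored | can be ignored |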

Passing objects to functions in Perl, Java, Python and C#

In Perl, Java, Python and C#, all objects are references. The same is not true in C++, where a function parameter can receive an object in three different ways: by value, by reference or by pointer. All three are mapped transparently onto references, so you do not need to worry about the difference. Objects can be passed to C++ functions as they are.

You can recognise the three different ways easily. The three following lines all declare a function that takes an object as a parameter:

    void func1(ms_mascotresfile resfile);
    void func2(ms_mascotresfile &resfile);
    void func3(ms_mascotresfile *resfile);

These three functions would all take an object reference in Perl, Java, Python and C#:

Perl
    my $resfile = msparser::ms_mascotresfile->new(...);
    func1($resfile);
    func2($resfile);
    func3($resfile);

Java
    ms_mascotresfile resfile = new ms_mascotresfile(...);
    func1(resfile);
    func2(resfile);
    func3(resfile);

Python
    resfile = msparser.ms_mascotresfile(...)
    func1(resfile)
    func2(resfile)
    func3(resfile)

C#
    ms_mascotresfile resfile = new ms_mascotresfile(...);
    func1(resfile);
    func2(resfile);
    func3(resfile);

For example, take a function such as ms_fileutilities::getLastModificationTime():

    static time_t getLastModificationTime(const char *filename, ms_errs *err=NULL);

The function expects an ms_errs object as a parameter. You can simply pass an object reference:

Perl
    my $errs = msparser::ms_errs->new();
    my $time = msparser::ms_fileutilities::getLastModificationTime($filename, $errs);

    if (!$errs->isValid()) {
        ...
    }

Java
    ms_errs errs = new ms_errs();
    int time = ms_fileutilities.getLastModificationTime(filename, errs);

    if (!errs.isValid()) {
        ...
    }

Python
    errs = msparser.ms_errs()
    time = msparser.ms_fileutilities.getLastModificationTime(filename, errs)

    if not errs.isValid() :
        ...

C#
    ms_errs errs = new ms_errs();
    long time = ms_fileutilities.getLastModificationTime(filename, errs);

    if (!errs.isValid()) {
        ...
    }

Using STL vector classes vectori, vectord and VectorString in Perl, Java, Python and C#

Some functions take or return arrays of values. In C++, these are called vectors, and each vector can only store values of one particular type. std::vector is the "Standard Template Library" (STL) class for vectors. std::vector<int> can only store integer values, while std::vector<std::string> can only store string values. An example function that both returns a vector and takes vectors as arguments is getAllProteinsWithThisPepMatch(). When calling such functions, you must create a vector object of the correct type:

Perl
    my $vectorOfInts    = new msparser::vectori;       # For std::vector<int>
    my $vectorOfLongs   = new msparser::vectorl;       # For std::vector<long>
    my $vectorOfDoubles = new msparser::vectord;       # For std::vector<double>
    my $vectorOfBools   = new msparser::vectorb;       # For std::vector<bool>
    my $vectorOfStrings = new msparser::VectorString;  # For std::vector<std::string>

Java and C#
    vectori vectorOfInts    = new vectori();           // For std::vector<int>
    vectorl vectorOfLongs   = new vectorl();           // For std::vector<long>
    vectord vectorOfDoubles = new vectord();           // For std::vector<double>
    vectorb vectorOfBools   = new vectorb();           // For std::vector<bool>
    VectorString vectorOfStrings = new VectorString(); // For std::vector<std::string>

Python
    vectorOfInts    = msparser.vectori()      # For std::vector<int>
    vectorOfLongs   = msparser.vectorl()      # For std::vector<long>
    vectorOfDoubles = msparser.vectord()      # For std::vector<double>
    vectorOfBools   = msparser.vectorb()      # For std::vector<bool>
    vectorOfStrings = msparser.VectorString() # For std::vector<std::string>

The different vector classes share a common interface, as shown next.

Perl

Vectors are similar to Perl arrays, but with two differences: appending and modifying items is more restricted, and vectors are strictly typed. Strict typing means that if you try to append an integer to VectorString, for example, you will get a runtime exception. See Catching C++ exceptions in Perl, Java, Python and C#.

    # Create a vector of size 0. (In this example, we'll use integer vectors.)
    my $vector = new msparser::vectori;       
    # Create a vector of any size, say 20. Items will be initialised to undef.
    my $vector2 = new msparser::vectori(20);

    # Append a value at the end of the vector. 
    $vector->push(100);
    $vector->push(200);

    # Get the number of items in the vector. Vector indices run from 0
    # to $size - 1.
    my $size = $vector->size();

    # Get item at any index. If the index is out of range (negative or
    # greater than $size - 1), an exception will be thrown. 
    my $val = $vector->get(1);

    # Set the value of item at a given index. After this call, $vector
    # is [200, 200].
    $vector->set(0, 200);

    # Remove the last item of the vector and return its value. If 
    # $vector->size() is zero, this will throw an exception.
    $val = $vector->pop();

    # Predicate testing whether the vector is empty (equivalent to
    # testing $vector->size == 0).
    if ($vector->empty) { ... }

    # Iterate over all items in the vector.
    for my $i (0 .. $vector->size-1) {
        print $vector->get($i), "\n";
    }

    # Clear the vector, i.e. remove all of its elements.
    $vector->clear();

Java

STL vectors are similar to Java Vectors; std::vector<int> is analogous to Vector<Integer>. However, there is no compile-time type checking, which means that if you try to append values to the vector that are not of the correct type, a runtime exception will be thrown. See Catching C++ exceptions in Perl, Java, Python and C#.

    // Create a vector of size 0. (In this example, we'll use integer vectors.)
    vectori vector = new vectori();
    // Create a vector of any size, say 20. Items will be initialised to null.
    vectori vector2 = new vectori(20);

    // Append a value at the end of the vector. 
    vector.push(100);
    vector.push(200);

    // Get the number of items in the vector. Vector indices run from 0
    // to size - 1.
    int size = vector.size();

    // Get item at any index. If the index is out of range (negative or
    // greater than size - 1), an exception will be thrown. 
    int val = vector.get(1);

    // Set the value of item at a given index. After this call, 'vector'
    // contains values [200, 200].
    vector.set(0, 200);

    // Remove the last item of the vector and return its value. If 
    // vector.size() is zero, this will throw an exception.
    val = vector.pop();

    // Predicate testing whether the vector is empty (equivalent to
    // testing vector.size() == 0).
    if (vector.empty()) { ... }

    // Iterate over all items in the vector.
    for (int i = 0; i != vector.size(); i++)
        System.out.println(vector.get(i));

    // Clear the vector, i.e. remove all of its elements.
    vector.clear();

Python

Vectors are similar to Python arrays, but with two differences: appending and modifying items is more restricted, and vectors are strictly typed. Strict typing means that if you try to append an integer to VectorString, you will get a runtime exception. See Catching C++ exceptions in Perl, Java, Python and C#.

    # Create a vector of size 0. (In this example, we'll use integer vectors.)
    vector = msparser.vectori() 
    # Create a vector of any size, say 20. Items will be initialised to None.
    vector2 = msparser.vectori(20)

    # Append a value at the end of the vector. If the value is not of
    # the correct type (integer for vectori, string for VectorString, etc.),
    # an exception will be thrown. See below how to catch it.
    vector.append(100)
    vector.append(200)

    # Get the number of items in the vector. Vector indices run from 0
    # to size - 1.
    size = len(vector)

    # Get item at any index. If the index is out of range (negative or
    # greater than size - 1), an exception will be thrown. 
    val = vector[1]

    # Set the value of item at a given index. After this call, vector
    # is [200, 200]. If the index is out of range, an exception will be
    # thrown.
    vector[0] = 200

    # Remove the last item of the vector and return its value. If 
    # len(vector) is zero, this will throw an exception.
    val = vector.pop_back()

    # Predicate testing whether the vector is empty (equivalent to
    # testing len(vector) == 0).
    if vector.empty():
        ...

    # Iterate over all items in the vector.
    for item in vector:
        print(item)

    # Clear the vector, i.e. remove all of its elements.
    vector.clear()
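
If your data starts out in an ordinary Python list, it has to be copied into a vector before it can be passed to a function that expects one. The sketch below does this with the append() method shown above. The helper name fill_vector is invented for this illustration, and a plain list stands in for msparser.vectori() so that the snippet runs without Mascot Parser installed.

```python
def fill_vector(vector, values):
    """Append every item of values to vector and return the vector."""
    for value in values:
        vector.append(value)
    return vector


# A plain list stands in for msparser.vectori() here; both offer append().
stand_in = fill_vector([], [100, 200])
print(len(stand_in))
```

With the real module, the call would presumably look like fill_vector(msparser.vectori(), [100, 200]); as noted above, appending a value of the wrong type would raise an exception.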

C#

STL vectors are similar to C# Lists; std::vector<int> is analogous to List<int>. However, there is no compile-time type checking, which means that if you try to append values to the vector that are not of the correct type, a runtime exception will be thrown. See Catching C++ exceptions in Perl, Java, Python and C#.

    // Create a vector of size 0. (In this example, we'll use integer vectors.)
    vectori vector = new vectori();
    // Create a vector of any size, say 20. Items will be initialised to null.
    vectori vector2 = new vectori(20);

    // Append a value at the end of the vector. 
    vector.Add(100);
    vector.Add(200);

    // Get the number of items in the vector. Vector indices run from 0
    // to size - 1.
    int size = vector.Count;

    // Get item at any index. If the index is out of range (negative or
    // greater than size - 1), an exception will be thrown. 
    int val = vector[1];

    // Set the value of item at a given index. After this call, 'vector'
    // contains values [200, 200].
    vector[0] = 200;

    // Remove the last item of the vector and return its value. If 
    // vector.Count is zero, this will throw an exception.
    val = vector[size-1];
    vector.RemoveAt(size-1);

    // Predicate testing whether the vector is empty.
    if (vector.Count == 0) { ... }

    // Iterate over all items in the vector.
    for (int i = 0; i != vector.Count; i++)
        Console.WriteLine(vector[i]);

    // Clear the vector, i.e. remove all of its elements.
    vector.Clear();

Here is an example of how to call a method that takes vectors of multiple types as arguments.

Perl
    my $start = new msparser::vectori;
    my $end   = new msparser::vectori;
    my $pre   = new msparser::VectorString;
    my $post  = new msparser::VectorString;
    my $frame = new msparser::vectori;
    my $multiplicity = new msparser::vectori;
    my $db    = new msparser::vectori;
    my $accessions = $pepsum->getAllProteinsWithThisPepMatch(
        1, 1, $start, $end, $pre, $post, $frame, $multiplicity, $db
    );

In the API documentation, the method getAllProteinsWithThisPepMatch() returns a VectorString. However, any vector returned from a function is automatically converted into an array reference in Perl.

You can access the elements of each vector by using the get() method, and calling size() returns the number of elements in the vector. For instance:

    for my $i (0 .. $multiplicity->size()-1) {
        print $multiplicity->get($i), "\n";
    }

    # This assumes $db->size() == scalar(@$accessions).
    for my $i (0 .. $#$accessions) {
        print $db->get($i), '::', $$accessions[$i], "\n";
    }

It is easy to convert STL vectors into Perl arrays:

    sub stl_to_array { [ map { $_[0]->get($_) } 0 .. $_[0]->size-1 ] }

    # Usage (assuming $db is a vectori object as above):
    my $db_arr = stl_to_array($db);

    # And now the loop is more symmetrical:
    for my $i (0 .. $#$accessions) {
        print $$db_arr[$i], '::', $$accessions[$i], "\n";
    }

Java
    vectori      start = new vectori();
    vectori      end   = new vectori();
    VectorString pre   = new VectorString();
    VectorString post  = new VectorString();
    vectori      frame = new vectori();
    vectori      multiplicity = new vectori();
    vectori      db    = new vectori();
    VectorString accessions = pepsum.getAllProteinsWithThisPepMatch(
        1, 1, start, end, pre, post, frame, multiplicity, db
    );

You can access the elements of each vector by using the get() method, and calling size() returns the number of elements in the vector. For instance:

    for (int i = 0; i != multiplicity.size(); i++) 
        System.out.println(multiplicity.get(i));

    // This assumes db.size() == accessions.size().
    for (int i = 0; i != accessions.size(); i++)
        System.out.println(db.get(i) + "::" + accessions.get(i));

Python
    start = msparser.vectori()
    end   = msparser.vectori()
    pre   = msparser.VectorString()
    post  = msparser.VectorString()
    frame = msparser.vectori()
    multiplicity = msparser.vectori()
    db    = msparser.vectori()
    accessions = pepsum.getAllProteinsWithThisPepMatch(
        1, 1, start, end, pre, post, frame, multiplicity, db
    )

You can then iterate over the items in each vector:

    for m in multiplicity:
        print(m)

    # This assumes db.size() == accessions.size().
    for i in range(accessions.size()):
        print("%d :: %s" % (db[i], accessions[i]))
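
Because the Python vectors are iterable, the index-based loop can also be written with zip(), with the length assumption checked explicitly. The following is a minimal sketch: plain lists stand in for the vectori and VectorString results, the accession strings and database numbers are made up, and the helper name pair_up is invented for this illustration.

```python
# Plain lists stand in for the vectori `db` and VectorString `accessions`
# returned above; the values are invented for illustration.
db = [1, 1, 2]
accessions = ["ACC_A", "ACC_B", "ACC_C"]


def pair_up(db_numbers, accession_list):
    """Pair each database number with its accession, checking the lengths match."""
    db_numbers = list(db_numbers)
    accession_list = list(accession_list)
    if len(db_numbers) != len(accession_list):
        raise ValueError("db and accessions differ in length")
    return list(zip(db_numbers, accession_list))


for db_number, accession in pair_up(db, accessions):
    print("%d :: %s" % (db_number, accession))
```

Copying into lists first also means the results no longer depend on the vector objects once the loop starts.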

C#
    vectori      start = new vectori();
    vectori      end   = new vectori();
    VectorString pre   = new VectorString();
    VectorString post  = new VectorString();
    vectori      frame = new vectori();
    vectori      multiplicity = new vectori();
    vectori      db    = new vectori();
    VectorString accessions = pepsum.getAllProteinsWithThisPepMatch(
        1, 1, start, end, pre, post, frame, multiplicity, db
    );

You can then iterate over the items in each vector; the Count property returns the number of elements in the vector. For instance:

    for (int i = 0; i != multiplicity.Count; i++) {
        Console.WriteLine(multiplicity[i]);
    }
    // This assumes db.Count == accessions.Count.
    for (int i = 0; i != accessions.Count; i++) {
        Console.WriteLine("{0}::{1}", db[i], accessions[i]);
    }

Catching C++ exceptions in Perl, Java, Python and C#

Exceptions from within Parser can be caught using exception handling mechanisms in the native language, as shown below. Exceptions will only be thrown by STL classes; Mascot Parser handles errors internally (see Error Handling).

Perl
    my $vectori = new msparser::vectori;

    # This will always throw an exception for an empty vector.
    eval { $vectori->pop() };

    if ($@) {
        print $@;
    }

Java
    vectori vector = new vectori();

    // This will always throw an exception for an empty vector.
    try { 
        vector.pop(); 
    } catch (Exception e) {
        System.out.println(e);
    }

Python
    vector = msparser.vectori()

    # This will always throw an exception for an empty vector.
    try:
        vector.pop_back()
    except Exception as e:
        print(e)
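
The try/except pattern can be wrapped in a small helper when an empty vector is an expected case rather than an error. The following is a sketch only. FakeVector is a hypothetical stand-in for msparser.vectori so that the snippet runs without Mascot Parser installed, and the exception type it raises is an assumption; the helper pop_or_default is invented for this illustration and catches Exception, as the example above does.

```python
class FakeVector:
    """Hypothetical stand-in for msparser.vectori; mimics append, empty and pop_back."""

    def __init__(self):
        self._items = []

    def append(self, value):
        self._items.append(value)

    def empty(self):
        return not self._items

    def pop_back(self):
        if not self._items:
            # The exception type raised by the real wrapper is not stated in
            # this document; IndexError is an assumption of this sketch.
            raise IndexError("pop from empty vector")
        return self._items.pop()


def pop_or_default(vector, default=None):
    """Return the last item of vector, or default if popping raises."""
    try:
        return vector.pop_back()
    except Exception:
        return default


vector = FakeVector()
print(pop_or_default(vector, -1))   # vector is empty, so the default is used
vector.append(100)
print(pop_or_default(vector, -1))   # the appended value is returned
```

Checking vector.empty() before popping avoids the exception altogether and is usually the clearer choice when the empty case is common.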

C#
    vectori vector = new vectori();
    try {
        // this will always throw an exception for an empty vector.
        vector.RemoveAt(0);
    } catch (Exception e) {
        Console.Error.WriteLine(e);
    }

Default parameters in Perl, Java, Python and C#

In C++, functions take a fixed number of arguments. However, some of these arguments can have default values. For example, the getDB() function has the following declaration:

    std::string getDB(int idx = 1) 

Whenever you see an equals sign (=) in the function documentation, the parameter next to the sign has a default value. This means that you can leave out the parameter when calling the constructor or method:

Perl

Assume $params is of type ms_searchparams.

    my $db = $params->getDB();   # Equivalent to $params->getDB(1)
    my $db2 = $params->getDB(2); # Override default parameter

    # This will always print 'yes'.
    if ($params->getDB eq $params->getDB(1)) {
        print "yes\n";
    }

Java

Assume params is of class ms_searchparams.

    String db = params.getDB();   // Equivalent to params.getDB(1)
    String db2 = params.getDB(2); // Override default parameter

    // This will always print 'yes'.
    if (params.getDB().equals(params.getDB(1))) {
        System.out.println("yes");
    }

Python

Assume params is of type ms_searchparams.

    db = params.getDB()   # Equivalent to params.getDB(1)
    db2 = params.getDB(2) # Override default parameter

    # This will always print 'yes'.
    if params.getDB() == params.getDB(1):
        print("yes")
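
Python has the same notion of default parameter values, so the C++ declaration maps onto it directly. The following sketch shows the equivalent Python signature. FakeSearchParams is a hypothetical stand-in for ms_searchparams, the database names are invented, and treating idx as 1-based is an assumption drawn from the default value of 1.

```python
class FakeSearchParams:
    """Hypothetical stand-in for ms_searchparams; database names are invented."""

    def __init__(self, databases):
        self._databases = list(databases)

    def getDB(self, idx=1):
        # Mirrors `std::string getDB(int idx = 1)`: idx is assumed to be
        # 1-based and defaults to 1 when the caller leaves it out.
        return self._databases[idx - 1]


params = FakeSearchParams(["DB_ONE", "DB_TWO"])
print(params.getDB())    # same result as params.getDB(1)
print(params.getDB(2))   # overrides the default
```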

C#

Assume _params is of class ms_searchparams.

    string db = _params.getDB();   // Equivalent to _params.getDB(1)
    string db2 = _params.getDB(2); // Override default parameter

    // This will always print 'yes'.
    if (_params.getDB().Equals(_params.getDB(1))) {
        Console.WriteLine("yes");
    }

Static functions in Perl, Java, Python and C#

Some functions in Mascot Parser are defined at class level rather than object level. In C++, C# and Java, these are called static functions, while in Perl and Python they are called class methods. Static functions can be used directly without creating an object of the class, as the following example shows:

Perl
    my ($will_create_cache, $cachefile) = msparser::ms_mascotresfile::willCreateCache(
        "F981123.dat",
        $msparser::ms_mascotresfile::RESFILE_USE_CACHE,
        msparser::ms_mascotoptions->new->getCacheDirectory(),
    );

    if ($will_create_cache) {
        print "ms_mascotresfile will use $cachefile as the cache file.\n";
    } else {
        print "ms_mascotresfile will not use the cache.\n";
    }

Java
    String[] cachefile = {""};
    boolean will_create_cache = ms_mascotresfile.willCreateCache(
        "F981123.dat",
        ms_mascotresfile.RESFILE_USE_CACHE,
        new ms_mascotoptions().getCacheDirectory(),
        cachefile
    );

    if (will_create_cache) {
        System.out.println(
            "ms_mascotresfile will use " + cachefile[0] + " as the cache file."
        );
    } else {
        System.out.println(
            "ms_mascotresfile will not use the cache."
        );
    }

Python
    will_create_cache, cachefile = msparser.ms_mascotresfile.willCreateCache(
        "F981123.dat",
        msparser.ms_mascotresfile.RESFILE_USE_CACHE,
        msparser.ms_mascotoptions().getCacheDirectory(),
        )

    if will_create_cache :
        print("ms_mascotresfile will use %s as the cache file." % cachefile)
    else :
        print("ms_mascotresfile will not use the cache.")

C#
    string cachefile;
    bool will_create_cache = ms_mascotresfile.willCreateCache(
        "F981123.dat",
        (uint) matrix_science.msparser.ms_mascotresfile.FLAGS.RESFILE_USE_CACHE,
        new matrix_science.msparser.ms_mascotoptions().getCacheDirectory(),
        out cachefile
    );

    if (will_create_cache) {
        Console.WriteLine(
            "ms_mascotresfile will use {0} as the cache file.",
            cachefile
        );
    } else {
        Console.WriteLine(
            "ms_mascotresfile will not use the cache."
        );
    }

Object initialising functions in Perl, Java, Python and C#

Some functions in Mascot Parser are used to initialise an object. These are not constructors, but rather take an object reference as an argument, whose member fields are then filled in.

In C++, these "object initialising functions" take a pointer or reference argument. In Perl, Java, C# and Python, you can simply pass an object reference. For example, ms_mascotresfile::getQuantitation() (which is used in the example below) takes an ms_quant_configfile object, while staticGetPercolatorFileNames() takes two vectors as arguments and fills them in.

Perl
    my $resfile = msparser::ms_mascotresfile->new($filename);
    my $qf = msparser::ms_quant_configfile->new();
    $qf->setSchemaFileName(
        "http://www.matrixscience.com/xmlns/schema/quantitation_2"
        . " ../html/xmlns/schema/quantitation_2/quantitation_2.xsd"
        . " http://www.matrixscience.com/xmlns/schema/quantitation_1"
        . " ../html/xmlns/schema/quantitation_1/quantitation_1.xsd"
    );

    if ($resfile->getQuantitation($qf)) {
        print "Quantitation name: ", $qf->getMethodByNumber(0)->getName(), "\n";
    } else {
        print "No quantitation section\n";
    }

Java
    ms_mascotresfile resfile = new ms_mascotresfile(filename);
    ms_quant_configfile qf = new ms_quant_configfile();
    qf.setSchemaFileName(
        "http://www.matrixscience.com/xmlns/schema/quantitation_2"
        + " ../html/xmlns/schema/quantitation_2/quantitation_2.xsd"
        + " http://www.matrixscience.com/xmlns/schema/quantitation_1"
        + " ../html/xmlns/schema/quantitation_1/quantitation_1.xsd"
    );

    if (resfile.getQuantitation(qf)) 
        System.out.println("Quantitation name: " + qf.getMethodByNumber(0).getName());
    else
        System.out.println("No quantitation section");

Python
    resfile = msparser.ms_mascotresfile(filename)
    qf = msparser.ms_quant_configfile()

    qf.setSchemaFileName(
        "http://www.matrixscience.com/xmlns/schema/quantitation_2"
        + " ../html/xmlns/schema/quantitation_2/quantitation_2.xsd"
        + " http://www.matrixscience.com/xmlns/schema/quantitation_1"
        + " ../html/xmlns/schema/quantitation_1/quantitation_1.xsd"
        )

    if resfile.getQuantitation(qf) :
        print("Quantitation name: %s" % qf.getMethodByNumber(0).getName())
    else :
        print("No quantitation section")

C#
    ms_mascotresfile resfile = new ms_mascotresfile(filename);
    ms_quant_configfile qf = new ms_quant_configfile();
    qf.setSchemaFileName(
        "http://www.matrixscience.com/xmlns/schema/quantitation_2"
        + " ../html/xmlns/schema/quantitation_2/quantitation_2.xsd"
        + " http://www.matrixscience.com/xmlns/schema/quantitation_1"
        + " ../html/xmlns/schema/quantitation_1/quantitation_1.xsd"
    );

    if (resfile.getQuantitation(qf)) 
        Console.WriteLine("Quantitation name: " + qf.getMethodByNumber(0).getName());
    else
        Console.WriteLine("No quantitation section");

Multiple return values in Perl, Java, Python and C#

Some functions in Mascot Parser return multiple values. In C++, this is usually handled by returning values in pointer arguments. In the API documentation, parameters to these kinds of functions are marked either as "in" or "out": "in" means the parameter is read by the function, and "out" means the function returns a value in the parameter. If the function parameter has no "in" or "out", it is assumed to be "in" by default.

In Perl, "out" values are returned as a list, while in Python the values are returned as a tuple. In C#, you mark the "out" parameters with the out keyword. In Java, however, you must use arrays of length 1, as the following examples show:

Perl

In Perl, a function can return a list of values. This means that a Mascot Parser function may also return multiple values. There are only a few functions in Mascot Parser that do so; these are documented in the API. An example is get_ms_mascotresults_params().

    my ($scriptName, 
        $flags, 
        $minProbability, 
        $maxHitsToReport, 
        $ignoreIonsScoreBelow, 
        $minPepLenInPepSummary, 
        $usePeptideSummary, 
        $flags2
    ) = $resfile->get_ms_mascotresults_params($datfile->getMascotOptions);

Returning values in a list applies only to native types, for example integers, doubles and strings. Objects documented as [in,out] parameters should be passed to the function as normal.

Java

Note that std::string & parameters require a StringBuffer and the value in the StringBuffer is modified by the Mascot Parser function. There is no need to follow the instructions below for these parameters.

In Java, methods can only have one return value. If a Mascot Parser method returns multiple native values via pointers, you must use arrays of length 1 as parameters to that function. The return values will then be at the zeroth index of the arrays. For example, get_ms_mascotresults_params() returns multiple values:

    int[]     flags = {0};
    double[]  minProbability = {0};
    int[]     maxHitsToReport = {0};
    double[]  ignoreIonsScoreBelow = {0};
    int[]     minPepLenInPepSummary = {0};
    boolean[] usePeptideSummary = {false};
    int[]     flags2 = {0};

    String scriptName = resfile.get_ms_mascotresults_params(
        datfile.getMascotOptions(),
        flags,
        minProbability,
        maxHitsToReport,
        ignoreIonsScoreBelow,
        minPepLenInPepSummary,
        usePeptideSummary,
        flags2
    );

    // Results in flags[0], minProbability[0], ...

Python

In Python, functions can return multiple values in tuples. The same applies to Mascot Parser functions that return multiple values; an example is get_ms_mascotresults_params().

    (scriptName, 
     flags, 
     minProbability, 
     maxHitsToReport, 
     ignoreIonsScoreBelow, 
     minPepLenInPepSummary, 
     usePeptideSummary, 
     flags2) = resfile.get_ms_mascotresults_params(datfile.getMascotOptions())
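
An eight-item tuple is easy to unpack in the wrong order. One option is to wrap the returned tuple in a namedtuple so that each value can be read by name. The following is a minimal sketch: the type name ResultsParams is invented for this illustration, the field names follow the variable names used above, and the tuple raw holds made-up values standing in for what get_ms_mascotresults_params() would return.

```python
from collections import namedtuple

# Field names follow the variable names used in the example above.
ResultsParams = namedtuple("ResultsParams", [
    "scriptName", "flags", "minProbability", "maxHitsToReport",
    "ignoreIonsScoreBelow", "minPepLenInPepSummary",
    "usePeptideSummary", "flags2",
])

# Invented values standing in for the tuple that
# resfile.get_ms_mascotresults_params(...) would return.
raw = ("script_name_placeholder", 0, 0.05, 50, 0.0, 7, True, 0)

results_params = ResultsParams(*raw)
print(results_params.minProbability)
print(results_params.maxHitsToReport)
```

With the real module, the construction would presumably read ResultsParams(*resfile.get_ms_mascotresults_params(datfile.getMascotOptions())). A namedtuple still unpacks like an ordinary tuple, so existing code that uses the tuple form keeps working.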

C#

In C#, methods can have multiple return parameters, specified using the out keyword. Therefore, any parameter documented as an "out" parameter in the Mascot Parser API must be preceded by the out keyword when calling the method from C#. For example, get_ms_mascotresults_params() returns multiple values:

    uint flags, flags2, minpeplen;
    int maxhits;
    double minprob, iisb;
    bool usePepsum;
    string scriptname = resfile.get_ms_mascotresults_params(
        datfile.getMascotOptions(), 
        out flags, 
        out minprob, 
        out maxhits, 
        out iisb, 
        out minpeplen, 
        out usePepsum, 
        out flags2
    );
    
    // results in flags, minprob ...

With C#, every out parameter must be supplied, even if the documentation states that there is a default value.

Maintaining object references: two rules of thumb

Innocent-looking programs that crash

Consider the following three programs in Perl, Java, and Python, respectively. (Error handling is omitted for clarity, but adding it would not fix the problem.)

Perl
    #!/usr/local/bin/perl
    use strict;
    use msparser;

    sub get_params {
        my $resfile = msparser::ms_mascotresfile->new($_[0]);
        return $resfile->params;                 # PROBLEM HERE
    }

    my $params = get_params($ARGV[0]);
    print $params->getNumberOfDatabases, "\n";   # CRASH HERE

Java
    import matrix_science.msparser.*;

    public class example {
        static {
            try { 
                System.loadLibrary("msparserj");
            } catch (UnsatisfiedLinkError e) { 
                System.exit(0); 
            }
        }

        private static ms_searchparams get_params(String filename) {
            ms_mascotresfile resfile = new ms_mascotresfile(filename);
            return resfile.params();    // PROBLEM HERE
        }

        public static void main(String argv[]) {
            ms_searchparams params = get_params(argv[0]);
            System.gc();                // See below why these are needed
            System.runFinalization();   // to trigger the crash.
            System.out.println(params.getNumberOfDatabases()); // CRASH HERE
        }
    }

Python
    #!/usr/bin/python
    import msparser
    import sys

    def get_params(filename):
        resfile = msparser.ms_mascotresfile(filename)
        return resfile.params()           # PROBLEM HERE

    params = get_params(sys.argv[1])
    print(params.getNumberOfDatabases())   # CRASH HERE

All of the programs crash at the end of the main program while trying to print the number of databases. (Go ahead and try!) In this case, the params() method of the resfile object returns an object that contains an internal reference to the parent resfile object. At the end of get_params(), resfile goes out of scope and the underlying object can be destroyed. The program will then crash when methods of the params object are called, because they rely on a resfile object that no longer exists.

The same problem can be demonstrated in C#:

C#
    using System;
    using matrix_science.msparser;
    class GarbageCollectionExample
    {
        private static ms_peptidesummary loadPeptideSummary(string filename)
        {
            ms_mascotresfile resfile = new ms_mascotresfile(filename);
            ms_datfile datfile = new ms_datfile("../config/mascot.dat");    

            ms_mascotoptions opts = new ms_mascotoptions();

            uint flags, flags2, minpeplen;
            int maxhits;
            double minprob, iisb;
            bool usePepsum;
            resfile.get_ms_mascotresults_params(opts, out flags, out minprob, 
                out maxhits, out iisb, out minpeplen, out usePepsum, out flags2);

            return new ms_peptidesummary(resfile, flags, minprob, 
                maxhits, "", iisb, (int)minpeplen, "", flags2); // PROBLEM HERE            
        }
    
        public static void Main(string[] argv)
        {
            ms_peptidesummary pepsum = loadPeptideSummary(argv[0]);
            for (int i = 1; i <= pepsum.getNumberOfHits(); i++)
            {
                ms_protein hit = pepsum.getHit(i);
                for (int e = 1; e <= hit.getNumPeptides(); e++)
                {
                    int q = hit.getPeptideQuery(e), p = hit.getPeptideP(e);
                    ms_peptide peptide = pepsum.getPeptide(q, p);
                    Console.WriteLine(peptide.getPeptideStr());   // CRASH HERE
                }
            }
        }
        
    }

This program will crash within a few iterations when calling peptide.getPeptideStr(). Again, the problem is that the ms_peptidesummary object contains an internal reference to an ms_mascotresfile object, which goes out of scope at the end of loadPeptideSummary.

If you are interested in the gritty details, see Garbage collection problems (advanced reading).

Note:
The Java example has explicit calls to the system garbage collector. Without those statements, the example program may work just fine most of the time, and then crash unexpectedly once in a blue moon. This is because the Java and C# garbage collectors run at unpredictable times, not necessarily at the end of scope.
Note:
The following is safe in Python, because if, for and while do not create a new lexical scope; they simply inherit the enclosing global or function (local) scope:
    #!/usr/bin/python
    import msparser
    import sys

    if sys.argv[1]:
        resfile = msparser.ms_mascotresfile(sys.argv[1])
        params = resfile.params()           

    if params:
        print(params.getNumberOfDatabases())  # No crash here.

Two rules of thumb

There are two easy rules of thumb to avoid crashes such as those above. In the following, suppose that $b contains an object of class B, and $a contains an object of class A.

  1. When a C++ function of class B takes a pointer or a reference as an argument, and you pass it $a, you must keep a reference to $a for as long as you use $b.
  2. When a C++ function of class B returns a pointer or a reference, and you store it in $a, you must keep a reference to $b for as long as you use $a.

There is one exception:

How do you recognise a C++ pointer or reference in the API documentation?

  ms_protein* ms_mascotresults::getHit(const int hit);
  ms_inputquery::ms_inputquery(const ms_mascotresfile &resfile, const int q);

The first line defines a function getHit that returns a pointer. The second line defines a function (actually a constructor) ms_inputquery that takes a reference as a parameter. So, be on the lookout for ampersands (&) and asterisks (*) when reading the API documentation.
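When scanning many declarations, the same check can be roughed out in a few lines of Python. This is a heuristic sketch only: lifetime_hints is a hypothetical helper, not part of Mascot Parser, and it simply looks for * or & before and inside the parameter list. Declarations such as operator overloads would fool it.

```python
import re

_POINTER_OR_REFERENCE = re.compile(r"[*&]")


def lifetime_hints(declaration):
    """Return (returns_ptr_or_ref, takes_ptr_or_ref) for one C++ declaration."""
    # Everything before the first "(" is the return type and function name.
    head, _, tail = declaration.partition("(")
    # Everything up to the last ")" is the parameter list.
    arguments = tail.rpartition(")")[0]
    returns = bool(_POINTER_OR_REFERENCE.search(head))
    takes = bool(_POINTER_OR_REFERENCE.search(arguments))
    return returns, takes


for line in (
    "ms_protein* ms_mascotresults::getHit(const int hit);",
    "ms_inputquery::ms_inputquery(const ms_mascotresfile &resfile, const int q);",
):
    print(lifetime_hints(line), line)
```

A True in the first position suggests that rule of thumb 2 applies, and a True in the second position suggests that rule of thumb 1 applies.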

In practice, the rules of thumb are easily followed by keeping the objects together in a hash, dictionary or array until they are no longer needed, or by creating a wrapper class that holds references to all affected objects. In short programs and scripts, you may wish to just make ms_mascotresfile, ms_proteinsummary and ms_peptidesummary objects global instead of lexically scoped, or wrap them in singleton classes.

Examples of the rules of thumb

The following examples are written in Python, as Python code is syntactically close to natural language, and can serve as an effective pseudocode. Statements are terminated with semicolons to make the code more familiar for Perl, Java and C# programmers.

    resfile = msparser.ms_mascotresfile(filename);
    params = resfile.params();
    q1 = msparser.ms_inputquery(resfile, 1);

Rule of thumb 2: keep a reference to resfile for as long as you use params.

Rule of thumb 1: keep a reference to resfile for as long as you use q1.

    resfile = msparser.ms_mascotresfile(filename);
    pepsum = msparser.ms_peptidesummary(resfile);
    hit = pepsum.getHit(1);

Rule of thumb 2: keep a reference to pepsum for as long as you use hit.

Rule of thumb 1: keep a reference to resfile for as long as you use pepsum.

    datfile = msparser.ms_datfile();
    dbs = datfile.getDatabases();

Rule of thumb 2: keep a reference to datfile for as long as you use dbs.

    qf = msparser.ms_quant_configfile();
    success = resfile.getQuantitation(qf);

Rule of thumb 2: keep a reference to resfile for as long as you use qf.


Copyright © 2022 Matrix Science Ltd.  All Rights Reserved. Generated on Thu Mar 31 2022 01:12:30