CommonSubs
MOBY::CommonSubs.pm - a set of exportable subroutines that are useful in clients and services to deal with the input/output from MOBY Services
not written yet
The following is a generalized architecture for *all* BioMOBY services showing how to parse incoming messages using the subroutines provided in CommonSubs
sub myServiceName {
my ($caller, $data) = @_;
my $MOBY_RESPONSE; # holds the response raw XML
# genericServiceInputParser
# unpacks incoming message into an array of arrayrefs.
# Each element of the array is a queryInput block, or a mobyData block
# the arrayref has the following structure:
# [SIMPLE, $queryID, $simple]
# the first element is one of the exported constants SIMPLE, COLLECTION, or SECONDARY
# the second element is the queryID (required for enumerating the responses)
# the third element is the XML::DOM for the Simple, Collection, or Parameter block
my (@inputs)= genericServiceInputParser($data);
# or fail properly with an empty response
return SOAP::Data->type('base64' => responseHeader("my.authURI.com") . responseFooter()) unless (scalar(@inputs));
# you only need to do this if you are intending to be namespace aware
# some services might not care what namespace the data is in, so long
# as there is data...
my @validNS_LSID = validateNamespaces("NCBI_gi"); # returns the LSID for each human-readable namespace name
foreach (@inputs){
my ($articleType, $qID, $input) = @{$_};
unless (($articleType == SIMPLE) && ($input)){
# in this example, we are only accepting SIMPLE types as input
# so write back an empty response block and move on to the next
$MOBY_RESPONSE .= simpleResponse("", "", $qID) ;
next;
} else {
# now take the namespace and ID from our input article
# (see pod docs for other possibilities)
my $namespace = getSimpleArticleNamespaceURI($input); # get namespace
my ($identifier) = getSimpleArticleIDs([$input]); # get ID (takes a listref, returns a list! see pod)
# here is where you do whatever manipulation you need to do
# for your particular service.
# you will be building an XML document into $MOBY_RESPONSE
}
}
return SOAP::Data->type('base64' => (responseHeader("my.authURI.com") . $MOBY_RESPONSE . responseFooter()));
}
A COMPLETE EXAMPLE OF AN EASY MOBY SERVICE
This is a service that:
CONSUMES: base Object in the GO namespace
EXECUTES: Retrieval
PRODUCES: GO_Term (in the GO namespace)
# this subroutine is called from your dispatch_with line
# in your SOAP daemon
sub getGoTerm {
my ($caller, $message) = @_;
my $MOBY_RESPONSE;
my (@inputs)= genericServiceInputParser($message); # ([SIMPLE, $queryID, $simple],...)
return SOAP::Data->type('base64' => responseHeader('my.authURI.com') . responseFooter()) unless (scalar(@inputs));
my @validNS = validateNamespaces("GO"); # ONLY do this if you are intending to be namespace aware!
my $dbh = _connectToGoDatabase();
return SOAP::Data->type('base64' => responseHeader('my.authURI.com') . responseFooter()) unless $dbh;
my $sth = $dbh->prepare(q{
select name, term_definition
from term, term_definition
where term.id = term_definition.term_id
and acc=?});
foreach (@inputs){
my ($articleType, $ID, $input) = @{$_};
unless ($articleType == SIMPLE){
$MOBY_RESPONSE .= simpleResponse("", "", $ID);
next;
} else {
my $ns = getSimpleArticleNamespaceURI($input);
(($MOBY_RESPONSE .= simpleResponse("", "", $ID)) && (next))
unless validateThisNamespace($ns, @validNS); # only do this if you are truly validating namespaces
my ($accession) = getSimpleArticleIDs($ns, [$input]); # undef if the article is not in namespace $ns
unless (defined($accession)){
$MOBY_RESPONSE .= simpleResponse("", "", $ID);
next;
}
unless ($accession =~/^GO:/){
$accession = "GO:$accession"; # we still haven't decided on whether id's should include the prefix...
}
$sth->execute($accession);
my ($term, $def) = $sth->fetchrow_array;
if ($term){
$MOBY_RESPONSE .= simpleResponse("
<moby:GO_Term namespace='GO' id='$accession'>
<moby:String namespace='' id='' articleName='Term'>$term</moby:String>
<moby:String namespace='' id='' articleName='Definition'>$def</moby:String>
</moby:GO_Term>", "GO_Term_From_ID", $ID)
} else {
$MOBY_RESPONSE .= simpleResponse("", "", $ID)
}
}
}
return SOAP::Data->type('base64' => (responseHeader("my.authURI.com") . $MOBY_RESPONSE . responseFooter));
}
CommonSubs are used to do various manipulations of MOBY Messages. They are useful on both the Client and the Service side to construct and parse MOBY Messages, and to ensure that the message structure is valid as per the API.
The module DOES NOT connect to MOBY Central for any of its functions, though it does contact the ontology server, so it requires a network connection.
Mark Wilkinson (markw at illuminae dot com)
BioMOBY Project: http://www.biomoby.org
name : genericServiceInputParser
function : For the MOST SIMPLE SERVICES that take single Simple or Collection inputs
and no Secondaries/Parameters this routine takes the MOBY message and
breaks the objects out of it in a useful way
usage : my @inputs = genericServiceInputParser($MOBY_message);
args : $message - this is the SOAP payload; i.e. the XML document containing the MOBY message
returns : @inputs - the structure of @inputs is a list of listrefs.
Each listref has three components:
1. COLLECTION|SIMPLE (i.e. constants 1, 2)
2. queryID
3. $data - the data takes several forms
a. $article XML::DOM node for Simples
<mobyData...>...</mobyData>
b. \@article XML::DOM nodes for Collections
for example, the input message:
<mobyData queryID = '1'>
<Simple>
<Object namespace=blah id=blah/>
</Simple>
</mobyData>
<mobyData queryID = '2'>
<Simple>
<Object namespace=blah id=blah/>
</Simple>
</mobyData>
will become:
(note that SIMPLE, COLLECTION, and SECONDARY are exported constants from this module)
@inputs = ([SIMPLE, 1, $DOM], [SIMPLE, 2, $DOM]) # the <Simple> block
for example, the input message:
<mobyData queryID = '1'>
<Collection>
<Simple>
<Object namespace=blah id=blah/>
</Simple>
<Simple>
<Object namespace=blah id=blah/>
</Simple>
</Collection>
</mobyData>
will become:
@inputs = ( [COLLECTION, 1, [$DOM, $DOM]] ) # the <Simple> block
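The list-of-listrefs shape described above can be walked as in the following minimal sketch. It does not load MOBY::CommonSubs: the SIMPLE and COLLECTION constants are defined locally (their values are assumed from the "COLLECTION|SIMPLE (i.e. constants 1, 2)" ordering given above), plain strings stand in for the XML::DOM nodes, and summarize_inputs is a hypothetical helper, not part of the module.

```perl
use strict;
use warnings;

# Local stand-ins for the exported constants (values are an assumption
# based on the "COLLECTION|SIMPLE (i.e. constants 1, 2)" ordering).
use constant { COLLECTION => 1, SIMPLE => 2 };

# Hypothetical helper: describe each [TYPE, queryID, data] triple.
sub summarize_inputs {
    my @inputs = @_;
    my @summary;
    foreach my $input (@inputs) {
        my ($type, $query_id, $data) = @{$input};
        if ($type == SIMPLE) {
            push @summary, "query $query_id: one Simple";
        }
        elsif ($type == COLLECTION) {
            # For Collections the third element is a listref of Simples.
            my $count = scalar @{$data};
            push @summary, "query $query_id: Collection of $count Simples";
        }
        else {
            push @summary, "query $query_id: unsupported article type";
        }
    }
    return @summary;
}

# Plain strings stand in for the XML::DOM nodes a real parser returns.
my @inputs = (
    [ SIMPLE,     1, '<Simple/>' ],
    [ COLLECTION, 2, [ '<Simple/>', '<Simple/>' ] ],
);
print "$_\n" for summarize_inputs(@inputs);
```

The point of the sketch is only the dispatch on the first element: for SIMPLE the third element is a single node, for COLLECTION it is a listref that must be dereferenced.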
name : DO NOT USE!!
function : to take a MOBY message and break the objects out of it. This is identical
to the genericServiceInputParser method above, except that it returns the data as
Objects rather than XML::DOM nodes. This is an improvement!
usage : my @inputs = serviceInputParser($MOBY_message);
args : $message - this is the SOAP payload; i.e. the XML document containing the MOBY message
returns : @inputs - the structure of @inputs is a list of listrefs.
Each listref has three components:
1. COLLECTION|SIMPLE|SECONDARY (i.e. constants 1, 2, 3)
2. queryID (undef for Secondary parameters)
3. $data - either MOBY::Client::SimpleArticle, CollectionArticle, or SecondaryArticle
name : complexServiceInputParser
function : For more complex services that have multiple articles for each input
and/or accept parameters, this routine will take a MOBY message and
extract the Simple/Collection/Parameter objects out of it in a
useful way.
usage : my $inputs = complexServiceInputParser($MOBY_message);
args : $message - this is the SOAP payload; i.e. the XML document containing the MOBY message
returns : $inputs is a hashref with the following structure:
$inputs->{$queryID} = [ [TYPE, $DOM], [TYPE, $DOM], [TYPE, $DOM] ]
Simples ------------------------
for example, the input message:
<mobyData queryID = '1'>
<Simple articleName='name1'>
<Object namespace=blah id=blah/>
</Simple>
<Parameter articleName='cutoff'>
<Value>10</Value>
</Parameter>
</mobyData>
will become:
(note that SIMPLE, COLLECTION, and SECONDARY are exported constants from this module)
$inputs->{1} = [ [SIMPLE, $DOM_name1], # the <Simple> block
[SECONDARY, $DOM_cutoff] # $DOM_cutoff= <Parameter> block
]
Please see the XML::DOM pod documentation for information about how
to parse XML DOM objects.
Collections --------------------
With inputs that have collections these are presented as a
listref of Simple article DOM's. So for the following message:
<mobyData queryID = '1'>
<Collection articleName='name1'>
<Simple>
<Object namespace=blah id=blah/>
</Simple>
<Simple>
<Object namespace=blah id=blah/>
</Simple>
</Collection>
<Parameter articleName='cutoff'>
<Value>10</Value>
</Parameter>
</mobyData>
will become
$inputs->{1} = [ [COLLECTION, [$DOM, $DOM] ], # $DOM is the <Simple> Block!
[SECONDARY, $DOM_cutoff] # $DOM_cutoff = <Parameter> Block
]
Please see the XML::DOM pod documentation for information about how
to parse XML DOM objects.
name : getArticles
function : get the Simple/Collection/Parameter articles for a single mobyData
usage : @articles = getArticles($XML)
args : raw XML or XML::DOM of a queryInput, mobyData, or queryResponse block (e.g. from getInputs)
returns : a list of listrefs; each listref is one component of the queryInput or mobyData block
a single block may consist of one or more named or unnamed
simple, collection, or parameter articles.
The listref structure is thus [name, $ARTICLE_DOM]:
e.g.: @articles = ['name1', $SIMPLE_DOM]
generated from the following sample XML:
<mobyData>
<Simple articleName='name1'>
<Object namespace=blah id=blah/>
</Simple>
</mobyData>
or : @articles = ['name1', $COLL_DOM], ['paramname1', $PARAM_DOM]
generated from the following sample XML:
<mobyData>
<Collection articleName='name1'>
<Simple>
<Object namespace=blah id=blah/>
</Simple>
<Simple>
<Object namespace=blah id=blah/>
</Simple>
</Collection>
<Parameter articleName='e value cutoff'>
<default>10</default>
</Parameter>
</mobyData>
name : getSimpleArticleIDs
function : to get the IDs of simple articles that are in the given namespace
usage : my @ids = getSimpleArticleIDs("NCBI_gi", \@SimpleArticles);
my @ids = getSimpleArticleIDs(\@SimpleArticles);
args : $Namespace - (optional) a namespace string from the MOBY namespace ontology, or undef if you don't care
\@Simples - (required) a listref of Simple XML::DOM nodes
i.e. the XML::DOM representing an XML structure like this:
<Simple>
<Object namespace="NCBI_gi" id="163483"/>
</Simple>
note : If you provide a namespace, it will return *only* the ids that are in the given namespace,
but will return 'undef' for any articles in the WRONG namespace so that you get an
equivalent number of outputs to inputs.
Note that if you call this with a single argument, this is assumed to
be \@Articles, so you will get ALL id's regardless of namespace!
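The position-preserving behaviour described in the note can be illustrated with a small stand-alone sketch. It is not the module's implementation: ids_in_namespace is a hypothetical function, each article is a plain hashref rather than an XML::DOM node, and the sample namespace and id values are made up for illustration.

```perl
use strict;
use warnings;

# Hypothetical stand-in: articles are hashrefs, not XML::DOM nodes.
sub ids_in_namespace {
    my ($namespace, $articles) = @_;

    # Single-argument form: the only argument is the listref of articles,
    # so every id is returned regardless of namespace.
    if (ref($namespace) eq 'ARRAY') {
        ($namespace, $articles) = (undef, $namespace);
    }

    my @ids;
    foreach my $article (@{$articles}) {
        my $wanted = !defined($namespace)
                  || $article->{namespace} eq $namespace;
        # undef keeps the output aligned with the input positions.
        push @ids, $wanted ? $article->{id} : undef;
    }
    return @ids;
}

my @articles = (
    { namespace => 'NCBI_gi', id => '163483' },
    { namespace => 'GO',      id => '0003700' },
);
my @gi_only = ids_in_namespace('NCBI_gi', \@articles);
my @all     = ids_in_namespace(\@articles);
printf "%d articles in, %d slots out\n", scalar @articles, scalar @gi_only;
```

Because wrong-namespace articles yield undef rather than being dropped, the caller can still pair each output slot with the input at the same index.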
name : getSimpleArticleNamespaceURI
function : to get the namespace of a simple article
usage : my $ns = getSimpleArticleNamespaceURI($SimpleArticle);
args : $Simple - (required) a single XML::DOM node representing a Simple Article
i.e. the XML::DOM representing an XML structure like this:
<Simple>
<Object namespace="NCBI_gi" id="163483"/>
</Simple>
name : simpleResponse
function : wraps a simple article in the appropriate (mobyData) structure
usage : $resp .= &simpleResponse($object, 'MyArticleName', $queryID);
args : (in order)
$object - (optional) a MOBY Object as raw XML
$article - (optional) an articleName for this article
$query - (optional, but strongly recommended) the queryID value for the
mobyData block to which you are responding
notes : as required by the API you must return a response for every input.
If one of the inputs was invalid, you return a valid (empty) MOBY
response by calling &simpleResponse(undef, undef, $queryID).
name : collectionResponse
function : wraps a set of articles in the appropriate mobyData structure
usage : return responseHeader . &collectionResponse(\@objects, 'MyArticleName', $queryID) . responseFooter;
args : (in order)
\@objects - (optional) a listref of MOBY Objects as raw XML
$article - (optional) an articleName for this article
$queryID - (optional, but strongly recommended) the mobyData ID
to which you are responding
notes : as required by the API you must return a response for every input.
If one of the inputs was invalid, you return a valid (empty) MOBY
response by calling &collectionResponse(undef, undef, $queryID).
name : responseHeader
function : print the XML string of a MOBY response header +/- serviceNotes
usage : responseHeader('illuminae.com')
responseHeader(
-authority => 'illuminae.com',
-note => 'here is some data from the service provider')
args : a string representing the service providers authority URI,
OR a set of named arguments with the authority and the
service provision notes.
caveat :
notes : returns everything required up to the response articles themselves.
i.e. something like:
<?xml version='1.0' encoding='UTF-8'?>
<moby:MOBY xmlns:moby='http://www.biomoby.org/moby'>
<moby:Response moby:authority='http://www.illuminae.com'>
name : responseFooter
function : print the XML string of a MOBY response footer
usage : return responseHeader('illuminae.com') . $DATA . responseFooter;
notes : returns everything required after the response articles themselves
i.e. something like:
</moby:Response>
</moby:MOBY>
name : getInputs
function : get the mobyData block(s) as XML::DOM nodes
usage : @queryInputs = getInputs($XML)
args : the raw XML of a <MOBY> query, or an XML::DOM document
returns : a list of XML::DOM::Node's, each is a queryInput or mobyData block.
Note : Remember that these blocks are enumerated! This is what you
pass as the third argument to the simpleResponse or collectionResponse
subroutine to associate the numbered input to the numbered response
name : getInputID
function : get the value of the queryID attribute
usage : $queryID = getInputID($XML)
args : the raw XML or XML::DOM of a queryInput or mobyData block (e.g. from getInputs)
returns : integer, or ''
Note : Inputs and Responses are coordinately enumerated!
The integer you get here is what you
pass as the third argument to the simpleResponse or collectionResponse
subroutine to associate the numbered input to the numbered response
name : DO NOT USE!!
function : get the Simple/Collection articles for a single mobyData
or queryResponse node, returning them as SimpleArticle,
SecondaryArticle, or ServiceInstance objects
usage : @articles = getArticles($XML)
args : raw XML or XML::DOM of a moby:mobyData block
returns :
name : getCollectedSimples
function : get the Simple articles collected in a moby:Collection block
usage : @Simples = getCollectedSimples($XML)
args : raw XML or XML::DOM of a moby:Collection block
returns : a list of XML::DOM nodes, each of which is a moby:Simple block
name : getInputArticles
function : get the Simple/Collection articles for each input query, in order
usage : @queries = getInputArticles($XML)
args : the raw XML of a moby:MOBY query
returns : a list of listrefs, each listref is the input to a single query.
Remember that the input to a single query may be one or more Simple
and/or Collection articles. These are provided as XML::DOM nodes.
i.e.: @queries = ([$SIMPLE_DOM_NODE], [$SIMPLE_DOM_NODE2])
or : @queries = ([$COLLECTION_DOM_NODE], [$COLLECTION_DOM_NODE2])
the former is generated from the following XML:
...
<moby:mobyContent>
<moby:mobyData>
<Simple>
<Object namespace=blah id=blah/>
</Simple>
</moby:mobyData>
<moby:mobyData>
<Simple>
<Object namespace=blah id=blah/>
</Simple>
</moby:mobyData>
</moby:mobyContent>
...
name : isSimpleArticle
function : tests XML (text) or an XML DOM node to see if it represents a Simple article
usage : if (isSimpleArticle($node)){do something to it}
input : an XML::DOM node, an XML::DOM::Document or straight XML
returns : boolean
name : isCollectionArticle
function : tests XML (text) or an XML DOM node to see if it represents a Collection article
usage : if (isCollectionArticle($node)){do something to it}
input : an XML::DOM node, an XML::DOM::Document or straight XML
returns : boolean
name : isSecondaryArticle
function : tests XML (text) or an XML DOM node to see if it represents a Secondary article
usage : if (isSecondaryArticle($node)){do something to it}
input : an XML::DOM node, an XML::DOM::Document or straight XML
returns : boolean
name : extractRawContent
function : pass me an article (Simple, or Collection) and I'll give you the
content AS A STRING - i.e. the raw XML of the contained MOBY Object(s)
usage : extractRawContent($simple)
input : the one element of the output from getArticles
returns : string
name : getNodeContentWithArticle
function : a very flexible way to get the stringified content of a node
that has the correct element and article name
or get the value of a Parameter element.
usage : @strings = getNodeContentWithArticle($node, $tagname, $articleName)
args : (in order)
$node - an XML::DOM node, or straight XML. It may even
be the entire mobyData block.
$tagname - the tagname (effectively from the Object type ontology),
or "Parameter" if you are trying to get secondaries
$articleName - the articleName that we are searching for
returns : an array of the stringified text content for each
node that matched the tagname/articleName specified.
note that each line of content is a separate element of the array.
notes : This was written for the purpose of getting the values of
String, Integer, Float, Date_Time, and other such primitives.
For example, in the following XML:
...
...
<moby:mobyContent>
<moby:mobyData>
<Simple>
<Sequence namespace=blah id=blah>
<Integer namespace='' id='' articleName="Length">3</Integer>
<String namespace='' id='' articleName="SequenceString">ATG</String>
</Sequence>
</Simple>
</moby:mobyData>
</moby:mobyContent>
...
...
would be analysed as follows:
# get $input - e.g. from genericServiceInputParser or complexServiceInputParser
@sequences = getNodeContentWithArticle($input, "String", "SequenceString");
For Parameters, such as the following
...
...
<moby:mobyContent>
<moby:mobyData>
<Simple>
<Sequence namespace=blah id=blah>
<Integer namespace='' id='' articleName="Length">3</Integer>
<String namespace='' id='' articleName="SequenceString">ATG</String>
</Sequence>
</Simple>
<Parameter articleName='cutoff'>
<Value>24</Value>
</Parameter>
</moby:mobyData>
</moby:mobyContent>
...
...
You would parse it as follows:
# get $input - e.g. from genericServiceInputParser or complexServiceInputParser
@sequences = getNodeContentWithArticle($input, "String", "SequenceString");
@cutoffs = getNodeContentWithArticle($input, "Parameter", "cutoff");
EXAMPLE :
my $inputs = complexServiceInputParser($MOBY_message);
# $inputs->{$queryID} = [ [TYPE, $DOM], [TYPE, $DOM], [TYPE, $DOM] ]
my (@enumerated) = keys %{$inputs};
my (@cutoffs, @sequences);
foreach my $no (@enumerated){
my @articles = @{$inputs->{$no}};
foreach my $article (@articles){
my ($type, $DOM) = @{$article};
if ($type == SECONDARY){
# the function returns a list, so assign to an array
@cutoffs = getNodeContentWithArticle($DOM, "Parameter", "cutoff");
} else {
@sequences = getNodeContentWithArticle($DOM, "String", "SequenceString");
}
}
}
name : validateNamespaces
function : checks the namespace ontology for the namespace lsid
usage : @LSIDs = validateNamespaces(@namespaces)
args : ordered list of either human-readable or lsid presumptive namespaces
returns : ordered list of the LSID's corresponding to those
presumptive namespaces; undef for each namespace that was invalid
name : validateThisNamespace
function : checks a given namespace against a list of valid namespaces
usage : $valid = validateThisNamespace($ns, @validNS);
args : ordered list of the namespace of interest and the list of valid NS's
returns : boolean
name : getResponseArticles
function : get the DOM nodes corresponding to individual
Simple or Collection outputs from a MOBY Response
usage : ($collections, $simples) = getResponseArticles($node)
args : $node - either raw XML or an XML::DOM::Document to be searched
returns : an array-ref of Collection article XML::DOM::Node's
an array-ref of Simple article XML::DOM::Node's
name : getServiceNotes
function : to get the content of the Service Notes block of the MOBY message
usage : getServiceNotes($message)
args : $message is either the XML::DOM of the MOBY message, or plain XML
returns : String content of the ServiceNotes block of the MOBY Message
name : getCrossReferences
function : to get the cross-references for a Simple article
usage : @xrefs = getCrossReferences($XML)
args : $XML is either a SIMPLE article (<Simple>...</Simple>)
or an object (the payload of a Simple article), and
may be either raw XML or an XML::DOM node.
returns : an array of MOBY::CrossReference objects
example :
my ($colls, $simps) = getResponseArticles($query); # returns DOM nodes
foreach (@{$simps}){
my @xrefs = getCrossReferences($_);
foreach my $xref(@xrefs){
print "Cross-ref type: ",$xref->type,"\n";
print "namespace: ",$xref->namespace,"\n";
print "id: ",$xref->id,"\n";
if ($xref->type eq "Xref"){
print "Cross-ref relationship: ", $xref->xref_type,"\n";
}
}
}
name : whichDeepestParentObject
function : select the parent node from nodeList that is
closest to the querynode
usage : ($term, $lsid) = whichDeepestParentObject($CENTRAL, $queryTerm, \@termList)
args : $CENTRAL - your MOBY::Client::Central object
$queryTerm - the object type I am interested in
\@termlist - the list of object types that I know about
returns : an ontology term and LSID as a scalar, or undef if there
is no parent of this node in the nodelist.
(note that it will only return the term if you give it
term names in the @termList. If you give it
LSID's in the termList, then both the parameters
returned will be LSID's - it doesn't back-translate...)
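The idea of "closest parent in a list" can be sketched without a MOBY Central connection. The following is an illustration only: the %parent_of table is a hypothetical, hard-coded child-to-parent map with illustrative type names, closest_known_parent is not part of the module, and whether the real routine also considers the query term itself is not stated here, so this sketch starts from its parent.

```perl
use strict;
use warnings;

# Hypothetical child => parent table; a real service would consult the
# object ontology through its MOBY::Client::Central object instead.
my %parent_of = (
    DNASequence        => 'NucleotideSequence',
    NucleotideSequence => 'GenericSequence',
    GenericSequence    => 'VirtualSequence',
    VirtualSequence    => 'Object',
);

# Walk up from $query_term and return the first ancestor found in @$known.
sub closest_known_parent {
    my ($query_term, $known) = @_;
    my %is_known = map { $_ => 1 } @{$known};

    my $term = $parent_of{$query_term};
    while (defined $term) {
        return $term if $is_known{$term};
        $term = $parent_of{$term};
    }
    return;    # no ancestor of $query_term is in the list
}

my $closest = closest_known_parent('DNASequence', [ 'Object', 'GenericSequence' ]);
print "closest known parent: $closest\n";
```

Both 'Object' and 'GenericSequence' are ancestors of the query term in this table, but the walk stops at the first match on the way up, which is the deeper of the two.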
Usage : $object->_rearrange( array_ref, list_of_arguments)
Purpose : Rearranges named parameters to requested order.
Example : $self->_rearrange([qw(SEQUENCE ID DESC)],@param);
: Where @param = (-sequence => $s,
: -desc => $d,
: -id => $i);
Returns : @params - an array of parameters in the requested order.
: The above example would return ($s, $i, $d).
: Unspecified parameters will return undef. For example, if
: @param = (-sequence => $s);
: the above _rearrange call would return ($s, undef, undef)
Argument : $order : a reference to an array which describes the desired
: order of the named parameters.
: @param : an array of parameters, either as a list (in
: which case the function simply returns the list),
: or as an associative array with hyphenated tags
: (in which case the function sorts the values
: according to @{$order} and returns that new array.)
: The tags can be upper, lower, or mixed case
: but they must start with a hyphen (at least the
: first one should be hyphenated.)
Source : This function was taken from CGI.pm, written by Dr. Lincoln
: Stein, and adapted for use in Bio::Seq by Richard Resnick and
: then adapted for use in Bio::Root::Object.pm by Steve Chervitz,
: then migrated into Bio::Root::RootI.pm by Ewan Birney.
Comments :
: Uppercase tags are the norm,
: (SAC)
: This method may not be appropriate for method calls that are
: within an inner loop if efficiency is a concern.
:
: Parameters can be specified using any of these formats:
: @param = (-name=>'me', -color=>'blue');
: @param = (-NAME=>'me', -COLOR=>'blue');
: @param = (-Name=>'me', -Color=>'blue');
: @param = ('me', 'blue');
: A leading hyphenated argument is used by this function to
: indicate that named parameters are being used.
: Therefore, the ('me', 'blue') list will be returned as-is.
:
: Note that Perl will confuse unquoted, hyphenated tags as
: function calls if there is a function of the same name
: in the current namespace:
: -name => 'foo' is interpreted as -&name => 'foo'
:
: For ultimate safety, put single quotes around the tag:
: ('-name'=>'me', '-color' =>'blue');
: This can be a bit cumbersome and I find not as readable
: as using all uppercase, which is also fairly safe:
: (-NAME=>'me', -COLOR =>'blue');
:
: Personal note (SAC): I have found all uppercase tags to
: be more manageable: it involves less single-quoting,
: the key names stand out better, and there are no method naming
: conflicts.
: The drawbacks are that it's not as easy to type as lowercase,
: and lots of uppercase can be hard to read.
:
: Regardless of the style, it greatly helps to line
: the parameters up vertically for long/complex lists.
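The behaviour described above (named parameters reordered, positional lists returned as-is, missing tags returned as undef) can be reproduced in a few lines. This is a stand-alone sketch under those stated rules, not the code taken from CGI.pm or Bio::Root: rearrange_params is a hypothetical plain function rather than a method, and it may differ from the real routine in edge cases.

```perl
use strict;
use warnings;

# Hypothetical stand-alone version of the behaviour described above.
sub rearrange_params {
    my ($order, @param) = @_;

    # Positional list: first argument has no leading hyphen, return as-is.
    return @param
        unless @param && defined $param[0] && $param[0] =~ /^-/;

    my %value_for;
    while (@param) {
        my ($tag, $value) = splice(@param, 0, 2);
        $tag =~ s/^-//;                    # drop the leading hyphen
        $value_for{ uc $tag } = $value;    # tags are case-insensitive
    }

    # Tags that were not supplied come back as undef.
    return map { $value_for{ uc $_ } } @{$order};
}

my @ordered = rearrange_params(
    [qw(SEQUENCE ID DESC)],
    -sequence => 'ATG', -desc => 'start codon', -id => 'seq1',
);
print join(', ', @ordered), "\n";    # prints "ATG, seq1, start codon"
```

Normalising every tag to uppercase is what makes -name, -NAME, and -Name interchangeable; the cost is one hash build per call, which is why the comments above caution against use in tight inner loops.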