This document describes the semantics of the Extended XPath language (EXPath) for Concurrent Markup Hierarchies (CMH).
EXPath extends regular XPath so that expressions can select nodes in a GODDAG.
One key difference between EXPath and XPath is the return type of a location step: in EXPath, a location step evaluates to a node-set-collection, that is, one node-set per hierarchy. Consequently, the context of an expression evaluation is the same as the context of an XPath expression, with the following amendments:
xdescendant: includes all nodes in the GODDAG whose text ranges are included in the text range of the current context node, excluding the current context node.
xdescendant-or-self: is the xdescendant set of the current context node plus the current context node.
xancestor: includes all nodes in the GODDAG whose text ranges include the text range of the current context node, excluding the current context node.
xancestor-or-self: is the xancestor set of the current context node plus the current context node.
xfollowing: includes all nodes in the GODDAG whose text ranges follow the text range of the current context node.
xpreceding: includes all nodes in the GODDAG whose text ranges precede the text range of the current context node.
preceding-overlapping: includes all nodes in the GODDAG whose text ranges contain (not on the border) the start tag, but not the end tag, of the current context node.
following-overlapping: includes all nodes in the GODDAG whose text ranges contain (not on the border) the end tag, but not the start tag, of the current context node.
overlapping: is the union of the preceding-overlapping and following-overlapping sets of the current context node.
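The axis definitions above can be read as comparisons between text ranges. The following Python sketch illustrates that reading; it is not the reference implementation. The `Node` class, the half-open character offsets `start` and `end`, and the strict inequalities used for the two overlapping axes are assumptions made for illustration. Inclusion is treated as non-strict, so two nodes with identical ranges in different hierarchies are each other's xancestor and xdescendant. Text ranges are assumed to be non-empty.

```python
from dataclasses import dataclass

@dataclass(frozen=True)
class Node:
    name: str
    hierarchy: str
    start: int  # offset of the first character covered (assumed representation)
    end: int    # offset one past the last character covered

def xdescendant(ctx, nodes):
    # Nodes whose range lies inside the context range, excluding the context node.
    return [n for n in nodes if n != ctx and ctx.start <= n.start and n.end <= ctx.end]

def xancestor(ctx, nodes):
    # Nodes whose range encloses the context range, excluding the context node.
    return [n for n in nodes if n != ctx and n.start <= ctx.start and ctx.end <= n.end]

def xfollowing(ctx, nodes):
    # Nodes whose range begins at or after the end of the context range.
    return [n for n in nodes if n.start >= ctx.end]

def xpreceding(ctx, nodes):
    # Nodes whose range ends at or before the start of the context range.
    return [n for n in nodes if n.end <= ctx.start]

def preceding_overlapping(ctx, nodes):
    # The context start tag falls strictly inside n, the context end tag does not.
    return [n for n in nodes if n.start < ctx.start < n.end < ctx.end]

def following_overlapping(ctx, nodes):
    # The context end tag falls strictly inside n, the context start tag does not.
    return [n for n in nodes if ctx.start < n.start < ctx.end < n.end]

def overlapping(ctx, nodes):
    return preceding_overlapping(ctx, nodes) + following_overlapping(ctx, nodes)

# Hypothetical nodes over a ten-character text, taken from three hierarchies.
line = Node("line", "phys", 0, 10)
w = Node("w", "ling", 2, 6)
res = Node("res", "edit", 3, 5)
dmg = Node("dmg", "cond", 4, 8)
all_nodes = [line, w, res, dmg]
```

With these sample nodes, `xancestor(w, all_nodes)` contains only `line`, `xdescendant(w, all_nodes)` contains only `res`, and `following_overlapping(w, all_nodes)` contains only `dmg`.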
The following extensions of the XPath node tests are added:
text(String): the node test evaluates to true if and only if the context node is a text node in the hierarchy or hierarchies given in the string parameter. The String parameter is a comma-separated list of hierarchy names.
node(String): the node test evaluates to true if and only if the context node is a node of any type in the hierarchy or hierarchies given in the string parameter. The String parameter is a comma-separated list of hierarchy names.
*(String): the node test evaluates to true if and only if the context node is an element node in the hierarchy or hierarchies given in the string parameter. The String parameter is a comma-separated list of hierarchy names.
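As an illustration of the three parameterized node tests, the following Python sketch checks a node against a comma-separated hierarchy list. The function names, the string labels used for node kinds, and the trimming of whitespace around hierarchy names are assumptions; the specification text does not say how whitespace in the list is handled.

```python
def parse_hierarchies(param):
    """Split a comma-separated hierarchy list into a set of names (whitespace trimmed)."""
    return {name.strip() for name in param.split(",") if name.strip()}

def hierarchy_node_test(test, param, node_kind, node_hierarchy):
    """Evaluate text(String), node(String) or *(String) for one node.

    test is "text", "node" or "*"; node_kind is a label such as "text" or "element".
    """
    if node_hierarchy not in parse_hierarchies(param):
        return False
    if test == "text":
        return node_kind == "text"
    if test == "*":
        return node_kind == "element"
    if test == "node":
        return True  # any node type is accepted
    raise ValueError(f"unknown node test: {test}")
```

For example, `hierarchy_node_test("text", "ling, cond", "text", "cond")` is true, while `hierarchy_node_test("*", "ling", "text", "ling")` is false because the node is not an element.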
The following node test is added in EXPath:
leaf: the node test evaluates to true if and only if the context node is a leaf (see Data Model).
A union ("|") operation of two node-set-collections yields a node-set-collection containing one node-set per component hierarchy. Each node-set in the result is the union of the operands' node-sets for the same hierarchy.
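A minimal sketch of this per-hierarchy union, assuming a node-set-collection is modeled as a Python dictionary that maps a hierarchy name to a set of node identifiers (this representation and the function name `union` are illustrative assumptions):

```python
def union(left, right):
    """Union two node-set-collections hierarchy by hierarchy."""
    hierarchies = left.keys() | right.keys()
    # A hierarchy missing from one operand is treated as an empty node-set.
    return {h: left.get(h, set()) | right.get(h, set()) for h in hierarchies}

# Hypothetical operands: node identifiers are plain strings here.
a = {"phys": {"line1"}, "ling": {"w1"}}
b = {"ling": {"w2"}}
result = union(a, b)
```

Here `result` holds `{"line1"}` for the `phys` hierarchy and `{"w1", "w2"}` for the `ling` hierarchy.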
The last function returns a number equal to the context size from the EXPath expression evaluation context.
The position function returns a number equal to the context position from the EXPath expression evaluation context.
Function: number count(node-set-collection)
The count function returns the number of nodes in the argument node-set-collection.
Function: node-set-collection id(object)
The id function selects elements by their unique ID, as the id function in XPath does. Note that the result type is node-set-collection.
Function: string local-name(node-set-collection?)
The local-name function applies the local-name function of XPath to each node-set in the node-set-collection argument and returns the concatenation of all returned strings, separated by a blank space.
Function: string namespace-uri(node-set-collection?)
The namespace-uri function applies the namespace-uri function of XPath to each node-set in the node-set-collection argument and returns the concatenation of all returned strings, separated by a blank space.
Function: string name(node-set-collection?)
The name function applies the name function of XPath to each node-set in the node-set-collection argument and returns the concatenation of all returned strings, separated by a blank space.
Function: string hierarchy(), boolean hierarchy(String)
In its first form, the hierarchy function returns the document hierarchy ID of the context node. In its second form, it returns true if the context node belongs to the hierarchy given as parameter, and false otherwise.
Function: string string(object?)
The string function converts an object to a string as follows:
A node-set-collection is converted to a string by returning the string-value of the node in the node-set that is first in document order. If the node-set is empty, an empty string is returned.
An object argument different from a node-set-collection is converted to a string as by the string function of XPath.
If the argument is omitted, it defaults to a node-set-collection with a node-set containing the context node as its only member.
The semantics of the other string functions in the core functions library of XPath is unchanged.
Function: string toLowerCase(string)
The toLowerCase function returns the lower case string version of the string taken as parameter.
Function: string toUpperCase(string)
The toUpperCase function returns the upper case string version of the string taken as parameter.
Function: boolean boolean(object)
The boolean function converts its argument to a boolean as follows:
A node-set-collection is true if and only if it contains a non-empty node-set.
An object argument different from a node-set-collection is converted to a boolean as in function boolean of XPath.
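The node-set-collection rule can be sketched in Python as follows, assuming a collection is modeled as a dictionary from hierarchy name to a set of nodes (an illustrative representation, not part of the specification):

```python
def to_boolean(collection):
    """True if and only if at least one node-set in the collection is non-empty."""
    return any(len(node_set) > 0 for node_set in collection.values())
```

Under this model, a collection whose node-sets are all empty converts to false, and so does a collection with no node-sets at all.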
The semantics of the other boolean functions in the core functions library of XPath is unchanged.
Function: boolean matches(string, string)
The matches function returns true if and only if the first string argument matches the regular expression given in the second string argument; otherwise, including when the regular expression is invalid, it returns false. For details of the regular expression syntax, see the java.lang.String.matches() documentation.
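The behaviour described above can be approximated in Python as a sketch. Java's String.matches() succeeds only when the whole string matches the pattern, so `re.fullmatch` is used as the closest analogue. Python and Java regular expression syntaxes differ in some details, so this illustrates the semantics and is not a drop-in equivalent.

```python
import re

def matches(text, pattern):
    """Return True only if the whole of text matches pattern; False on an invalid pattern."""
    try:
        return re.fullmatch(pattern, text) is not None
    except re.error:
        # An invalid regular expression yields false rather than an error.
        return False
```

For instance, `matches("damaged", "dam.*")` is true, whereas `matches("damaged", "dam")` is false because the pattern does not cover the entire string.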
Function: number number(object?)
The number function converts its argument to a number as follows:
A node-set-collection is first converted to a string as if by a call to the string function and then converted in the same way as a string argument.
An object argument different from a node-set-collection is converted to a number as by the number function of XPath.
If the argument is omitted, it defaults to a node-set-collection with a node-set containing the context node as its only member.
Function: number sum(node-set-collection)
The sum function returns the sum, over all nodes in the argument node-set-collection, of the result of converting the string-value of each node to a number.
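A minimal Python sketch of this summation, assuming the collection is modeled as a dictionary from hierarchy name to a list of node string-values and that each node appears in only one node-set (how nodes shared between hierarchies are counted is not stated here). The helper `to_number` follows the XPath 1.0 rule that a string which is not a plain decimal number converts to NaN; both function names are illustrative.

```python
import math
import re

# XPath 1.0 numbers: optional whitespace, optional minus sign, decimal digits.
_XPATH_NUMBER = re.compile(r"\s*-?(\d+(\.\d*)?|\.\d+)\s*")

def to_number(string_value):
    """Convert a string-value to a float, or NaN if it is not an XPath number."""
    if _XPATH_NUMBER.fullmatch(string_value):
        return float(string_value)
    return math.nan

def sum_collection(collection):
    """Sum the numeric value of every node string-value in every node-set."""
    return sum(to_number(v) for values in collection.values() for v in values)
```

Because NaN propagates through addition, a single non-numeric string-value makes the whole sum NaN, which is also how sum behaves in XPath 1.0.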
The semantics of the other number functions in the core functions library of XPath is unchanged.
To represent a distributed XML document we use the General Ordered-Descendant Directed Acyclic Graph (GODDAG) data structure proposed by Sperberg-McQueen and Huitfeldt. Informally, a GODDAG for a distributed XML document can be thought of as the graph that unites the DOM trees of the individual components by merging the root node and the text nodes. However, because the scopes of XML elements from different component documents may overlap, a GODDAG features one more node type, not found in DOM trees, which we call a leaf node. In a GODDAG, leaf nodes are children of the text nodes; each leaf represents a consecutive sequence of content characters that is not broken by an XML tag in any of the components of the distributed XML document. While each CMH component has its own text nodes in a GODDAG, the leaf nodes are shared among all of them.
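The leaf definition suggests a simple way to picture how leaves arise: collect the character offsets at which any component places a tag, and cut the shared text at every such offset. The Python sketch below illustrates this under stated assumptions. The function name `leaf_ranges`, the dictionary of tag offsets per hierarchy, and the half-open offset ranges are hypothetical and not taken from an actual GODDAG implementation.

```python
def leaf_ranges(text_length, tag_offsets_by_hierarchy):
    """Return (start, end) ranges of maximal character runs not broken by any tag."""
    cuts = {0, text_length}
    for offsets in tag_offsets_by_hierarchy.values():
        # Every tag position in any hierarchy ends one leaf and starts the next.
        cuts.update(o for o in offsets if 0 <= o <= text_length)
    ordered = sorted(cuts)
    return list(zip(ordered, ordered[1:]))

# Hypothetical ten-character text: one hierarchy has tags at offsets 2 and 6,
# the other at offsets 4 and 8, so their elements overlap.
leaves = leaf_ranges(10, {"ling": [2, 6], "cond": [4, 8]})
```

In this example `leaves` is `[(0, 2), (2, 4), (4, 6), (6, 8), (8, 10)]`: the overlap between the two elements produces leaves that neither hierarchy would have on its own.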
A GODDAG has the following types of nodes: the root node (unique within the GODDAG), element nodes, attribute nodes, text nodes, and leaf nodes (see the figure below). Note that, in the figure below, the root node at the bottom is the same as the root node at the top; they are drawn separately for simplicity.
The string-value of a node in a GODDAG is evaluated as the string-value of the node in its respective hierarchy.
/descendant::dmg/descendant::text()
/descendant::w[xancestor::dmg or xdescendant::dmg or overlapping::dmg]
/descendant::w[xancestor::dmg and xdescendant::dmg]
/descendant::dmg/descendant::text()[xancestor::res]
/descendant::dmg/xdescendant::w[descendant::text()[xancestor::res]]