IDevResource.com - XML Channel - Professional VB6 XML by Wrox Press

Now we have an easy and platform-independent method of describing XML data, validating its type as we wish, and modifying and reading it programmatically. In effect, we have a transportable miniature database. It is no surprise, then, that once you start to work with it you will feel the need for a query mechanism. Using the DOM, you can get to each and every node in your document, but maneuvering through the hierarchies of children to find the single node you are interested in soon gets tiresome.

What we would like to have is an XML version of SQL. We would like to say "Get me all nodes of type X that have descendants of type Y". Many initiatives in this direction have been started up. There were some working groups specifying only a query language, but query mechanisms were also part of the drafts under development for transformation (XSLT) and linking technologies (XPointer). Then the W3C joined efforts with some of the working groups to specify XPath. XPath is a simple syntax to select a subset of the nodes in a document. It now has recommendation status and is used in both the XSLT and XPointer standards (as we'll see later in this chapter and in the next chapter).

Later in this chapter you will see the importance of XPath in the context of transforming one document type into another, but first we will look at using XPath as a pure querying tool. The initial release of IE5 included a basic implementation of XPath (then called XQL). Once XPath and XSLT gained recommendation status, Microsoft promised to deliver a fully compliant implementation of both, and in January 2000 it shipped a developer's preview of the MSXML library. In the appendices for XPath and XSLT, you can find exactly which features are supported in which releases.

We will work with the full version of XPath in this chapter. If you want to program for the MSXML library that originally came with IE5 (for example, because you cannot update every installation to the newer version), you are restricted to a subset of XPath. We will indicate what can be used in the earlier IE5 versions in a separate section.

Be aware that several more powerful XML query languages are still under development. These include a syntax called XQL, which has firm support from IBM, and an initiative from the W3C called XML Query, which is still in the first stages of specification. At the moment, XPath is the only one that has reached recommendation status, and it looks like it will be a long time before anything else does.

XPath Query Syntax

Before we get into the syntax of an XPath query, we have to discuss the concept of a context node. In XPath, a query is not automatically done over the whole of the content, but always has a starting point or context node. This can be any node in the node tree that constitutes the document. From this "fixed point" you can issue queries like "give me all your children". This kind of query only makes sense if there is a starting point defined. This starting point may be the root node, of course, which would query the entire document.
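The role of the context node can be sketched in a few lines of Python. This is only an illustration under stated assumptions: it uses the standard library's xml.etree.ElementTree (which supports a small subset of XPath) as a stand-in for MSXML, and the sample document is made up.

```python
import xml.etree.ElementTree as ET

# Hypothetical sample document; fromstring() returns the root element.
doc = ET.fromstring(
    "<book>"
    "<chapter><title>One</title><para/></chapter>"
    "<chapter><title>Two</title></chapter>"
    "</book>"
)

# "*" asks for all child elements of whatever node the query starts from.
from_root = [child.tag for child in doc.findall("*")]

# The same query, issued from the first chapter, gives a different answer.
first_chapter = doc.find("chapter")
from_chapter = [child.tag for child in first_chapter.findall("*")]

print(from_root)     # children of the book element
print(from_chapter)  # children of the first chapter element
```

The query text is identical in both calls; only the starting point changes, and with it the result.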

Different Axes

In its simplest form, an XPath query consists of an axis and a node test, separated by a double colon:

descendant::TABLE

This query would translate to plain English as: "Get the TABLE elements from all descendants (children, children's children, etc.) of the context node". The first part of this query, descendant, is called the axis of the query. The second part, TABLE, is called the node test. The axis is the searching direction; if a node along the specified axis conforms to the node test, it is included in the result set. These patterns can be very complex and can have subqueries in them. We will look at that later. First, we will list all available axes that can be used in a query:

Axis	Description
child	All direct children of the context node. Excludes attributes.
descendant	All children and children's children etc… Excludes attributes.
parent	The direct parent (and only the direct parent) of the context node (if any).
ancestor	All ancestors of the context node. Always includes the root node (unless the root node is the context node).
following-sibling	All siblings to the context node that appear later in the document.
preceding-sibling	All siblings to the context node that appear earlier in the document.
following	All nodes that come after the context node in document order, excluding its descendants.
preceding	All nodes that come before the context node in document order, excluding its ancestors.
attribute	Contains the attributes of the context node.
namespace	Contains the namespace nodes of the context node. This includes an entry for the default namespace and the implicitly declared XML namespace.
self	Only the context node itself.
descendant-or-self	All descendants and the context node itself.
ancestor-or-self	All ancestors and the context node itself.

The ancestor, descendant, following, preceding and self axes partition the document: together these five axes contain all nodes of the tree (except attributes and namespaces), and they do not overlap. An ancestor is therefore never on the preceding axis, and a descendant is never on the following axis, as illustrated in the following diagram:
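The partition can also be checked mechanically. The following Python sketch computes the five axes by hand for one context node and confirms that together they cover every element exactly once. It rests on assumptions: it uses xml.etree.ElementTree from the standard library rather than MSXML, it looks at element nodes only (no text or comment nodes), and the sample document and helper names are invented for the illustration.

```python
import xml.etree.ElementTree as ET

doc = ET.fromstring("<a><b><c/><d><e/></d><f/></b><g/></a>")
order = list(doc.iter())                           # all elements in document order
parent = {child: p for p in order for child in p}  # child -> parent lookup

def ancestors(node):
    found = []
    while node in parent:
        node = parent[node]
        found.append(node)
    return found

def descendants(node):
    return list(node.iter())[1:]                   # iter() yields the node itself first

def preceding(node):
    skip = set(ancestors(node))                    # ancestors start earlier but are excluded
    return [n for n in order[:order.index(node)] if n not in skip]

def following(node):
    skip = set(descendants(node))                  # descendants come later but are excluded
    return [n for n in order[order.index(node) + 1:] if n not in skip]

context = doc.find(".//d")
axes = {
    "self": [context],
    "ancestor": ancestors(context),
    "descendant": descendants(context),
    "preceding": preceding(context),
    "following": following(context),
}
names = {axis: [n.tag for n in nodes] for axis, nodes in axes.items()}
print(names)
```

With d as the context node, every one of the seven elements ends up on exactly one of the five axes.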

Different Node Tests

The sample we showed before used a literal name (TABLE) as a node test. This is only one of the ways to specify what a selected node should look like. Other valid values are:

- * is true for any node of the principal type. Every axis has its own principal node type: for most axes it is 'element', but for the attribute axis it is 'attribute' and for the namespace axis it is 'namespace'.

- node() is true for any node, whatever its type.

- text() is true for all text nodes.

- comment() is true for all comment nodes.

- processing-instruction() is true for all processing instruction nodes.

These node type tests take no arguments, with one exception: processing-instruction() can be passed a literal. If an argument is passed, the node test is only true for a processing instruction that has a name equal to the argument.

The following are examples of XPath queries using different axes and node tests. This selects all descendant elements from the context node:

descendant::*

This selects all namespace nodes of the context node:

namespace::*

The result includes the default namespace, the xml namespace, any namespaces that are declared in the context node, and any namespaces declared in ancestors of the context node that have not been overruled by declarations in their children. The overruling of a namespace happens when one element declares a prefix for a certain URI and a child node declares a namespace with the same prefix, but with another URI. In this case, the first declaration is hidden and becomes invisible to nodes that are descendants of the element with the second declaration.

Finally, this query selects all comment nodes that are a direct child of the context node:

child::comment()

Building a Path

Several of the XPath expressions we have seen up until now can be appended to each other to form a longer expression. This is done in a way similar to building a full directory path from several directory names: by separating them with forward slashes. The first expression in the path is evaluated in the original context; the result set from this expression forms the context for the next. Each of the nodes in the result set is used as context for the expression that follows, and all the results of each query are combined into one result set at the end. This would work as follows. This command selects the parent element of all name elements along the descendant axis of our context node:

descendant::name/parent::*

This selects all text nodes from paragraph elements that are children of chapter elements that are children of book elements that are children of our context node:

child::book/child::chapter/child::paragraph/child::text()
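The step-by-step evaluation described above (each result node becomes the context for the next step, and the partial results are combined) can be sketched in Python. This is an illustration, not how any particular processor is implemented: it assumes xml.etree.ElementTree as a stand-in for MSXML, handles child-axis steps only, and uses a made-up document and the hypothetical helper name evaluate_path.

```python
import xml.etree.ElementTree as ET

library = ET.fromstring(
    "<library>"
    "<book><chapter><paragraph>one</paragraph><paragraph>two</paragraph></chapter></book>"
    "<book><chapter><paragraph>three</paragraph></chapter></book>"
    "</library>"
)

def evaluate_path(context, steps):
    """Evaluate a list of child-axis steps one at a time."""
    current = [context]
    for step in steps:
        combined = []
        for node in current:                   # every result becomes a context node
            for match in node.findall(step):   # run the next step from that context
                if match not in combined:      # the combined set holds each node once
                    combined.append(match)
        current = combined
    return current

stepwise = evaluate_path(library, ["book", "chapter", "paragraph"])
direct = library.findall("book/chapter/paragraph")  # the same path in one expression
texts = [p.text for p in stepwise]
print(texts)
```

Evaluating the three steps one by one yields the same elements, in the same order, as handing the whole path to the library in one go.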

Absolute vs. Relative Paths

Just as with directory paths, we can make the XPath expression absolute by prefixing a slash. This sets the expression context to the document root. This is not the root element (compare with the documentElement property of the DOMDocument object in the DOM), but the parent of the root element (compare with the DOMDocument object itself). This example would select all attributes on the root element:

/child::*/attribute::*

However, the next example would select nothing, because the document root cannot carry attributes:

/attribute::*

Abbreviated Form / IE5 Compatible Form

The abbreviated notation of XPath is intended to keep queries short. But the most important reason to learn the shorthand notation is that it was the full notation (according to the working draft) at the moment the first release of IE5 hit the shops. In fact, it wasn't even called XPath back then, but was part of the XSL specification, which was later split up into three parts. (More on that later in this chapter.) That's why only the shorthand syntax of XPath is implemented in IE5. (In January 2000, Microsoft released a preview of the newer version of the library, which will support the full XPath specification.) The main rules for the abbreviated syntax are:

Shorthand Rule	Example
The child axis is the default axis	TABLE equals child::TABLE
The attribute axis can be abbreviated to the prefix @	@name equals attribute::name
The self axis can be abbreviated to .	. equals self::node()
The parent axis can be abbreviated to ..	.. equals parent::node()
Descendants can be reached with //, which is short for /descendant-or-self::node()/	//* selects the same nodes as /descendant::* and .//* selects the same nodes as descendant::*

So these XPath expressions are valid in the IE5 implementation. The first command returns all ID attributes from TABLE elements in the whole document:

//TABLE/@ID

While this returns all text nodes that are children of PARAGRAPH elements that are children of CHAPTER elements that are children of the context element:

CHAPTER/PARAGRAPH/text()
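To try the shorthand rules from the table above without MSXML at hand, a sketch in Python may help. It assumes the standard library's xml.etree.ElementTree, which happens to implement only part of the abbreviated syntax and cannot return attribute or text nodes, so reading the ID attribute is emulated in ordinary Python. The sample document is invented.

```python
import xml.etree.ElementTree as ET

doc = ET.fromstring(
    "<DOC>"
    "<TABLE ID='t1'><ROW/><ROW/></TABLE>"
    "<SECTION><TABLE ID='t2'><ROW/></TABLE></SECTION>"
    "</DOC>"
)

children = doc.findall("TABLE")     # child axis is the default: child::TABLE
anywhere = doc.findall(".//TABLE")  # .// reaches all descendants of the context node
parents = doc.findall(".//ROW/..")  # .. steps up to the parent of each ROW
itself = doc.findall(".")           # . is the context node

# ElementTree cannot select attribute nodes, so //TABLE/@ID is emulated by
# selecting the TABLE elements and reading the attribute from each of them.
table_ids = [table.get("ID") for table in anywhere]
print(table_ids)
```

Note that TABLE on its own finds only the direct child, while .//TABLE also finds the table nested inside SECTION.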

Selecting Subsets

Now we have seen most of the basic elements of building XPaths. There is only one more to discuss: predicates. Predicates are a way to select a subset from a result set in an XPath (or part of an XPath). An XPath with a predicate looks like this:

axis::node-test[predicate-expression]

The axis and node test we have already seen. Now the predicate expression gets appended in square brackets. Basically, what the predicate does is place a filter on the result set. For each node in the set, the XPath processor will test the predicate expression.

The Expression is True/False

If the expression evaluates to true, the node remains in the result set; if it evaluates to false, the node is removed. The predicate can contain special XPath functions (we will see those later, although we already met text(), comment() etc. as node tests), numeric values and XPath expressions. This XPath expression would return the second child element named chapter from the context node:

child::chapter[position() = 2]

The position() function returns the position of the context node in its set. The set is the result of the node test child::chapter. For the first node in the set, position() will return 1, for the second 2, etc. The expression position() = 2 evaluates to true only for the second chapter element found.

The Expression Returns a Number

If the expression evaluates to a numerical value n, it is only true for the nth node. If the value is 2, only the second node in the set will remain in the set; the rest will be removed. The next example will return only the first chapter element found among the children of the context node:

child::chapter[1]

The number can also be the result of a calculation. The last() function returns the number of nodes in the set that is currently being filtered. Using this numeric value we can select the last chapter:

child::chapter[last()]

The Expression Returns a Node Set

If the result of the expression is a node set, the context node is included if there are nodes in the node set. The context node is removed if the returned node set is empty. The expression can itself be an XPath expression (with axes, node tests and predicates). The inner XPath is evaluated with each node of the outer XPath result as its context. This is a powerful concept; it allows us to make sub-querying constructions. The next example selects only those chapter elements that have para elements among their children:

child::chapter[child::para]

The outer XPath expression selects all chapter elements from the children of the context node. Then, taking each of these chapter elements as context, it tries to select para elements from their children. The chapters that have an empty set of results are removed from the result set of the outer XPath expression.
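The three kinds of predicate result (true/false, number, node set) can be summed up in a small Python sketch. It is a simplified model under stated assumptions, not a real XPath engine: predicates are written as Python callables that receive the node, its position and the set size, and the helper name apply_predicate and the sample document are hypothetical.

```python
import xml.etree.ElementTree as ET

def apply_predicate(nodes, predicate):
    """Filter a node list following the three predicate rules."""
    kept = []
    size = len(nodes)
    for position, node in enumerate(nodes, start=1):
        result = predicate(node, position, size)
        if isinstance(result, bool):            # true/false: used as is
            keep = result
        elif isinstance(result, (int, float)):  # number n: true only for the nth node
            keep = (result == position)
        else:                                   # node set: true if it is not empty
            keep = len(result) > 0
        if keep:
            kept.append(node)
    return kept

book = ET.fromstring(
    "<book>"
    "<chapter n='1'><para/></chapter>"
    "<chapter n='2'/>"
    "<chapter n='3'><para/><para/></chapter>"
    "</book>"
)
chapters = book.findall("chapter")                              # child::chapter

second = apply_predicate(chapters, lambda node, pos, size: pos == 2)  # [position() = 2]
first = apply_predicate(chapters, lambda node, pos, size: 1)          # [1]
last = apply_predicate(chapters, lambda node, pos, size: size)        # [last()]
with_para = apply_predicate(
    chapters, lambda node, pos, size: node.findall("para"))           # [child::para]

print([c.get("n") for c in with_para])
```

The boolean check comes first on purpose: in Python, True and False are also integers, and they must not be mistaken for a position.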

This query selects all message elements that are descendants of the context node and have an ID attribute. Note that the results of this query are the message elements, not ID attributes:

descendant::message[attribute::ID]

Here a node, the attribute confidentiality, is compared with a literal string value:

descendant::message[attribute::confidentiality = 'secret']

In these cases, XPath compares the string value of the node with the literal string value. If they are identical, the expression is true. If the literal is numerical, the string value of the node is converted to a numerical value and then compared. If a node set is compared with a literal value, the expression is true if one of the elements in the set is identical to the literal value. If two node sets are compared, the result is true if any one node from the first can be matched with any one node from the second.

So in the example above, the predicate is true, if the context node has a confidentiality attribute with value 'secret'. Only if this is the case, the message will be selected.

Note that with this form of comparing, these two expressions are not identical:

descendant::*[attribute::* = 'Teun']

descendant::*[not(attribute::* != 'Teun')]

The first query selects all descendants that have an attribute with value 'Teun'. The second one selects only descendants with all attributes set to 'Teun' (!= means 'not equal to'); note that this also matches descendants that have no attributes at all. If you don't immediately understand this, try to figure out when this query evaluates to true:

descendant::*[attribute::* != 'Teun']

It selects all descendants that have an attribute that does not have the value 'Teun'. The reverse of this is selecting all descendants that have no attribute with a value other than 'Teun', which is identical to selecting only descendants that have all attributes set to value 'Teun'. In expressions like these, you can use the following operators:

=	Equal to.
!=	Not equal to.
<, <=, >, >=	Less than, less than or equal to, greater than, greater than or equal to.
and, or	Logical and, or.
+, -, *	Addition, subtraction, multiplication. Because - can be part of a valid name and * can be used to indicate an arbitrary name, you have to make sure they cannot be misinterpreted by leaving white space before the operator.
div	Division (floating point).
mod	Remainder of a truncating division.
|	Union of two node sets (creates a new node set holding all nodes in the two node sets).
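The difference between = and != on node sets, discussed before the operator table, is easy to get wrong, so here is a small Python sketch of the rule "true if any node matches". It assumes xml.etree.ElementTree for parsing; the helper names and the sample data are invented for the illustration.

```python
import xml.etree.ElementTree as ET

def any_equal(values, literal):
    """node-set = literal: true if at least one node has that value."""
    return any(value == literal for value in values)

def any_not_equal(values, literal):
    """node-set != literal: true if at least one node has another value."""
    return any(value != literal for value in values)

people = ET.fromstring(
    "<people>"
    "<p first='Teun' last='Smith'/>"   # one attribute matches, one does not
    "<p first='Teun' nick='Teun'/>"    # all attributes match
    "<p/>"                             # no attributes at all
    "</people>"
)

report = []
for person in people:
    values = list(person.attrib.values())
    report.append((
        any_equal(values, "Teun"),          # [@* = 'Teun']
        any_not_equal(values, "Teun"),      # [@* != 'Teun']
        not any_not_equal(values, "Teun"),  # [not(@* != 'Teun')]
    ))

for row in report:
    print(row)
```

The first element satisfies both = and !=, which shows that the two are not each other's opposite. The last element has no attributes, so it fails the = test yet passes the not(... != ...) test.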

The filtered result set returned by an XPath expression with a predicate expression can be further filtered by appending another predicate to it. This example selects the fifth employee that has a function child element with the value 'manager':

descendant::employee[function = 'manager'][5]

We first select all employee nodes along the descendant axis, and then filter them with the [function = 'manager'] predicate. From this filtered result set, we then keep only the fifth element with the predicate [5].

The following example looks very much the same, but selects the fifth employee element, and only if it has a child element of type function with the value 'manager'. Otherwise it will return an empty node set:

descendant::employee[5][function = 'manager']
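Why the order of the two predicates matters can be shown with plain Python lists. This is a sketch of the filtering order only, with made-up employee data; it does not involve an XPath processor.

```python
from collections import namedtuple

Employee = namedtuple("Employee", "name function")

# Hypothetical employees, in document order.
staff = [
    Employee("e1", "manager"), Employee("e2", "clerk"),
    Employee("e3", "manager"), Employee("e4", "manager"),
    Employee("e5", "clerk"), Employee("e6", "manager"),
    Employee("e7", "manager"),
]

def is_manager(employee):
    return employee.function == "manager"

# [function = 'manager'][5]: filter on function first, then take position 5.
managers = [e for e in staff if is_manager(e)]
fifth_manager = managers[4:5]

# [5][function = 'manager']: take position 5 first, then filter on function.
fifth_employee = staff[4:5]
fifth_if_manager = [e for e in fifth_employee if is_manager(e)]

print([e.name for e in fifth_manager])
print([e.name for e in fifth_if_manager])
```

In this data the fifth manager exists, but the fifth employee is a clerk, so the second form comes back empty.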

Built-in Functions

As we have already seen, functions that perform complex operations are very handy when writing predicates, if not absolutely necessary. Some of them have already appeared in the samples presented. We will show some important functions here; all other built-in functions specified by the XPath recommendation are listed in Appendix C.

Node Set Functions

last()

The last() function returns the index number of the last node in the context. For example, this command selects the chapter elements (along the child axis, which is the default axis in the shorthand notation) that have exactly 5 paragraph children:

chapter[paragraph[last() = 5]]

position()

The position() function returns the position of the current context node in the current result set. For example, this command selects the chapter children that have a fifth paragraph:

chapter[paragraph[position() = 5]]

Note that here we create a predicate to filter the results of the outer expression, and this predicate uses an XPath expression that also has a predicate. This recursive use of XPath expressions in predicates is a powerful way to create sub-queries.

count(node set)

The count() function returns the number of nodes in the node set passed to it. This seems identical to the last() function, but it isn't; the context it works on is different. It can be used to do more or less the same things, but the syntax is different. This example selects the chapters with exactly five paragraph children (identical to the example for the last() function):

chapter[count(paragraph) = 5]

Whereas this selects the chapters with five or more paragraph children (identical to the example for the position() function):

chapter[count(paragraph) >= 5]
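The two count() selections can be mimicked in Python to see which chapters each one keeps. This sketch assumes xml.etree.ElementTree, which has no count() function of its own, so the counting is done in Python; the helper names and the generated sample chapters are hypothetical.

```python
import xml.etree.ElementTree as ET

def chapter_with(label, paragraphs):
    """Build a chapter element holding the given number of paragraph children."""
    chapter = ET.Element("chapter", {"n": label})
    for _ in range(paragraphs):
        ET.SubElement(chapter, "paragraph")
    return chapter

book = ET.Element("book")
book.extend([chapter_with("a", 4), chapter_with("b", 5), chapter_with("c", 6)])

def count_paragraphs(chapter):
    return len(chapter.findall("paragraph"))   # plays the role of count(paragraph)

chapters = book.findall("chapter")
exactly_five = [c.get("n") for c in chapters if count_paragraphs(c) == 5]
five_or_more = [c.get("n") for c in chapters if count_paragraphs(c) >= 5]

print(exactly_five, five_or_more)
```

Only the chapter with five paragraphs passes the first test, while the second also keeps the chapter with six.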

id(object)

The id() function returns nodes that have the specified ID attribute. If the object passed to the function is a node set, each of the elements is converted to its string value. The function then returns all elements in the document that have one of the ID values in the set.

If the passed object is anything else, the query parser tries to convert it to a string and returns the element from the document that has this string for an ID. This can, by definition, be only one element. An example that passes a node set:

id(//book[@publisher = 'WROX']/@authors)

This query returns all nodes that have an ID that matches the content of the authors attributes on books that have their publisher attribute set to 'WROX'. This kind of query can be extremely powerful. However, it demands that the document is validated against a schema or DTD, because without validation the processor cannot know which attributes are IDs. For doing things like this with non-validated documents, see the section on using keys.

namespace-uri(node-set)

If your application has to act only on information in a specific namespace (this is in fact very probable as soon as you are building real applications), you will love the namespace-uri() function. It returns a string containing the URI of the namespace of the passed node set. Normally, the node set you pass will only contain one node. In fact, if you pass a node set containing multiple nodes, the function will use the first node in the set. So, if the node you pass is an element of type mydata:chapter, the function will look for the declaration of the mydata namespace and will return the value of the URI used there.

String Functions

For the handling of strings, several functions are included. We will not get into these in very much depth. Most are what you would expect from string handling functions. They cover concatenation, comparing and manipulating strings, and selecting a substring from a string. We will show just a few functions here; refer to Appendix C for the complete list.

string(object)

This function converts the passed object to a string. This may be a Boolean value that is converted to 'true', or a number value converted to its string value (i.e. the number 3 would be converted to the string "3"). If a node set is passed, the first node in the set is used.

starts-with(string, string)

This function checks whether the first string starts with the second string. It returns true if so, otherwise false. For example, this query returns all employee elements that have a last-name attribute that starts with an 'A':

descendant::employee[starts-with(@last-name, 'A')]

translate(string, string, string)

The translate function takes a string and, character-by-character, translates characters which match the second string into the corresponding characters in the third string. This is the only way to convert from lower to upper case in XPath. That would look like this (with extra white space added for readability). This code would translate the employee last names to upper case and then select those employees whose last names begin with A.

descendant::employee[
    starts-with(
        translate(@last-name,
            "abcdefghijklmnopqrstuvwxyz",
            "ABCDEFGHIJKLMNOPQRSTUVWXYZ"),
        "A"
    )
]

If the second string has more characters than the third string, these extra characters will be removed from the first string. If the third string has more characters than the second string, the extra characters are ignored.
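The rules for strings of unequal length are easiest to see in a small re-implementation. The following Python function is a sketch of the behaviour of translate() as described in the XPath 1.0 recommendation; the name xpath_translate is made up, and the code is not taken from any XPath processor.

```python
def xpath_translate(source, from_chars, to_chars):
    """Mimic XPath translate(): map, drop or keep each character of source."""
    mapping = {}
    for index, char in enumerate(from_chars):
        if char in mapping:
            continue  # if a character is listed twice, the first occurrence wins
        # Characters without a counterpart in to_chars are marked for removal.
        mapping[char] = to_chars[index] if index < len(to_chars) else None

    result = []
    for char in source:
        if char not in mapping:
            result.append(char)           # not mentioned: copied unchanged
        elif mapping[char] is not None:
            result.append(mapping[char])  # mentioned with a counterpart: replaced
        # mentioned without a counterpart: dropped
    return "".join(result)

print(xpath_translate("bar", "abc", "ABC"))       # plain replacement
print(xpath_translate("--aaa--", "abc-", "ABC"))  # "-" has no counterpart
print(xpath_translate("abc", "a", "AXYZ"))        # extra characters in the third string
```

The second call shows the removal rule (the hyphens disappear), and the third shows that surplus characters in the third string have no effect.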

Number Functions

As for strings, a set of functions is available for number handling, but we will not list them all here. They are available in Appendix C. We will show a few of the most important and instructive examples.

number(object)

The number() function converts any passed value to a number. Its behaviour depends on the type of the passed parameter. Some possible situations:

- If a string is passed, the value of the string is converted to the mathematical value that it represents (following the IEEE 754 standard).

- If a node set is passed, it is first converted to a string (as if using the string() function). Then the string is converted to a number.

The number function has no support for language-specific formats. The string value passed in should be of a language neutral format.
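What "language neutral" means in practice can be sketched in Python. The function below follows the string-to-number rule of the XPath 1.0 recommendation as far as the author of this sketch understands it: optional surrounding white space, an optional minus sign, digits with an optional decimal point, and NaN for everything else. The name xpath_number is hypothetical.

```python
import math
import re

# Optional white space, optional minus sign, then "12", "12.", "12.5" or ".5".
_XPATH_NUMBER = re.compile(r"[ \t\r\n]*-?([0-9]+(\.[0-9]*)?|\.[0-9]+)[ \t\r\n]*")

def xpath_number(text):
    """Convert a string to a number the way XPath 1.0 number() is specified."""
    if _XPATH_NUMBER.fullmatch(text):
        return float(text.strip(" \t\r\n"))
    return math.nan  # anything else is "not a number"

print(xpath_number(" 12.5 "))  # white space around the number is allowed
print(xpath_number("-.5"))     # a leading digit is optional
print(xpath_number("1,5"))     # decimal comma: NaN
print(xpath_number("1e3"))     # exponent notation: NaN
```

A decimal comma, a thousands separator or exponent notation all result in NaN, so such values have to be normalized before they end up in the document.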

sum(node set)

The sum() function returns the sum of the numerical values of all passed nodes. The numerical value is the result of the conversion of their string values. For example, this query selects the industry elements that have customer elements as children, whose totalturnover attributes sum to an amount larger than 1 million:

industry[sum(customer/@totalturnover) > 1000000]
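The selection on summed turnover can be imitated in Python to make the conversion step visible. The sketch assumes xml.etree.ElementTree, which offers no sum() function, so the summing happens in Python; the element and attribute names follow the description above, while the data and the helper name turnover_sum are invented.

```python
import xml.etree.ElementTree as ET

market = ET.fromstring(
    "<market>"
    "<industry name='steel'>"
    "<customer totalturnover='700000'/><customer totalturnover='450000'/>"
    "</industry>"
    "<industry name='paper'>"
    "<customer totalturnover='300000'/><customer/>"
    "</industry>"
    "</market>"
)

def turnover_sum(industry):
    # Only customers that carry the attribute contribute a node to the set;
    # each attribute's string value is converted to a number before adding.
    customers = industry.findall("customer[@totalturnover]")
    return sum(float(c.get("totalturnover")) for c in customers)

large = [i.get("name") for i in market.findall("industry")
         if turnover_sum(i) > 1000000]

print(large)
```

The customer without a totalturnover attribute simply adds nothing, just as a missing attribute adds no node to the set that sum() receives.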

round(number)

The round() function is a typical number function. It rounds a floating point value to the nearest integer value. Other ways of making an integer from a floating point value are floor() and ceiling().

Boolean Functions

The functions that handle Boolean values are not very special. The only really useful one is the not() function, which converts a Boolean value to its opposite. Other than that, there are the true() and false() functions that always return true and false respectively, and the lang() function that can be used to check the language of the content (if this is indicated with the xml:lang attribute).

IE5 Conformance

IE5 implements a subset of XPath. If you are developing for the MSXML objects in the initial IE5 release, you have to know which XPath features are implemented and which are not. Microsoft has committed to implementing the full standard in all later versions. It is unclear what backward compatibility will exist for syntax elements that are not part of the W3C recommendation. Here we will show the differences between the IE5 implementation and the W3C recommendation 1.0.

Axes

IE5 knows only the abbreviated syntax for axis and node test. You cannot use the syntax with a double colon. This limits the number of axes, because not all XPath axes have an abbreviated form (for example namespace, ancestor, following, preceding).

Functions

Not all of the built-in functions of XPath are supported in IE5. The most notable difference is the last() function which is called end() in IE5. Also, many functions are not supported at all. Here is a full list of the supported functions in IE5:

- node() returns all nodes (except attributes and the root node) that are children of the context node.

- pi() returns all processing instruction nodes that are children of the context node.

- text() returns all nodes that represent a text value and are children of the context node. This includes both text nodes and CDATA nodes.

Examples

In the source code download, you will find a small Visual Basic application, called XPathTester.vbp, which allows you to both practice the writing of XPath queries and test their performance. If you start the application, you will see this form:

The Query tester frame can be used once an XML document is loaded. After loading such a document (take a big one like Macbeth.xml), you can see the structure of the document in the tree view control on the left. If you select a node and type an XPath expression in the Query text box, you can execute this query, using the selected node as your context node. All matching nodes are listed in the list box. If you click a list item, the underlying XML source is shown in the text box on the right. Note that the number of seconds needed to perform the query is shown directly under the list box. Use this application to practice writing queries. Notice how more specific queries perform better than very general ones. Also, queries that specify the structural relations of elements are much faster than queries that specify the text content of elements and attributes.