Processing XML with Java™: A Guide to SAX, DOM, JDOM, JAXP, and TrAX

XPath is a straightforward declarative language for selecting particular subsets of nodes from an XML document. Its data model is not quite the same as DOM's data model, but that's not normally a major problem. In fact, in some cases, such as taking the string value of an element, the XPath data model is likely to be a lot closer to what you want than the DOM data model.
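To make the string-value difference concrete, here is a minimal sketch. It assumes the `javax.xml.xpath` package that ships with the JDK, which is not one of the APIs discussed in this chapter, and the class name `StringValueDemo` and the sample markup are invented for illustration. In DOM, `getNodeValue()` on an element returns null, whereas the XPath string value of an element is the concatenated text of all its descendant text nodes.

```java
import java.io.IOException;
import java.io.StringReader;
import javax.xml.parsers.DocumentBuilderFactory;
import javax.xml.parsers.ParserConfigurationException;
import javax.xml.xpath.XPathExpressionException;
import javax.xml.xpath.XPathFactory;
import org.w3c.dom.Document;
import org.w3c.dom.Element;
import org.xml.sax.InputSource;
import org.xml.sax.SAXException;

// Hypothetical demo class; not part of any library.
class StringValueDemo {

    // Parses an XML string into a DOM Document.
    static Document parse(String xml) {
        try {
            return DocumentBuilderFactory.newInstance()
                    .newDocumentBuilder()
                    .parse(new InputSource(new StringReader(xml)));
        } catch (ParserConfigurationException | SAXException | IOException e) {
            throw new IllegalStateException("Could not parse XML", e);
        }
    }

    // Returns the XPath string value of the element: the text of all
    // descendant text nodes, concatenated in document order.
    static String xpathStringValue(Element element) {
        try {
            return XPathFactory.newInstance().newXPath().evaluate("string(.)", element);
        } catch (XPathExpressionException e) {
            throw new IllegalStateException("Could not evaluate XPath", e);
        }
    }

    public static void main(String[] args) {
        Document doc = parse("<p>Hello <b>XPath</b> world</p>");
        Element p = doc.getDocumentElement();
        System.out.println(p.getNodeValue());     // prints "null": DOM elements have no node value
        System.out.println(xpathStringValue(p));  // prints "Hello XPath world"
    }
}
```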

XPath location paths are composed of one or more location steps. Each location step has an axis and a node test, and may have one or more predicates. Each location step is evaluated with respect to the context nodes determined by the previous step in the path. The axis determines the direction in which you move from the context node. The node test determines which nodes are selected along that axis, and the predicates decide which of the selected nodes are retained in the set.
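The anatomy of a location path is easier to see when it is written in unabbreviated syntax. The following sketch again assumes the JDK's `javax.xml.xpath` package; the `library`/`book`/`title` vocabulary and the class name `LocationPathDemo` are made up for the example. Each step names its axis (`child::`, `attribute::`), its node test (`book`, `title`), and, in the second step, a predicate in square brackets.

```java
import java.io.StringReader;
import java.util.ArrayList;
import java.util.List;
import javax.xml.xpath.XPathConstants;
import javax.xml.xpath.XPathExpressionException;
import javax.xml.xpath.XPathFactory;
import org.w3c.dom.NodeList;
import org.xml.sax.InputSource;

// Hypothetical demo class; the XML vocabulary is invented.
class LocationPathDemo {

    static final String XML =
            "<library>"
          + "<book year=\"1999\"><title>Old</title></book>"
          + "<book year=\"2002\"><title>New</title></book>"
          + "</library>";

    // Three steps: axis::node-test, with a predicate on the second step.
    static final String PATH =
            "/child::library/child::book[attribute::year > 2000]/child::title";

    // Evaluates the path against the XML and returns the text of each match.
    static List<String> select(String xml, String path) {
        try {
            NodeList nodes = (NodeList) XPathFactory.newInstance().newXPath()
                    .evaluate(path, new InputSource(new StringReader(xml)),
                              XPathConstants.NODESET);
            List<String> result = new ArrayList<>();
            for (int i = 0; i < nodes.getLength(); i++) {
                result.add(nodes.item(i).getTextContent());
            }
            return result;
        } catch (XPathExpressionException e) {
            throw new IllegalStateException("Could not evaluate XPath", e);
        }
    }

    public static void main(String[] args) {
        System.out.println(select(XML, PATH)); // prints "[New]"
    }
}
```

The same path in abbreviated syntax would be `/library/book[@year > 2000]/title`.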

A location path is just one kind of XPath expression. In addition to node-sets, XPath expressions can return doubles, strings, and booleans, which are pretty much the same as the Java types of the same name, with a few minor differences you normally don't have to worry about. XPath offers the usual arithmetic and relational operators for working with these data types, as well as a library of more than two dozen useful functions.
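As a rough illustration of the non-node-set return types, the sketch below evaluates a few expressions and asks for a number, a string, or a boolean. It assumes the JDK's `javax.xml.xpath` package, which maps these types to `Double`, `String`, and `Boolean`; the sample document and the class name `ExpressionTypesDemo` are invented.

```java
import java.io.StringReader;
import javax.xml.namespace.QName;
import javax.xml.xpath.XPathConstants;
import javax.xml.xpath.XPathExpressionException;
import javax.xml.xpath.XPathFactory;
import org.xml.sax.InputSource;

// Hypothetical demo class; the XML vocabulary is invented.
class ExpressionTypesDemo {

    static final String XML =
            "<library>"
          + "<book year=\"1999\"><title>Old</title></book>"
          + "<book year=\"2002\"><title>New</title></book>"
          + "</library>";

    // Evaluates an expression against the sample document and converts the
    // result to the requested XPath type.
    static Object evaluate(String expression, QName returnType) {
        try {
            return XPathFactory.newInstance().newXPath()
                    .evaluate(expression, new InputSource(new StringReader(XML)),
                              returnType);
        } catch (XPathExpressionException e) {
            throw new IllegalStateException("Could not evaluate XPath", e);
        }
    }

    public static void main(String[] args) {
        // A function returning a number (an IEEE 754 double).
        System.out.println(evaluate("count(//book)", XPathConstants.NUMBER)); // 2.0
        // Arithmetic operators: the mean of the year attributes.
        System.out.println(evaluate("sum(//book/@year) div count(//book)",
                                    XPathConstants.NUMBER));                  // 2000.5
        // A string function.
        System.out.println(evaluate("concat(//book[1]/title, '!')",
                                    XPathConstants.STRING));                  // Old!
        // A relational operator yields a boolean.
        System.out.println(evaluate("count(//book) > 1",
                                    XPathConstants.BOOLEAN));                 // true
    }
}
```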

Most XSLT processors have APIs that allow you to search XML documents with XPath expressions. The two most popular are Saxon and Xalan. Saxon's API requires a custom DOM, whereas Xalan can work with pretty much any complete and correct DOM implementation.

DOM Level 3 XPath is a developing standard for using XPath in DOM programs that can be implemented across different processors, although as yet it isn't implemented by any. It provides a reasonably simple API for saying, "Here's a document, a context node, and a location path. Find me all the nodes from the document that match."
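The "document, context node, location path" pattern can be sketched as follows. Note that this example does not use the DOM Level 3 XPath interfaces themselves; as a stand-in it uses the JDK's `javax.xml.xpath` package, and the class name `ContextNodeDemo`, the method `findNodes`, and the sample document are all hypothetical. The point is only that a relative location path is evaluated starting from whichever context node you supply.

```java
import java.io.IOException;
import java.io.StringReader;
import javax.xml.parsers.DocumentBuilderFactory;
import javax.xml.parsers.ParserConfigurationException;
import javax.xml.xpath.XPathConstants;
import javax.xml.xpath.XPathExpressionException;
import javax.xml.xpath.XPathFactory;
import org.w3c.dom.Document;
import org.w3c.dom.Node;
import org.w3c.dom.NodeList;
import org.xml.sax.InputSource;
import org.xml.sax.SAXException;

// Hypothetical demo class; a stand-in, not the DOM Level 3 XPath API.
class ContextNodeDemo {

    // Parses an XML string into a DOM Document.
    static Document parse(String xml) {
        try {
            return DocumentBuilderFactory.newInstance()
                    .newDocumentBuilder()
                    .parse(new InputSource(new StringReader(xml)));
        } catch (ParserConfigurationException | SAXException | IOException e) {
            throw new IllegalStateException("Could not parse XML", e);
        }
    }

    // Evaluates a location path relative to the given context node and
    // returns every matching node.
    static NodeList findNodes(Node contextNode, String locationPath) {
        try {
            return (NodeList) XPathFactory.newInstance().newXPath()
                    .evaluate(locationPath, contextNode, XPathConstants.NODESET);
        } catch (XPathExpressionException e) {
            throw new IllegalStateException("Could not evaluate XPath", e);
        }
    }

    public static void main(String[] args) {
        Document doc = parse(
                "<library><book><title>A</title></book>"
              + "<book><title>B</title></book></library>");
        // The context node is the second book element.
        Node secondBook = doc.getDocumentElement().getLastChild();
        NodeList titles = findNodes(secondBook, "child::title");
        for (int i = 0; i < titles.getLength(); i++) {
            System.out.println(titles.item(i).getTextContent()); // prints "B"
        }
    }
}
```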

Jaxen is a somewhat more ambitious cross-model effort to model XPath expressions themselves rather than just treating them as opaque strings. For example, Jaxen provides Java classes that represent all of the different XPath functions, enabling you to pass Java objects such as a Node or an Element directly to XPath functions such as normalize-space() or namespace-uri(). More important, Jaxen works across different XML object models including not just DOM, but also JDOM, dom4j, and ElectricXML.
