The XML Information Set, or Infoset (http://www.w3.org/TR/xml-infoset), is a W3C recommendation that describes an abstract data set whose definitions can be used to describe well-formed XML documents (the documents don't have to be valid). These definitions are set forth so that other W3C specs can use the same terminology and not trip over each other's shoelaces. An infoset is supposed to describe the result of parsing an XML document; it can also be constructed by other means, such as from a Document Object Model (DOM) tree (http://www.w3.org/TR/xml-infoset/#intro.synthetic). Normally, you don't hear folks talk about structures in XML documents using the terms defined in this spec.

An infoset is made up of information items of 11 types, each with a set of properties. The following list briefly outlines these information items and their associated properties:

- Document information item
  Properties: all declarations processed, base URI, character encoding scheme, children, document element, notations, standalone, unparsed entities, version
- Element information item
  Properties: attributes, base URI, children, in-scope namespaces, local name, namespace attributes, namespace name, parent, prefix
- Attribute information item
  Properties: attribute type, local name, namespace name, normalized value, owner element, prefix, references, specified
- Processing instruction information item
  Properties: base URI, content, notation, parent, target
- Unexpanded entity reference information item
  Properties: declaration base URI, name, parent, public identifier, system identifier
- Character information item
  Properties: character code, element content whitespace, parent
- Comment information item
  Properties: content, parent
- Document type declaration information item
  Properties: children, parent, public identifier, system identifier
- Unparsed entity information item
  Properties: declaration base URI, name, notation, notation name, public identifier, system identifier
- Notation information item
  Properties: declaration base URI, name, public identifier, system identifier
- Namespace information item
  Properties: namespace name, prefix

If you need help understanding the meanings behind the individual information items and properties, consult the spec. There isn't enough space in this little hack to explain them all here. Applying the stylesheet infoset.xsl should help you understand better what the infoset describes.
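One way to see what "abstract" means here is to parse two documents that differ only in surface syntax and compare the items that come out. The following is a minimal Python sketch, not part of the book's file archive. The two sample documents, the example.com namespace, and the helper name element_items are made up for illustration. It uses the standard xml.etree.ElementTree module, which discards prefixes, so it compares only the namespace name, local name, attributes, and text of each element. The spec lists attribute order, the kind of quotation marks, and the choice between the two empty-element forms among the things an infoset does not record.

```python
import xml.etree.ElementTree as ET


def element_items(xml_text):
    """Return one tuple per element, in document order (illustrative helper)."""
    root = ET.fromstring(xml_text)
    items = []
    for elem in root.iter():
        # ElementTree spells an expanded name as "{namespace name}local name".
        if elem.tag.startswith("{"):
            namespace_name, local_name = elem.tag[1:].split("}", 1)
        else:
            namespace_name, local_name = None, elem.tag
        # [attributes] is an unordered set, so compare it as a frozenset.
        attributes = frozenset(elem.attrib.items())
        items.append((namespace_name, local_name, attributes, elem.text or ""))
    return items


# Two hypothetical documents: different quotes, attribute order,
# whitespace inside the start-tag, and empty-element form.
DOC_A = '<t:time xmlns:t="http://example.com/ns" zone="PST" kind="demo"><t:atomic/></t:time>'
DOC_B = "<t:time  kind='demo' zone='PST' xmlns:t='http://example.com/ns'><t:atomic></t:atomic></t:time>"

same = element_items(DOC_A) == element_items(DOC_B)
print(same)
```

Because the comparison ignores the prefix, two documents that bind different prefixes to the same namespace would also compare equal here, even though the infoset does record a prefix property for elements.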
To help you understand the infoset better, the file archive includes infoset.xsl, an XSLT 2.0 stylesheet. I used XSLT 2.0 because it has more facilities for creating an infoset implementation than XSLT 1.0. infoset.xsl is only a partial implementation: it reports some of the infoset's information items and properties, not all of them.

To use the stylesheet, you need an XSLT 2.0 processor, such as Saxon 8.0 or later (http://saxon.sourceforge.net). Saxon 8.0 isn't a complete XSLT 2.0/XPath 2.0 implementation, but it's getting closer. Download and unzip Saxon, and place saxon8.jar in the working directory where you installed the archive of files that came with the book. You'll need Java Version 1.4 or later, too. You can apply this stylesheet to any XML document, as demonstrated here:

    java -jar saxon8.jar prefix.xml infoset.xsl

Your results will be as follows:

    Comment information item (1)
    [content]: a time instant
    [parent]: /

    Document information item
    [document element]: time
    [base URI]: file:/C:/Hacks/examples/115959p.m.

    Element information item (document element)
    [namespace]: http://www.wyeast.net/time
    [local name]: time
    [prefix]: tz
    [children]:
    [attributes]: timezone
    [base URI]: file:/C:/Hacks/examples/115959p.m.

    Element information item (1)
    [namespace]: http://www.wyeast.net/time
    [local name]: hour
    [prefix]: tz
    [children]: 11
    [attributes]:
    [parent]: tz:time
    [base URI]: file:/C:/Hacks/examples/11

    Element information item (2)
    [namespace]: http://www.wyeast.net/time
    [local name]: minute
    [prefix]: tz
    [children]: 59
    [attributes]:
    [parent]: tz:time
    [base URI]: file:/C:/Hacks/examples/59

    Element information item (3)
    [namespace]: http://www.wyeast.net/time
    [local name]: second
    [prefix]: tz
    [children]: 59
    [attributes]:
    [parent]: tz:time
    [base URI]: file:/C:/Hacks/examples/59

    Element information item (4)
    [namespace]: http://www.wyeast.net/time
    [local name]: meridiem
    [prefix]: tz
    [children]: p.m.
    [attributes]:
    [parent]: tz:time
    [base URI]: file:/C:/Hacks/examples/p.m.

    Element information item (5)
    [namespace]: http://www.wyeast.net/time
    [local name]: atomic
    [prefix]: tz
    [children]:
    [attributes]: signal
    [parent]: tz:time
    [base URI]: file:/C:/Hacks/examples/
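If you don't have an XSLT 2.0 processor handy, a rough analogue of this kind of report can be sketched in Python with the standard xml.dom.minidom module. This is not a port of infoset.xsl, and it covers only comment and element information items. The SAMPLE document is a hypothetical stand-in for prefix.xml: its element names and namespace are taken from the output above, but its attribute values are invented, and the function name report is made up for this sketch. The labels follow the spec's property names, so [namespace name] appears where the stylesheet prints [namespace].

```python
from xml.dom import minidom

# Namespace that DOM parsers assign to xmlns declarations. The infoset keeps
# those under [namespace attributes], separate from [attributes].
XMLNS_NS = "http://www.w3.org/2000/xmlns/"

# Hypothetical stand-in for prefix.xml; the attribute values are invented.
SAMPLE = """<?xml version="1.0"?>
<!-- a time instant -->
<tz:time xmlns:tz="http://www.wyeast.net/time" timezone="PST">
 <tz:hour>11</tz:hour>
 <tz:atomic signal="true"/>
</tz:time>"""


def report(xml_text):
    """Return report lines for the comment and element information items."""
    doc = minidom.parseString(xml_text)
    lines = []

    def visit(node):
        if node.nodeType == node.COMMENT_NODE:
            lines.append("Comment information item")
            lines.append("[content]: " + node.data.strip())
        elif node.nodeType == node.ELEMENT_NODE:
            # Skip namespace declarations so only ordinary attributes remain.
            names = [attr.localName for attr in node.attributes.values()
                     if attr.namespaceURI != XMLNS_NS]
            lines.append("Element information item")
            lines.append("[namespace name]: " + str(node.namespaceURI))
            lines.append("[local name]: " + node.localName)
            lines.append("[prefix]: " + str(node.prefix))
            lines.append("[attributes]: " + " ".join(names))
        # Text and comment nodes have no children, so this loop ends there.
        for child in node.childNodes:
            visit(child)

    visit(doc)
    return lines


print("\n".join(report(SAMPLE)))
```

The sketch walks the DOM tree in document order, which is why the comment is reported before the document element, as in the stylesheet's output.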