XML in a Nutshell, Third Edition

     

In HTML, comments are sometimes abused to support nonstandard extensions. For instance, the contents of the script element are sometimes enclosed in a comment to keep them from being displayed by a browser that does not understand scripts. The Apache web server parses comments in .shtml files to recognize server-side includes. Unfortunately, these documents may not survive being passed through various HTML editors and processors with their comments and associated semantics intact. Worse yet, it's possible for an innocent comment to be misconstrued as input to the application.

XML provides the processing instruction as an alternative means of passing information to particular applications that may read the document. A processing instruction begins with <? and ends with ?>. Immediately following the <? is an XML name called the target, possibly the name of the application for which this processing instruction is intended or possibly just an identifier for this particular processing instruction. The rest of the processing instruction contains text in a format appropriate for the applications for which the instruction is intended.
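As a rough illustration of this structure, the following Python sketch builds a processing instruction with the standard library's xml.dom.minidom module and serializes it. The target my-app and the data mode="draft" are hypothetical names chosen only for this example; they do not come from any real application.

```python
from xml.dom import minidom


def make_processing_instruction(target, data):
    """Return the serialized form of a processing instruction.

    target: the XML name that follows "<?"
    data:   the free-form text for the receiving application
    """
    doc = minidom.Document()
    pi = doc.createProcessingInstruction(target, data)
    return pi.toxml()


if __name__ == "__main__":
    # "my-app" and its data are made up for illustration only.
    print(make_processing_instruction("my-app", 'mode="draft"'))
```

The serialized result starts with <?, followed by the target, then the data, and closes with ?>.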

For example, in HTML, a robots META tag is used to tell search-engine and other robots whether and how they should index a page. The following processing instruction has been proposed as an equivalent for XML documents:

<?robots index="yes" follow="no"?>

The target of this processing instruction is robots. Its syntax is two pseudo-attributes, one named index and one named follow, each of which has the value yes or no. The semantics are that if the index pseudo-attribute has the value yes, then search-engine robots should index this page; if it has the value no, they should not. Similarly, if follow has the value yes, then robots will follow links from this document; if it has the value no, they won't.
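Because the XML parser hands the content of a processing instruction to the application as plain text, the application has to pick the pseudo-attributes apart itself. The following Python sketch shows one way a robot might do that with xml.dom.minidom and a regular expression. The function name robots_policy and the regular expression are assumptions made for this example, not part of the robots proposal, and the pattern handles only simple name="value" pairs.

```python
import re
from xml.dom import minidom

# Simplified pattern for name="value" or name='value' pairs.
# This is an illustrative approximation, not a full XML attribute grammar.
PSEUDO_ATTR = re.compile(r"""([\w.:-]+)\s*=\s*(?:"([^"]*)"|'([^']*)')""")


def robots_policy(xml_text):
    """Return the pseudo-attributes of the first top-level robots
    processing instruction as a dict, or None if there is none."""
    doc = minidom.parseString(xml_text)
    for node in doc.childNodes:
        if (node.nodeType == node.PROCESSING_INSTRUCTION_NODE
                and node.target == "robots"):
            attrs = {}
            for match in PSEUDO_ATTR.finditer(node.data):
                name, double_quoted, single_quoted = match.groups()
                if double_quoted is not None:
                    attrs[name] = double_quoted
                else:
                    attrs[name] = single_quoted
            return attrs
    return None


if __name__ == "__main__":
    sample = '<?robots index="yes" follow="no"?><person>Alan Turing</person>'
    policy = robots_policy(sample)
    print(policy["index"], policy["follow"])
```

The parser itself attaches no meaning to index or follow; only a program that recognizes the robots target knows to interpret them this way.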

Other processing instructions may have totally different syntaxes and semantics. For instance, processing instructions can contain an effectively unlimited amount of text. PHP includes large programs in processing instructions. For example:

<?php
  mysql_connect("database.unc.edu", "clerk", "password");
  $result = mysql("HR", "SELECT LastName, FirstName FROM Employees ORDER BY LastName, FirstName");
  $i = 0;
  while ($i < mysql_numrows($result)) {
    $fields = mysql_fetch_row($result);
    echo "<person>$fields[1] $fields[0] </person>\r\n";
    $i++;
  }
  mysql_close();
?>

Processing instructions are markup, but they're not elements. Consequently, like comments, processing instructions may appear anywhere in an XML document outside of a tag, including before or after the root element. The most common processing instruction, xml-stylesheet, is used to attach stylesheets to documents. It always appears before the root element, as Example 2-6 demonstrates. In this example, the xml-stylesheet processing instruction tells browsers to apply the CSS stylesheet person.css to this document before showing it to the reader.

Example 2-6. An XML document with a processing instruction in its prolog

<?xml-stylesheet href="person.css" type="text/css"?>
<person>
  Alan Turing
</person>
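To see that processing instructions can sit before, inside, or after the root element, the following Python sketch walks a parsed document with xml.dom.minidom and records where each one was found. The function name list_processing_instructions and the note and trailer targets in the usage example are hypothetical, introduced only for illustration.

```python
from xml.dom import minidom


def list_processing_instructions(xml_text):
    """Return (target, data, inside_root) tuples in document order.

    inside_root is False for instructions in the prolog or after the
    root element, and True for those nested inside the root element.
    """
    doc = minidom.parseString(xml_text)
    found = []

    def walk(node, inside_root):
        for child in node.childNodes:
            if child.nodeType == child.PROCESSING_INSTRUCTION_NODE:
                found.append((child.target, child.data, inside_root))
            elif child.nodeType == child.ELEMENT_NODE:
                walk(child, True)

    walk(doc, False)
    return found


if __name__ == "__main__":
    sample = (
        '<?xml-stylesheet href="person.css" type="text/css"?>'
        "<person>Alan <?note check spelling?>Turing</person>"
        "<?trailer done?>"
    )
    for target, data, inside_root in list_processing_instructions(sample):
        print(target, "|", data, "|", "inside root" if inside_root else "outside root")
```

In Example 2-6 only the first of these positions is used: the xml-stylesheet instruction appears in the prolog, outside the root element.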

The processing instruction names xml, XML, XmL, etc., in any combination of case, are forbidden in order to avoid confusion with the XML declaration. Otherwise, you're free to pick any legal XML name for your processing instructions.
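A program that generates processing instructions may want to check a proposed target against this rule before writing it out. The following Python sketch is one way to do so. The function name is_allowed_target is hypothetical, and the name pattern is a deliberately simplified, ASCII-only approximation of the XML name rules, which in full admit many non-ASCII characters.

```python
import re

# Simplified, ASCII-only approximation of an XML name (an assumption
# for this sketch; the real grammar allows many more characters).
ASCII_XML_NAME = re.compile(r"[A-Za-z_:][A-Za-z0-9_.:-]*")

# The three letters x, m, l in any combination of case.
RESERVED_TARGET = re.compile(r"[Xx][Mm][Ll]")


def is_allowed_target(target):
    """Return True if target looks like a usable processing
    instruction target under the simplified rules above."""
    if RESERVED_TARGET.fullmatch(target):
        return False
    return ASCII_XML_NAME.fullmatch(target) is not None


if __name__ == "__main__":
    for candidate in ["robots", "xml-stylesheet", "xml", "XmL", "2fast"]:
        print(candidate, is_allowed_target(candidate))
```

Note that only the exact three-letter name is excluded here, which is why xml-stylesheet passes the check even though it begins with the same letters.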
