XSLT For Dummies
Chapter 15 - Namespaces Revisited
by Richard Wagner
Hungry Minds 2002
You can also work with multiple namespaces within the source and result documents. A Microsoft Word 2000 document provides a practical example of how you can use multiple namespaces within a single document. Word 2000 can save documents as HTML so that you can easily view the text in a Web browser. But Word 2000 preserves Word-specific information by embedding XML and other proprietary markup inside the Web document. This technique enables you to view the document as an HTML page but still reopen it in Word, retaining all the additional formatting and settings that Word needs. Word uses multiple namespaces to handle different parts of the document data; the ones used in this example are the o namespace (urn:schemas-microsoft-com:office:office) for Office document properties, the w namespace (urn:schemas-microsoft-com:office:word) for Word-specific settings, and the default HTML namespace (http://www.w3.org/TR/REC-html40) for the page content.
Knowing this, I can create a Word document from an XML structure by using XSLT to transform basic XML code into a format that Word can understand and process. In Listing 15-2, I use the following XML version of a letter as the source document.
Listing 15-2: annchovie.xml
<?xml version="1.0"?>
<!-- annchovie.xml -->
<letter xmlns:o="urn:schemas-microsoft-com:office:office"
        xmlns:w="urn:schemas-microsoft-com:office:word">
  <o:properties>
    <o:Author>Ann Chovie</o:Author>
    <o:Revision>2</o:Revision>
    <o:Company>Fisher Brothers</o:Company>
  </o:properties>
  <w:properties>
    <w:View>Print</w:View>
    <w:Zoom>150</w:Zoom>
    <w:DoNotOptimizeForBrowser/>
  </w:properties>
  <lettertext>
    <para style="address">Ann Chovie</para>
    <para style="address">233 Phish Lane</para>
    <para style="address">Guppie Hill, VT 12032</para>
    <para style="default">March 3, 2002</para>
    <para style="default">Dear Editor, </para>
    <para style="default">I am canceling my subscription to <italic>Goldfish Monthly</italic> due to your recent article on the <bold><span style='color:red'>Goldfish of the Year</span></bold>. While <span style='font-variant:small-caps'>Billy the Georgian Goldfish</span> may be worthy of some sort of reward, he cannot compete with the likes of <span style='font-variant:small-caps'>Jumping Jack</span> from Jacksonville.</para>
    <para style="default">I am concerned that this contest was not fair. Specifically, in your cover photo, I noticed the pebbles at the bottom of the goldfish bowl spelled out the word <italic><span style='color:red'>w-a-t-e-r-g-a-t-e</span></italic>, perhaps alluding to some sort of cover-up at your magazine. What's more, Billy's fish face has a haunting resemblance to Richard Nixon. This whole thing is starting to smell fishy to me, so I am demanding a full investigation.</para>
    <para style="default">By the way, I did like your recent article on gourmet guppy food. My fish have not been happier since trying the guppuccino!</para>
    <para style="default">Sincerely,</para>
    <para style="default">Ann Chovie</para>
  </lettertext>
</letter>
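As a quick check on the listing above, the following sketch prints each element's qualified name next to the namespace URI it belongs to. It is not part of the chapter's stylesheet; the file name list-namespaces.xsl is hypothetical, and the only assumption is an XSLT 1.0 processor run against annchovie.xml.

```xml
<?xml version="1.0"?>
<!-- list-namespaces.xsl (hypothetical file name, for illustration only) -->
<xsl:stylesheet version="1.0"
    xmlns:xsl="http://www.w3.org/1999/XSL/Transform">
  <xsl:output method="text"/>

  <!-- Visit every element in document order and print
       its qualified name followed by its namespace URI -->
  <xsl:template match="/">
    <xsl:for-each select="//*">
      <xsl:value-of select="name()"/>
      <xsl:text>: </xsl:text>
      <xsl:value-of select="namespace-uri()"/>
      <xsl:text>&#10;</xsl:text>
    </xsl:for-each>
  </xsl:template>
</xsl:stylesheet>
```

Elements such as o:Author should be reported with the urn:schemas-microsoft-com:office:office URI, while letter, lettertext, and para should show an empty URI because they are in no namespace.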
In this source document, the letter element declares the o and w namespaces. These are the same namespaces that Word uses, so I just need to carry them over into the result document. In my XSLT stylesheet, I add the matching namespace declarations, along with a default HTML namespace, to the xsl:stylesheet element:
<xsl:stylesheet version="1.0"
    xmlns:xsl="http://www.w3.org/1999/XSL/Transform"
    xmlns:o="urn:schemas-microsoft-com:office:office"
    xmlns:w="urn:schemas-microsoft-com:office:word"
    xmlns="http://www.w3.org/TR/REC-html40">

The o:properties element contains the Office document properties that Word uses. However, Word looks for a container element named o:DocumentProperties, so I need to rename this element during the transformation and wrap it in an xml element. The remaining elements in the o: namespace can simply be copied to the result document as is. The following two template rules perform these actions:
<!-- Add DocumentProperties container, changing element name -->
<xsl:template match="o:properties">
  <xml>
    <o:DocumentProperties>
      <xsl:apply-templates/>
    </o:DocumentProperties>
  </xml>
</xsl:template>

<!-- Add individual DocumentProperties -->
<xsl:template match="o:*">
  <xsl:copy>
    <xsl:apply-templates select="@*|node()"/>
  </xsl:copy>
</xsl:template>

Tip: You can't create a template rule to match a namespace alone, because you can't use an XPath pattern to match namespace nodes. So, when you wish to search for specific namespaces, you must do it in combination with an element pattern.

Similarly, I need to rename the w:properties element to w:WordDocument, wrap it in an xml element, and copy its child elements over to the result document:

<!-- Add WordDocument container, changing element name -->
<xsl:template match="w:properties">
  <xml>
    <w:WordDocument>
      <xsl:apply-templates/>
    </w:WordDocument>
  </xml>
</xsl:template>

<!-- Add individual WordDocument properties -->
<xsl:template match="w:*">
  <xsl:copy>
    <xsl:apply-templates select="@*|node()"/>
  </xsl:copy>
</xsl:template>

Before adding the text of the letter, I need to construct the head section of the HTML document. The document header contains the o: and w: XML elements, as well as some style-related information I'm adding on the fly. To determine the point where I need to insert the o: and w: XML sections into the result document, I create a template rule for the letter element and use <xsl:apply-templates select="w:*"/> to apply templates to all the child elements in the w: namespace, and <xsl:apply-templates select="o:*"/> to do the same for all the child elements in the o: namespace.
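To illustrate how a prefix:* name test picks out elements by namespace, here is a minimal sketch that counts the children of letter in each namespace. It is a standalone stylesheet written for illustration, not part of annchovie.xsl, and it assumes annchovie.xml from Listing 15-2 as input. The namespace-uri() predicate in the last count is one way to select elements that are in no namespace at all.

```xml
<?xml version="1.0"?>
<!-- Illustrative sketch only; not part of annchovie.xsl -->
<xsl:stylesheet version="1.0"
    xmlns:xsl="http://www.w3.org/1999/XSL/Transform"
    xmlns:o="urn:schemas-microsoft-com:office:office"
    xmlns:w="urn:schemas-microsoft-com:office:word">
  <xsl:output method="text"/>

  <xsl:template match="/letter">
    <!-- w:* selects child elements in the Word namespace -->
    <xsl:text>Word elements: </xsl:text>
    <xsl:value-of select="count(w:*)"/>
    <!-- o:* selects child elements in the Office namespace -->
    <xsl:text>&#10;Office elements: </xsl:text>
    <xsl:value-of select="count(o:*)"/>
    <!-- An empty namespace URI means the element is in no namespace -->
    <xsl:text>&#10;No-namespace elements: </xsl:text>
    <xsl:value-of select="count(*[namespace-uri()=''])"/>
    <xsl:text>&#10;</xsl:text>
  </xsl:template>
</xsl:stylesheet>
```

Run against annchovie.xml, this should report one element in each group (w:properties, o:properties, and lettertext). The name tests match on the namespace URI bound to the prefix in the stylesheet, not on the prefix text itself.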
After adding style information, I close the head element and then use xsl:apply-templates on the lettertext element:

<!-- Letter element, add header information -->
<xsl:template match="letter">
  <head>
    <title>Ann Chovie</title>
    <xsl:apply-templates select="w:*"/>
    <xsl:apply-templates select="o:*"/>
    <style>
      p.MsoNormal, li.MsoNormal, div.MsoNormal {
        mso-style-parent:"";
        margin-top:0in; margin-right:0in; margin-bottom:12.0pt; margin-left:0in;
        mso-pagination:widow-orphan;
        font-size:12.0pt;
        font-family:"Times New Roman";
        mso-fareast-font-family:"Times New Roman";}
      p.Address, li.Address, div.Address {
        mso-style-name:Address;
        margin-top:0in; margin-right:0in; margin-bottom:12.0pt; margin-left:0in;
        mso-pagination:widow-orphan;
        font-size:12.0pt;
        font-family:"Times New Roman";
        mso-fareast-font-family:"Times New Roman";}
      @page Section1 {
        size:8.5in 11.0in;
        margin:1.0in 1.25in 1.0in 1.25in;
        mso-header-margin:.5in;
        mso-footer-margin:.5in;
        mso-paper-source:0;}
      div.Section1 {page:Section1;}
    </style>
  </head>
  <xsl:apply-templates select="lettertext"/>
</xsl:template>

The lettertext element contains the actual body of the letter and so needs to be surrounded by an HTML body element, as shown in the following template rule:

<!-- Lettertext, convert to body -->
<xsl:template match="lettertext">
  <body lang="EN-US" style='tab-interval:.5in'>
    <div class="Section1">
      <xsl:apply-templates/>
    </div>
  </body>
</xsl:template>

The original source includes two types of para elements, differentiated by the style attribute. In the stylesheet, address type paragraphs are converted to p elements, given a class="Address" attribute, and provided with special formatting rules, such as right alignment and no margin spacing. Normal paragraphs are also changed to p elements but given the class="MsoNormal" attribute:
<!-- Para, apply style for address and normal -->
<xsl:template match="para">
  <!-- Address -->
  <xsl:if test="@style='address'">
    <p class="Address" align="right">
      <xsl:apply-templates/>
    </p>
  </xsl:if>
  <!-- Default -->
  <xsl:if test="@style='default'">
    <p class="MsoNormal">
      <xsl:apply-templates/>
    </p>
  </xsl:if>
</xsl:template>

Custom formatting tags like bold and italic are transformed into the HTML-friendly formatting elements used by Word. The span elements, however, are simply carried over as is to the result document:

<!-- Italic, change element name -->
<xsl:template match="italic">
  <i><xsl:apply-templates/></i>
</xsl:template>

<!-- Bold, change element name -->
<xsl:template match="bold">
  <b><xsl:apply-templates/></b>
</xsl:template>

<!-- Span, copy as is -->
<xsl:template match="span">
  <xsl:copy-of select="."/>
</xsl:template>

The complete stylesheet is shown here:
<!-- annchovie.xsl -->
<xsl:stylesheet version="1.0"
    xmlns:xsl="http://www.w3.org/1999/XSL/Transform"
    xmlns:o="urn:schemas-microsoft-com:office:office"
    xmlns:w="urn:schemas-microsoft-com:office:word"
    xmlns="http://www.w3.org/TR/REC-html40">

  <xsl:output method="html"/>

  <!-- Add DocumentProperties container, changing element name -->
  <xsl:template match="o:properties">
    <xml>
      <o:DocumentProperties>
        <xsl:apply-templates/>
      </o:DocumentProperties>
    </xml>
  </xsl:template>

  <!-- Add individual DocumentProperties -->
  <xsl:template match="o:*">
    <xsl:copy>
      <xsl:apply-templates select="@*|node()"/>
    </xsl:copy>
  </xsl:template>

  <!-- Add WordDocument container, changing element name -->
  <xsl:template match="w:properties">
    <xml>
      <w:WordDocument>
        <xsl:apply-templates/>
      </w:WordDocument>
    </xml>
  </xsl:template>

  <!-- Add individual WordDocument properties -->
  <xsl:template match="w:*">
    <xsl:copy>
      <xsl:apply-templates select="@*|node()"/>
    </xsl:copy>
  </xsl:template>

  <!-- Letter element, add header information -->
  <xsl:template match="letter">
    <head>
      <title>Ann Chovie</title>
      <xsl:apply-templates select="w:*"/>
      <xsl:apply-templates select="o:*"/>
      <style>
        p.MsoNormal, li.MsoNormal, div.MsoNormal {
          mso-style-parent:"";
          margin-top:0in; margin-right:0in; margin-bottom:12.0pt; margin-left:0in;
          mso-pagination:widow-orphan;
          font-size:12.0pt;
          font-family:"Times New Roman";
          mso-fareast-font-family:"Times New Roman";}
        p.Address, li.Address, div.Address {
          mso-style-name:Address;
          margin-top:0in; margin-right:0in; margin-bottom:12.0pt; margin-left:0in;
          mso-pagination:widow-orphan;
          font-size:12.0pt;
          font-family:"Times New Roman";
          mso-fareast-font-family:"Times New Roman";}
        @page Section1 {
          size:8.5in 11.0in;
          margin:1.0in 1.25in 1.0in 1.25in;
          mso-header-margin:.5in;
          mso-footer-margin:.5in;
          mso-paper-source:0;}
        div.Section1 {page:Section1;}
      </style>
    </head>
    <xsl:apply-templates select="lettertext"/>
  </xsl:template>

  <!-- Lettertext, convert to body -->
  <xsl:template match="lettertext">
    <body lang="EN-US" style='tab-interval:.5in'>
      <div class="Section1">
        <xsl:apply-templates/>
      </div>
    </body>
  </xsl:template>

  <!-- Para, apply style for address and normal -->
  <xsl:template match="para">
    <!-- Address -->
    <xsl:if test="@style='address'">
      <p class="Address" align="right">
        <xsl:apply-templates/>
      </p>
    </xsl:if>
    <!-- Default -->
    <xsl:if test="@style='default'">
      <p class="MsoNormal">
        <xsl:apply-templates/>
      </p>
    </xsl:if>
  </xsl:template>

  <!-- Italic, change element name -->
  <xsl:template match="italic">
    <i><xsl:apply-templates/></i>
  </xsl:template>

  <!-- Bold, change element name -->
  <xsl:template match="bold">
    <b><xsl:apply-templates/></b>
  </xsl:template>

  <!-- Span, copy as is -->
  <xsl:template match="span">
    <xsl:copy-of select="."/>
  </xsl:template>

  <!-- Add html element at top -->
  <xsl:template match="/">
    <html>
      <xsl:apply-templates/>
    </html>
  </xsl:template>

</xsl:stylesheet>

After the stylesheet is applied to the annchovie.xml file (see Listing 15-2), the resulting document is shown here as XML:
<html xmlns="http://www.w3.org/TR/REC-html40"
      xmlns:o="urn:schemas-microsoft-com:office:office"
      xmlns:w="urn:schemas-microsoft-com:office:word">
<head>
<meta http-equiv="Content-Type" content="text/html; charset=utf-8">
<title>Ann Chovie</title>
<xml>
  <w:WordDocument>
    <w:View>Print</w:View>
    <w:Zoom>150</w:Zoom>
    <w:DoNotOptimizeForBrowser></w:DoNotOptimizeForBrowser>
  </w:WordDocument>
</xml>
<xml>
  <o:DocumentProperties>
    <o:Author>Ann Chovie</o:Author>
    <o:Revision>2</o:Revision>
    <o:Company>Fisher Brothers</o:Company>
  </o:DocumentProperties>
</xml>
<style>
  p.MsoNormal, li.MsoNormal, div.MsoNormal {
    mso-style-parent:"";
    margin-top:0in; margin-right:0in; margin-bottom:12.0pt; margin-left:0in;
    mso-pagination:widow-orphan;
    font-size:12.0pt;
    font-family:"Times New Roman";
    mso-fareast-font-family:"Times New Roman";}
  p.Address, li.Address, div.Address {
    mso-style-name:Address;
    margin-top:0in; margin-right:0in; margin-bottom:12.0pt; margin-left:0in;
    mso-pagination:widow-orphan;
    font-size:12.0pt;
    font-family:"Times New Roman";
    mso-fareast-font-family:"Times New Roman";}
  @page Section1 {
    size:8.5in 11.0in;
    margin:1.0in 1.25in 1.0in 1.25in;
    mso-header-margin:.5in;
    mso-footer-margin:.5in;
    mso-paper-source:0;}
  div.Section1 {page:Section1;}
</style>
</head>
<body lang="EN-US" style="tab-interval:.5in">
<div class="Section1">
<p class="Address" align="right">Ann Chovie</p>
<p class="Address" align="right">233 Phish Lane</p>
<p class="Address" align="right">Guppie Hill, VT 12032</p>
<p class="MsoNormal">March 3, 2002</p>
<p class="MsoNormal">Dear Editor, </p>
<p class="MsoNormal">I am canceling my subscription to <i>Goldfish Monthly</i> due to your recent article on the <b><span xmlns="" style="color:red">Goldfish of the Year</span></b>. While <span xmlns="" style="font-variant:small-caps">Billy the Georgian Goldfish</span> may be worthy of some sort of reward, he cannot compete with the likes of <span xmlns="" style="font-variant:small-caps">Jumping Jack</span> from Jacksonville.</p>
<p class="MsoNormal">I am concerned that this contest was not fair. Specifically, in your cover photo, I noticed the pebbles at the bottom of the goldfish bowl spelled out the word <i><span xmlns="" style="color:red">w-a-t-e-r-g-a-t-e</span></i>, perhaps alluding to some sort of cover-up at your magazine. What's more, Billy's fish face has a haunting resemblance to Richard Nixon. This whole thing is starting to smell fishy to me, so I am demanding a full investigation.</p>
<p class="MsoNormal">By the way, I did like your recent article on gourmet guppy food. My fish have not been happier since trying the guppuccino!</p>
<p class="MsoNormal">Sincerely,</p>
<p class="MsoNormal">Ann Chovie</p>
</div>
</body>
</html>
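The xmlns="" on each span in the result appears because xsl:copy-of copies the source span elements unchanged, and in the source they are in no namespace, whereas the surrounding result elements are in the default HTML namespace. If you would rather have the spans join that namespace, one possible approach is to rebuild them as literal result elements. The sketch below is an assumption about what might be preferable, not something the chapter prescribes, and whether Word treats the two forms differently is not tested here.

```xml
<?xml version="1.0"?>
<!-- Illustrative sketch only: an alternative to the copy-of span rule -->
<xsl:stylesheet version="1.0"
    xmlns:xsl="http://www.w3.org/1999/XSL/Transform"
    xmlns="http://www.w3.org/TR/REC-html40">
  <xsl:output method="html"/>

  <!-- A literal span written here takes the stylesheet's default (HTML)
       namespace, so the serializer has no reason to emit xmlns="" -->
  <xsl:template match="span">
    <span>
      <!-- Carry over the style attribute and any others -->
      <xsl:copy-of select="@*"/>
      <!-- Process the span's text and child elements -->
      <xsl:apply-templates/>
    </span>
  </xsl:template>
</xsl:stylesheet>
```

To try it in the full stylesheet, the match="span" rule shown here would take the place of the existing copy-of rule; the rest of annchovie.xsl stays the same.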