XML Hacks: 100 Industrial-Strength Tips and Tools

   

Transform elements into attributes and back the other way with XSLT.

You're sitting in a conference room, leaning back in your chair. Opinions are flying back and forth across the room about whether to represent the XML data from a new application in element or attribute form.

One engineer says, combing his beard with his fingers, "You don't want to use attributes at all. What if down the line you need more than one attribute with the same name? You can't do that in XML. You can only use one attribute with a given name."

"Attributes contain metadata about elements," another barks. "You don't store metadata in element content. Period. That's where the real data goes."

You rock forward in your chair. "Excuse me," you say with a chuckle, "but none of these arguments matter." The room goes silent. Your project manager's nostrils flare. "You better explain yourself," she says, taking the last swig of her spring water.

"Gladly," you say. "I've got a pair of XSLT stylesheets that can transform the data easily between element and attribute forms in seconds. Walk with me to my cubicle and I'll give you a demo."

In reference to the element-or-attribute debate, Michael Kay has wisely said: "Beginners always ask this question. Those with a little experience express their opinions passionately. Experts tell you there is no right answer" (http://lists.xml.org/archives/xml-dev/200006/msg00285.html). This hack will allow you to keep changing your mind.

3.11.1 Element-to-Attribute Conversion

XML document design does matter, and it's worthwhile to consider some questions when deciding between elements and attributes:

  • Are you dealing with data or metadata (data about data)? Elements are generally a good fit for data, and attributes are a good fit for metadata.

  • Is there a possibility of name conflicts when labeling data? If so, remember that you can have only one attribute with a given name per element.

  • Should the data be structured (i.e., does it have a logical relationship with nearby markup)? You can't use XML to structure attribute values.
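The name-conflict point is easy to check with any XML parser. Here is a minimal sketch using Python's standard xml.etree.ElementTree module; the choice of Python and the helper name is_well_formed are assumptions of this sketch, not part of the hack itself.

```python
import xml.etree.ElementTree as ET


def is_well_formed(xml_text):
    """Return True if the parser accepts xml_text, False otherwise."""
    try:
        ET.fromstring(xml_text)
        return True
    except ET.ParseError:
        return False


# Two attributes with the same name on one element: rejected by the parser.
print(is_well_formed('<time hour="11" hour="12"/>'))  # prints False

# Two child elements with the same name: accepted.
print(is_well_formed('<time><hour>11</hour><hour>12</hour></time>'))  # prints True
```

Repeating a child element is fine, while repeating an attribute name makes the document ill-formed, which is the bearded engineer's objection in a nutshell.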

It's nice to come away from a meeting like that sounding and looking like a genius. That's one reason why this book contains a hack on how to use XSLT to convert XML elements to attributes or attributes to elements. You'll recall our tried and true time.xml document:

<?xml version="1.0" encoding="UTF-8"?>

<!-- a time instant -->
<time timezone="PST">
 <hour>11</hour>
 <minute>59</minute>
 <second>59</second>
 <meridiem>p.m.</meridiem>
 <atomic signal="true"/>
</time>

Let's say you want to convert the elements hour, minute, second, and so forth into attributes. You can do it with the stylesheet elem2attr.xsl shown in Example 3-17.

Example 3-17. elem2attr.xsl

<xsl:stylesheet version="1.0" xmlns:xsl="http://www.w3.org/1999/XSL/Transform">
<xsl:output method="xml" encoding="UTF-8" indent="yes"/>

<xsl:template match="time">
 <xsl:copy>
  <xsl:copy-of select="@timezone"/>
  <xsl:for-each select="*">
   <xsl:attribute name="{name(.)}">
    <xsl:value-of select="."/>
    <xsl:value-of select="@signal"/>
   </xsl:attribute>
  </xsl:for-each>
 </xsl:copy>
</xsl:template>

</xsl:stylesheet>

This stylesheet could be adapted to match the needs of your XML data. The single template in this stylesheet, on line 4, matches the document element time (a built-in template first matches the root node, though this is not evident from the markup). The stylesheet copies the time element (line 5), and copy-of carries over the timezone attribute (line 6). Then the for-each element marches through all the child elements (select="*") of time (line 7). For each element found, an attribute is created (line 8), using the element's name as the attribute name (that's what the name() function returns). The attribute's value comes from two value-of instructions: the first retrieves the element content (line 9) and the second retrieves the value of the signal attribute (using @signal on line 10). For any given child, only one of the two yields text: atomic is empty but has a signal attribute, while the other elements have content but no signal attribute.

Apply this stylesheet to time.xml using Xalan with this command:

xalan -o attr.xml time.xml elem2attr.xsl

You can see in attr.xml that all the element names and content have been converted to attribute names and values (plus the timezone attribute has been carried over):

<?xml version="1.0" encoding="UTF-8"?>
<time timezone="PST" hour="11" minute="59" second="59" meridiem="p.m." atomic="true"/>

A little information is lost in the transformation: the signal attribute's value of true is assigned to the new attribute atomic, so the name signal no longer appears anywhere in attr.xml. The next stylesheet, attr2elem.xsl, restores it.
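If you want to cross-check the expected result without an XSLT processor at hand, the same mapping can be sketched in a few lines of Python with the standard xml.etree.ElementTree module. This is not XSLT and not part of the hack; the function name elements_to_attributes is hypothetical, and the sketch simply mirrors what elem2attr.xsl does to time.xml.

```python
import xml.etree.ElementTree as ET

# The content of time.xml, inlined so the sketch is self-contained.
TIME_XML = """<time timezone="PST">
 <hour>11</hour>
 <minute>59</minute>
 <second>59</second>
 <meridiem>p.m.</meridiem>
 <atomic signal="true"/>
</time>"""


def elements_to_attributes(root):
    """Fold each child element of root into an attribute on a copy of root."""
    result = ET.Element(root.tag)
    # Carry over timezone, like xsl:copy-of select="@timezone".
    if "timezone" in root.attrib:
        result.set("timezone", root.get("timezone"))
    for child in root:
        # Element text followed by the signal attribute's value (if any),
        # like the two xsl:value-of instructions in the stylesheet.
        value = "".join(child.itertext()) + child.get("signal", "")
        result.set(child.tag, value)
    return result


converted = elements_to_attributes(ET.fromstring(TIME_XML))
print(ET.tostring(converted, encoding="unicode"))
```

The printed element should carry the same six attributes as attr.xml, with atomic holding the value that used to belong to signal.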

3.11.2 Attribute-to-Element Conversion

Now let's take it back in the other direction with attr2elem.xsl, shown in Example 3-18.

Example 3-18. attr2elem.xsl

<xsl:stylesheet version="1.0" xmlns:xsl="http://www.w3.org/1999/XSL/Transform">
<xsl:output method="xml" encoding="UTF-8" indent="yes"/>

<xsl:template match="time">
 <xsl:comment> a time instant </xsl:comment>
 <xsl:copy>
  <xsl:copy-of select="@timezone"/>
  <!-- every attribute is handed to the templates below -->
  <xsl:apply-templates select="@*"/>
 </xsl:copy>
</xsl:template>

<xsl:template match="@*">
 <xsl:element name="{name(.)}">
  <xsl:value-of select="."/>
 </xsl:element>
</xsl:template>

<xsl:template match="@timezone"/>

<xsl:template match="@atomic">
 <xsl:element name="{name(.)}">
  <xsl:attribute name="signal"><xsl:value-of select="."/></xsl:attribute>
 </xsl:element>
</xsl:template>

</xsl:stylesheet>

As with the previous stylesheet in this hack, adapt this one to your needs. The first template matches time (line 4). A comment is created (line 5), the time element is copied into the result tree (line 6), and the timezone attribute is re-created on time (line 7). Then apply-templates selects all attributes associated with time (@* on line 9).

The template on line 13 matches all attributes and creates an element for each one found, using name(.) to supply the element name. When apply-templates on line 9 selects @timezone, the processor finds another template for the timezone attribute (line 19). Because a named match such as @timezone is more specific than the wildcard @* on line 13, it takes priority; its body is empty, so it outputs nothing. This is done because the stylesheet already re-created the timezone attribute on line 7. (The timezone attribute must be re-created before any elements are created, which is why the stylesheet deals with it explicitly rather than leaving it to the templates that follow.)

When apply-templates selects @atomic, the template on line 21 is instantiated, which creates an atomic element with a signal attribute, just as in time.xml.

Apply this to attr.xml using:

xalan -i 1 attr.xml attr2elem.xsl

and you will get the following result, which looks an awful lot like time.xml:

<?xml version="1.0" encoding="UTF-8"?>
<!-- a time instant -->
<time timezone="PST">
 <hour>11</hour>
 <minute>59</minute>
 <second>59</second>
 <meridiem>p.m.</meridiem>
 <atomic signal="true"/>
</time>
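The reverse mapping can be sketched the same way in Python's standard xml.etree.ElementTree module, as a quick check on what attr2elem.xsl is expected to produce. Again, this is an illustration rather than part of the hack, and the function name attributes_to_elements is hypothetical. One caveat: XML attaches no meaning to attribute order, so an XSLT processor is not obliged to emit the child elements in the order shown above.

```python
import xml.etree.ElementTree as ET

# The content of attr.xml, inlined so the sketch is self-contained.
ATTR_XML = (
    '<time timezone="PST" hour="11" minute="59" second="59" '
    'meridiem="p.m." atomic="true"/>'
)


def attributes_to_elements(root):
    """Turn each attribute of root into a child element on a copy of root."""
    result = ET.Element(root.tag)
    for name, value in root.attrib.items():
        if name == "timezone":
            # timezone stays an attribute, like the empty @timezone template.
            result.set(name, value)
        elif name == "atomic":
            # atomic becomes an empty element with a signal attribute.
            ET.SubElement(result, name, {"signal": value})
        else:
            # Every other attribute becomes an element with text content.
            ET.SubElement(result, name).text = value
    return result


restored = attributes_to_elements(ET.fromstring(ATTR_XML))
print(ET.tostring(restored, encoding="unicode"))
```

Unlike the stylesheet, this sketch does not re-create the comment that precedes the time element.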

3.11.3 See Also

  • Sal Mangano's XSLT Cookbook (O'Reilly), pages 202-206, which inspired this hack

  • Peter Flynn's XML FAQ, "Which should I use in my DTD, attributes or elements?": http://www.ucc.ie/xml/#attriborelem

  • Robin Cover's CoverPages "Using Elements and Attributes": http://xml.coverpages.org/elementsAndAttrs.html

  • Uche Ogbuji's IBM developerWorks article "When to Use Elements Versus Attributes": http://www-106.ibm.com/developerworks/xml/library/x-eleatt.html
