XML: A Manager's Guide (2nd Edition) (Addison-Wesley Information Technology Series)
The three standards covered in the previous section enable you to select relatively small parts of documents in various ways. In some cases, you may want to take this manipulation a step further by completely reorganizing a document. This type of transformation requires selecting different parts of a document and rearranging them. XSL Transformations (XSLT) extends the XPath model of how to address parts of documents with more sophisticated operators that enable developers to specify these rearrangements.
You may be wondering why, as the name implies, XSLT is associated with XSL. Originally, people saw XSL as a generic way to display XML documents for all presentation technologies. Accomplishing this goal naturally required two different types of features: (1) features for rearranging document content so that it made the most sense for display and (2) features for attaching display properties to the content. However, rearranging content turned out to be useful for other purposes, and the appropriate display properties turned out to be a topic of extensive debate. Because people wanted to use the rearranging features and agreement on these features was far ahead of that for display properties, the two standards separated and XSLT achieved W3C Recommendation status ahead of XSL.
One of the greatest demands for XSLT came from trying to avoid the meta-incompatibility problem, where two different organizations use two different so-called standards. XML enhances compatibility by making it possible to exchange information using a standard syntax and to define standard formats that constrain the structure of this information. But what if two different groups define different formats for the same logical type of document? That leads to meta-incompatibility and big headaches for managers who are developing applications that must exchange documents with both groups. In some sense, this is the opposite problem from the one solved by Namespaces.
Namespaces solve the problem of groups calling different concepts by the same name. XSLT solves the problem of groups calling the same concept by different names. Such a scenario is highly likely in applications such as supply chain management, where two companies want to exchange conceptually the same information but already have formats that they use internally. Also, industry groups in finance, telecommunications, and transportation have defined formats for transactions in those industries. In some cases, multiple industry groups are working on the same problem, creating the potential for dueling standards. Moreover, with the integration of global supply chains, standards for related industries such as manufacturing and shipping may need to ensure compatibility where they overlap.
The idea behind XSLT is to define a scripting language (using XML syntax, of course) that enables developers to transform one format into another. Wherever two formats overlapped, developers would create a transform that extracts the overlapping information from one format and rearranges it into the other format. These transforms are directional; to rearrange documents in both directions, you would need two separate transforms.
How It Works
Consider the basic problem of automatically placing an order with a trading partner over the Internet. Foo Company has defined a Foo Company Order DTD that it uses internally. Bar Corp has defined a Bar Corp Order DTD that it uses internally. Now Foo Company wants to place orders automatically with Bar Corp. When Foo Company creates a Foo Company Order Document, it is valid with respect to the Foo Company Order DTD. However, for Bar Corp to accept the order, the order document must be valid with respect to the Bar Corp Order DTD. To achieve this end, Foo Company creates a Foo Bar Transformation document that specifies how to translate a Foo Company Order Document into a Bar Corp Order Document, as shown in Figure 3-3.
Figure 3-3. Translating Order Formats with XSLT
To see how XSLT works, let's consider a simple example. Examples 3-9a and 3-9b show parts of order documents in two different formats. Example 3-9a models currency information as an attribute on the "Order" element. Example 3-9b models currency information as a child element of the "Order" element. The choice of modeling information as an attribute or child element is an arbitrary one, so it is likely that two different DTDs for the same concept would choose different modeling techniques for at least one piece of information.
Example 3-9a
    <Order currency="USD">
      ...
    </Order>
Example 3-9b
    <Order>
      <Currency>USD</Currency>
      ...
    </Order>
Example 3-10 is the XSLT code for transforming the attribute model of Example 3-9a to the child element model of Example 3-9b. Note the use of the XSL namespace to denote the elements using XSL-specific constructs. The transformation document selects the "Order" element in the source document. It then begins a new "Order" element with a new "Currency" child element in the translated document. It inserts the value of the "currency" attribute of the selected "Order" element in the source document as the element content of the "Currency" element in the translated document.
Example 3-10
    <xsl:stylesheet version="1.0"
        xmlns:xsl="http://www.w3.org/1999/XSL/Transform">
      <!-- Start at the root of the source document -->
      <xsl:template match="/">
        <!-- Select the "Order" element in the source document -->
        <xsl:for-each select="Order">
          <Order>
            <Currency>
              <!-- Insert the value of the source "currency" attribute -->
              <xsl:value-of select="@currency"/>
            </Currency>
          </Order>
        </xsl:for-each>
      </xsl:template>
    </xsl:stylesheet>
As you can see, XSLT is itself XML. Every XSLT document is also an XML document. Therefore, all the tools for creating and managing XML documents work with XSLT documents. Of course, they do not necessarily understand the specific XSLT syntax, but they do provide some leverage.
Although this example is motivated by valid documents that use different DTDs, XSLT scripts also work with well-formed ones. However, people often use XSLT when they have many documents in one data format that they want translated to another data format, which is precisely the situation in which they use valid documents.
At the time of this writing, the W3C had commenced work on XSLT 2.0. The work was in its beginning stages, focusing on the requirements for the new version. Given the work on XML Schema and XPath 2.0 since XSLT 1.0 appeared as a Recommendation, one of the primary requirements was compatibility with the rest of the XML standards family. The rest of the proposed requirements mostly revolved around making XSLT easier to use. Since the release of XSLT 1.0, a great deal has been revealed about what people want to do with XSLT and what they are finding they can do. So there are a number of proposals to close the most glaring gaps between how often a type of operation is needed and how difficult it is to perform.
Practical Usage
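Because transforms are directional, a two-way exchange between the two order formats in the How It Works examples would need a second stylesheet alongside Example 3-10. The following is a minimal sketch of that reverse direction, assuming a source document shaped like Example 3-9b. It is an illustration, not something the original examples define. Like Example 3-10, it carries over only the currency, and the call to normalize-space() to trim stray whitespace is an added assumption.

```xml
<?xml version="1.0"?>
<xsl:stylesheet version="1.0"
    xmlns:xsl="http://www.w3.org/1999/XSL/Transform">
  <xsl:output method="xml" omit-xml-declaration="yes"/>

  <!-- Start at the root of the source document -->
  <xsl:template match="/">
    <!-- Select the "Order" element of the child element model -->
    <xsl:for-each select="Order">
      <!-- The curly braces are an attribute value template: the
           XPath expression inside them is evaluated against the
           selected "Order" element and its result becomes the
           attribute value. Other order content is not copied. -->
      <Order currency="{normalize-space(Currency)}"/>
    </xsl:for-each>
  </xsl:template>
</xsl:stylesheet>
```

Applied to a document like Example 3-9b, this sketch should produce an "Order" element whose "currency" attribute is USD. Any other order content would need its own rules in both stylesheets.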
XSLT is becoming an increasingly important part of the XML family of specifications. XML provides a general grammar for defining data formats, but as data moves through an organization, different consumers of the data will probably want it in different formats. XSLT provides the mechanism for supporting this customized data flow. It fundamentally alters the issue of information exchange from defining common data formats for all applications to defining the transformations necessary to deliver data to each application in the format it desires.
Another common use for XSLT is transforming data documents into presentation documents. There are a number of presentation formats based on tagged markup. HTML is by far the most widely used, but there is also Wireless Markup Language (WML) for small devices and VoiceXML for voice-driven interfaces. The W3C has redefined HTML in terms of XML with its XHTML initiative, while WML and VoiceXML are already defined in terms of XML. Therefore, it is fairly straightforward to create XSLT transforms that take a data-oriented format like an Order and rearrange it into a presentation-oriented format like XHTML. It's even possible to create pages that incorporate advanced features like JavaScript or VBScript.
Figure 3-4 shows the typical architecture of an XSLT-driven Web site. Note that the XSLT transformation from XML data into XHTML presentation takes place on the server rather than the client. While this approach violates the vision of users specifying their own presentation, it is much more convenient to do transforms on the server with the current Internet software infrastructure.
Some projects have even combined XSLT with Cascading Style Sheets (CSS). CSS enables authors to specify detailed screen presentation properties for HTML elements. One drawback of CSS is that it can only add formatting information to a document; it cannot rearrange the document information to make more sense to the user.
In this sense, XSLT and CSS complement each other. The application reorganizes the data-oriented XML with XSLT to suit the user requirements and then applies CSS to get the final presentation-oriented document.
Figure 3-4. Driving Web Site Display with XSLT
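As a rough sketch of the server-side approach described above, the stylesheet below rearranges an order in the child element format of Example 3-9b into a small XHTML page and links a CSS file to supply the display properties. The page layout, the heading text, and the file name order.css are hypothetical choices made for illustration only.

```xml
<?xml version="1.0"?>
<xsl:stylesheet version="1.0"
    xmlns:xsl="http://www.w3.org/1999/XSL/Transform"
    xmlns="http://www.w3.org/1999/xhtml">
  <xsl:output method="xml" indent="yes"/>

  <!-- Match the data-oriented "Order" document element -->
  <xsl:template match="/Order">
    <!-- Unprefixed elements below are written to the result in
         the XHTML namespace declared on the stylesheet element -->
    <html>
      <head>
        <title>Order Summary</title>
        <!-- order.css is a hypothetical file holding the
             presentation properties; XSLT only rearranges -->
        <link rel="stylesheet" type="text/css" href="order.css"/>
      </head>
      <body>
        <h1>Order Summary</h1>
        <!-- Pull the currency out of the source document -->
        <p>Currency: <xsl:value-of select="Currency"/></p>
      </body>
    </html>
  </xsl:template>
</xsl:stylesheet>
```

The division of labor follows the text: the stylesheet decides which data appears and where, while the linked CSS file decides how it looks.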
There is some backlash against using XSLT. Some experts observe that XSLT is evolving into a complete programming language even though existing programming languages do almost as good a job of transforming XML documents. They believe that the small benefit offered by XSLT is overwhelmed by the cost of forcing developers to learn yet another special-purpose programming language. Finally, they feel the performance of XSLT is dismal when compared to the same operations in traditional programming languages.
Others argue that because of the complexity of the low-level interfaces, using existing programming languages requires a high degree of skill that many Web developers do not have. They point to the success of JavaScript and to HTML itself as evidence that special-purpose scripting languages improve the reach of information exchange technologies. Last, they point out that people also complained about Java's initial performance and that the performance of XSLT will also certainly improve as it matures.
Given this debate, it's probably worth encouraging your architects and lead developers to examine carefully whether XSLT is right for a particular project.