XML and SQL Server 2000
The final mode we'll cover is EXPLICIT mode, which enables us to explicitly specify how we want the element nesting to appear in the XML output document. I mentioned in the beginning of this chapter that this mode is one of the more complicated and difficult subjects we will cover in this book. Now that you've heard that, are you wondering why we need another mode to generate XML documents? Look at it this way: RAW mode gives you absolutely no control over the layout or nesting of the XML elements in your documents. You get an element for every row in the rowset, and all data appears as attributes of those elements. AUTO mode gives you limited control over them, although it is much more than RAW mode. You can control whether the document is element-centric or attribute-centric and can adjust SQL queries somewhat to control results. Now we have EXPLICIT mode. This mode gives you total control over every facet of element layout in your documents, and that's the reason it exists. The Universal Table
The EXPLICIT mode requires the that rowset returned from a query be in a special format. This format is called the Universal table, and it is used to produce the output document. Basically, a Universal table is a table created in memory by a series of SQL SELECT statements in a specific format that are combined via SQL UNION clauses. One requirement on the rowset result is that it be ordered so that its children follow the parent element immediately. Also, it must associate the element names with the tag numbers (explained later) and provide attribute names (the default) as the column names of the Universal table. An example of a Universal table is given in Figure 7.2.This table shows the first two columns of metadata required for proper processing, Tag and Parent. Figure 7.2. A sample Universal table specifying column names.
These two columns are the most important in the table. They dictate the parent-child relationship in the resulting XML tree. This example shows how the <Employee> , <Order> , and <OrderDetail> elements are nested. The rest of the Universal table in Figure 7.2 describes the remaining parts of the XML document. This Universal table would generate the sample XML document in Listing 7.20. Listing 7.20 XML Document Generated from the Universal Table in Figure 7.2
<Employee eid="5" name="Nancy"> <Order id="6O185" date="01/01/2001"> <OrderDetail id="OD1" pid="P1" /> <OrderDetail id="OD2" pid="P2" /> </Order> <Order id="O2" date="3/29/1997"> ... </Employee> Here are the key relationships within the table that define the hierarchy of the resulting XML tree that generates the proper nesting.
We'll be going over plenty of examples to help clarify these for you. The column names in a Universal table must be specified in the query. These names follow the special format defined by the following arguments and shown in the legend in Figure 7.2 ElementName!TagNumber!AttributeName!Directive These arguments are defined in Tables 7.1 and 7.2. Table 7.1. Column Name Argument Definitions
Table 7.2. Directive Definitions
Now let's get to those examples I told you about. The approach I recommend when utilizing the EXPLICIT mode is to reverse-engineer the document you want to create. By this I mean you should sketch out the resulting document and then create the SQL query that will generate the document's Univesal table and then the document. Hmmm, two steps!?!? Not to worry! Let's sketch out the document first and then break it down. Listing 7.21 shows it in its final form. Listing 7.21 The Document We Want to Create
<Orders OrderID="10248"> <OrderDetails ProductID="11" /> <OrderDetails ProductID="42" /> <OrderDetails ProductID="72" /> </Orders> <Orders OrderID="10249"> <OrderDetails ProductID="14" /> <OrderDetails ProductID="51" /> </Orders> <Orders OrderID="10250"> <OrderDetails ProductID="41" /> <OrderDetails ProductID="51" /> <OrderDetails ProductID="65" /> </Orders> ... Here's the breakdown of the first steps, which are illustrated in the SQL statement of Listing 7.22.
Now that we have the document broken down, what do we do next ? We write two separate SQL SELECT statements and combine their outputs with a UNION ALL statement. When using multiple SELECT statements, all column names that are attributes of both top-level and child elements must be specified in the first SELECT statement. Any column names in successive queries are discarded. Listing 7.22 shows the first SQL SELECT statement. Here we lay out the universal table by specifying the Tag and Parent columns and assigning the values 1 and NULL, respectively, to them. Next we assign Orders as the top-level element and give it the attribute of OrderID . Lastly, we assign the column name for the ProductID attribute to the table. Listing 7.22 The First Part of the SQL Query
SELECT 1 as Tag, NULL as Parent, Orders.OrderID as [Orders!1!OrderID], NULL as [OrderDetails!2!ProductID] FROM Orders Listing 7.23 shows the second SQL SELECT statement. Here we grab the constants 2 and 1 (for the Tag and Parent columns) and the OrderID and ProductID attributes. Lastly, we compound sort the OrderID column and then the ProductID column. The UNION ALL statement combines the two queries. Table 7.3 shows a partial listing of the universal table generated during query execution. Listing 7.24 shows the combined queries. Listing 7.23 The Second Part of the SQL Query
UNION ALL SELECT 2, 1, Orders.OrderID, [Order Details].ProductID FROM Orders, [Order Details] WHERE Orders.OrderID = [Order Details].OrderID ORDER BY [Orders!1!OrderID], [OrderDetails!2!ProductID] FOR XML EXPLICIT Table 7.3. Partial Universal Table
Listing 7.24 The Entire SQL Query
SELECT 1 as Tag, NULL as Parent, Orders.OrderID as [Orders!1!OrderID], NULL as [OrderDetails!2!ProductID] FROM Orders UNION ALL SELECT 2, 1, Orders.OrderID, [Order Details].ProductID FROM Orders, [Order Details] WHERE Orders.OrderID = [Order Details].OrderID ORDER BY [Orders!1!OrderID], [OrderDetails!2!ProductID] FOR XML EXPLICIT ELEMENT Directive
Now that we've dissected the creation of an XML document utilizing the EXPLICIT mode, let's examine the various directives that were summarized in Table 7.2.We'll start with the ELEMENT directive. Listing 7.25 shows a partial listing of the document we want to create. Notice that OrderID is a subelement of Order and not an attribute. Breaking this down as we did in the previous section, Employee is the top element, so it gets a tag number of 1 and a NULL parent tag value. Order is the first subelement of Employee , so it gets a tag number of 2 and a parent tag value of 1. As we stated previously, OrderID is a subelement of Order , so the ELEMENT directive must be used; otherwise, OrderID will show up as an attribute of Order . Again, we'll use multiple SQL queries and combine them with a UNION ALL statement. Listing 7.31 shows the first SQL SELECT statement. Listing 7.25 The Document to Create with the Element Directive
<Employee LastName="Buchanan"> <Order CustomerID="VINET"> <OrderID>10248</OrderID> </Order> <Order CustomerID="CHOPS"> <OrderID>10254</OrderID> </Order> <Order CustomerID="WHITC"> <OrderID>10269</OrderID> </Order> ... The interpretation of Listing 7.26 is identical to the interpretation given in the previous section, with one exception: The ELEMENT directive forces the OrderID to be a child element of Order and not an attribute. Listing 7.26 The First Part of the SQL Query
SELECT 1 as Tag, NULL as Parent, Employees.LastName as [Employee!1!LastName], NULL as [Order!2!OrderID!element], NULL as [Order!2!CustomerID] FROM Employees Listing 7.27 gives the second SQL SELECT statement, and again, the only difference between this and the previous interpretation is the addition of the ELEMENT directive. A partial universal table generated by these queries is shown in Table 7.4. Table 7.4. Partial Listing of the Generated Univesal table
Listing 7.27 The Second Part of the SQL Query
UNION ALL SELECT 2, 1, Employees.LastName, Orders.OrderID, Orders.CustomerID FROM Orders, Employees WHERE Orders.EmployeeID = Employees.EmployeeID ORDER BY [Employee!1!LastName], [Order!2!OrderID!element] FOR XML EXPLICIT Hide Directive
As stated in Table 7.2, the hide directive is very useful when you want to order a result document by an attribute's value but don't want that attribute to appear in the result. Using the example from the previous section, let's say we want to order the document by Employee LastName and then CustomerID instead of OrderID . Listing 7.28 shows the resulting document. Notice how the OrderIDs have been rearranged. Listing 7.28 Partial Results of Using the hide Directive
<Employee LastName="Buchanan"> <Order OrderID="10654" /> <Order OrderID="10866" /> <Order OrderID="10297" /> <Order OrderID="10730" /> <Order OrderID="10254" /> ... The query we utilize to accomplish this output is given in Listing 7.29. Here we've placed the hide directive with the CustomerID directive and changed the ORDER BY clause to do a secondary sort on this attribute. This prevents the CustomerID attribute from appearing in the output. Listing 7.29 Query Utilizing the hide Directive
SELECT 1 as Tag, NULL as Parent, Employees.LastName as [Employee!1!LastName], NULL as [Order!2!OrderID], NULL as [Order!2!CustomerID!hide] FROM Employees UNION ALL SELECT 2, 1, Employees.LastName, Orders.OrderID, Orders.CustomerID FROM Orders, Employees WHERE Orders.EmployeeID = Employees.EmployeeID ORDER BY [Employee!1!LastName], [Order!2!CustomerID!hide] FOR XML EXPLICIT CDATA Directive
The CDATA directive simply takes the entity you assign it to and encases its value in a CDATA section. Listing 7.30 shows a sample SQL query utilizing the CDATA directive followed by the resulting XML document in Listing 7.31. As far as breaking down the desired resulting document and generating the query, nothing has changed. Listing 7.30 Query Utilizing the CDATA Directive
SELECT 1 as Tag, NULL as Parent, Employees.LastName as [Employee!1!LastName], Employees.Address as [Employee!1!!CDATA] FROM Employees ORDER BY [Employee!1!LastName] FOR XML EXPLICIT Listing 7.31 Partial Results of the CDATA Directive
<Employee LastName="Buchanan"> <![CDATA[ 14 Garrett Hill ]]> </Employee> <Employee LastName="Callahan"> <![CDATA[ 4726 - 11th Ave. N.E. ]]> </Employee> <Employee LastName="Davolio"> <![CDATA[ 507 - 20th Ave. E. Apt. 2A ]]> </Employee> ... The ID , IDREF , and IDREFS Directives
As a brief review, ID , IDREF , and IDREFS are employed in XML documents to provide intradocument links. The ID type is applied to an element's attribute and distinguishes it from all other elements in the document. An IDREF attribute references an ID element in another part of the document (there's your intradocument link). An IDREFS attribute references multiple whitespace-separated IDs in the documents. First let's look at an IDREF example. Listing 7.32 shows the XML document we want to generate. In this document, EmployeeID of the Employee element will be of type ID . EmployeeID of the Order element will be of type IDREF . In our SQL statement, we'll specify the XMLDATA argument to return the document's XDR schema.This will clearly show the ID and IDREF references. Listing 7.32 Partial Document Containing ID and IDREF Directives
<Employee EmployeeID="1"> <Order EmployeeID="1" CustomerID="ERNSH"> <OrderID>10258</OrderID> </Order> <Order EmployeeID="1" CustomerID="WARTH"> <OrderID>10270</OrderID> </Order> <Order EmployeeID="1" CustomerID="MAGAA"> <OrderID>10275</OrderID> </Order> ... In the SQL statement shown in Listing 7.33, you can see the ID and IDREF directives that will generate the proper data type in the XDR schema. Running this query in the Query Analyzer gives you Listing 7.34, which shows a partial listing of the results. The XDR schema is included.The data types of ID and IDREF can be seen there. Listing 7.33 The SQL Statement We Need
SELECT 1 as Tag, NULL as Parent, Employees.EmployeeID as [Employee!1!EmployeeID!id], NULL as [Employee!1!LastName], NULL as [Order!2!OrderID!element], NULL as [Order!2!EmployeeID!idref], NULL as [Order!2!CustomerID] FROM Employees UNION ALL SELECT 2, 1, Employees.EmployeeID, Employees.LastName, Orders.OrderID, Orders.EmployeeID, Orders.CustomerID FROM Orders, Employees WHERE Orders.EmployeeID = Employees.EmployeeID ORDER BY [Employee!1!EmployeeID!id], [Order!2!OrderID!element] FOR XML EXPLICIT, XMLDATA Listing 7.34 Partial Results Listing
<Schema name="Schema2" xmlns="urn:schemas-microsoft-com:xml-data" xmlns:dt="urn:schemas-microsoft-com:datatypes"> <ElementType name="Employee" content="mixed" model="open"> <AttributeType name="EmployeeID" dt:type="id" /> <AttributeType name="LastName" dt:type="string" /> <attribute type="EmployeeID" /> <attribute type="LastName" /> </ElementType> <ElementType name="Order" content="mixed" model="open"> <AttributeType name="EmployeeID" dt:type="idref" /> <AttributeType name="CustomerID" dt:type="string" /> <element type="OrderID" /> <attribute type="EmployeeID" /> <attribute type="CustomerID" /> </ElementType> <ElementType name="OrderID" content="textOnly" model="closed" dt:type="i4" /> </Schema> <Employee xmlns="x-schema:#Schema2" EmployeeID="1"> <Order EmployeeID="1" CustomerID="ERNSH"> <OrderID>10258</OrderID> </Order> <Order EmployeeID="1" CustomerID="WARTH"> <OrderID>10270</OrderID> </Order> <Order EmployeeID="1" CustomerID="MAGAA"> <OrderID>10275</OrderID> </Order> ... Now, for an IDREFS example, Listings 7.35, 7.36, and 7.37 show the document we want, the necessary SQL statement, and a very hacked-up partial result listing that includes the XDR schema. The slicing and dicing of the result was necessary because the result is a very large file. In this example, we declare the OrderID attribute of the Orders table to be of type ID . The Employee element contains an OrderList attribute of type IDREFS that lists all of the OrderIDs of the particular employee. Would this particular example be useful in some way? Probably not, but it does show a thorough example of the IDREFS type. Listing 7.35 Partial Document Containing ID and IDREF Directives
<Employee xmlns="x-schema:#Schema3" EmployeeID="1" LastName="Davolio" OrderList="OID-10258 OID-10270 OID-10275...OID-11077"> <Order OrderID="OID-10258" /> <Order OrderID="OID-10270" /> <Order OrderID="OID-10275" /> ... <Order OrderID="OID-11077" /> </Employee> <Employee xmlns="x-schema:#Schema3" EmployeeID="2" LastName="Fuller" OrderList="OID-10265 OID-10277 OID-10280...OID-11073"> ... Listing 7.36 The SQL Statement We Need
SELECT 1 as Tag, NULL as Parent, Employees.EmployeeID as [Employee!1!EmployeeID], Employees.LastName as [Employee!1!LastName], NULL as [Employee!1!OrderList!idrefs], NULL as [Order!2!OrderID!id] FROM Employees UNION ALL SELECT 1 as Tag, NULL as Parent, Employees.EmployeeID as [Employee!1!EmployeeID], Employees.LastName as [Employee!1!LastName], 'OID-' + CAST(Orders.OrderID as varchar(5)), NULL FROM Employees join Orders on Employees.EmployeeID = Orders.EmployeeID UNION ALL SELECT 2, 1, Employees.EmployeeID, Employees.LastName, NULL, 'OID-' + CAST(Orders.OrderID as varchar(5)) FROM Employees join Orders on Employees.EmployeeID = Orders.EmployeeID WHERE Orders.EmployeeID = Employees.EmployeeID ORDER BY [Employee!1!EmployeeID], [Order!2!OrderID!id], [Employee!1!OrderList!idrefs] FOR XML EXPLICIT, XMLDATA Listing 7.37 Partial Results Listing with XDR Schema
<Schema name="Schema3" xmlns="urn:schemas-microsoft-com:xml-data" xmlns:dt="urn:schemas-microsoft-com:datatypes"> <ElementType name="Employee" content="mixed" model="open"> <AttributeType name="EmployeeID" dt:type="i4" /> <AttributeType name="LastName" dt:type="string" /> <AttributeType name="OrderList" dt:type="idrefs" /> <attribute type="EmployeeID" /> <attribute type="LastName"/> <attribute type="OrderList"/> </ElementType> <ElementType name="Order" content="mixed" model="open"> <AttributeType name="OrderID" dt:type="id" /> <attribute type="OrderID" /> </ElementType> </Schema> <Employee xmlns="x-schema:#Schema3" EmployeeID="1" LastName="Davolio" OrderList="OID-10258 OID-10270 OID-10275...OID-11077"> <Order OrderID="OID-10258" /> <Order OrderID="OID-10270" /> <Order OrderID="OID-10275" /> ... <Order OrderID="OID-11077" /> </Employee> <Employee xmlns="x-schema:#Schema3" EmployeeID="2" LastName="Fuller" OrderList="OID-10265 OID-10277 OID-10280...OID-11073"> <Order OrderID="OID-10265" /> <Order OrderID="OID-10277" /> <Order OrderID="OID-10280"/> ... <Order OrderID="OID-11073" /> </Employee> ... Retrieving Unconsumed Data with sql:overflow-field
In Chapter 5, "Creating XML Views with XDR Schemas," I said that I wanted to hold off on discussing overflow data until Chapter 7. Well, it's about time for that discussion now because the next section deals with this topic. The OPENXML extension, which we'll discuss in detail in Chapter 8, "OPENXML," allows record insertion into a database from an XML document. Any data from that document that does not have a related column for storage can be stored in a column within that database known as the unconsumed data column. XDR schemas have the annotation sql:overflow-field , which identifies this column. This data can be recovered into an XML document in two ways:
We'll have to change our normal sample methods for this exercise because none of the Northwind database's tables contain an overflow column. This isn't a big deal, though, because it's very simple to illustrate the principle. See Listing 7.38 for the SQL statement that creates our sample table. Listing 7.39 contains the schema file that will create an XML document including the overflow data. The results are given in Listing 7.41. Listing 7.38 SQL that Creates an Overflow Data Column
CREATE TABLE Company ( CompanyID varchar(10), CompanyName varchar(20), Employee varchar(15), Overflow nvarchar(100)) GO INSERT INTO Company VALUES( '12345', 'Fly By Night, Inc.', 'Judith Griffin', N'<phone>555-1212</phone>') GO Listing 7.39 Schema with the sql:overflow-field Annotation
<?xml version="1.0" ?> <Schema xmlns="urn:schemas-microsoft-com:xml-data" xmlns:dt="urn:schemas-microsoft-com:datatypes" xmlns:sql="urn:schemas-microsoft-com:xml-sql"> <ElementType name="Company" sql:overflow-field="Overflow" > <AttributeType name="CompanyName" /> <AttributeType name="Employee" /> <attribute type="CompanyName" /> <attribute type="Employee" /> </ElementType> </Schema> Save this schema as schema.xml in the schema virtual directory. Create the template file given in Listing 7.40 and execute it with the following: http://iisserver/Nwind/schema/schema.xml Listing 7.40 Template File to Execute Listing 7.39's Schema
<ROOT xmlns:sql="urn:schemas-microsoft-com:xml-sql"> <sql:xpath-query mapping-schema="../schemas/Schema.xml"> Company </sql:xpath-query> </ROOT> Listing 7.41 Results from Our Table
<ROOT xmlns:sql="urn:schemas-microsoft-com:xml-sql"> <Company CompanyName="Fly By Night, Inc." Employee="Judith Griffin"> <phone>555-1212</phone> </Company> </ROOT> Now that we've covered the basics of the overflow field, we can cover the last remaining directive of Table 7.2. XMLTEXT Directive
Let's use the same table we used in the preceding section for examples here. First trun-cate the table to clear out remaining data and then issue these SQL statements to populate the table with fresh data, as shown in the following example. Listing 7.42 shows the SQL statement to create the Universal table. INSERT INTO Company VALUES('C1', 'Fly by Night', 'Larry' ,N'<starttag a1="attr1">safe</starttag>') INSERT INTO Company VALUES('C2', 'Limo to Nowhere', 'Moe' ,N'<starttag a2="attr2">sane</starttag>') INSERT INTO Company VALUES('C3', 'Key Lime Pie Faces', 'Curly' ,N'<starttag a3="attr3" CompanyID="C">clean</starttag>') Listing 7.42 SQL Statement to Create the Universal Table
SELECT 1 as Tag, NULL as parent, CompanyID as [Parent!1!CompanyID], CompanyName as [Parent!1!CompanyName], Employee as [Parent!1!Employee], Overflow as [Parent!1!!xmltext] No AttributeName; xmltext directive FROM Company FOR XML EXPLICIT Listing 7.42 produces the output in the following example: <Parent Companied="C1" CompanyName="Fly by Night" Employee="Larry" a1="attr1">safe</Parent> <Parent Companied="C2" CompanyName="Limo to Nowhere" Employee="Moe" a2="attr2">sane</Parent> <Parent CompanyID="C3" CompanyName="Key Lime Pie Faces" Employee="Curly" a3="attr3">clean</Parent> There are two things to take note of here. First, if AttributeName isn't specified and XMLTEXT is, then the overflow column attributes will be appended to the parent element's attributes list. Second, in the event that the overflow column has an attribute with the same name as an attribute belonging to the parent element, the parent element's attribute will take precedence. In the event that AttributeName is specified with the XMLTEXT directive, the name given by AttributeName becomes a child element, and the overflow column's attributes become attributes of this child element.The following example shows the SQL used to populate the table. INSERT INTO Company VALUES('C1', 'Fly by Night', 'Larry' ,N'<starttag a1="attr1">safe</starttag>') INSERT INTO Company VALUES('C2', 'Limo to Nowhere', 'Moe' ,N'<starttag a2="attr2">sane</starttag>') INSERT INTO Company VALUES('C3', 'Key Lime Pie Faces', 'Curly' ,N'<starttag a3="attr3" CompanyID="C"><state>clean</state></starttag>') Notice the differences between this listing and the sample SQL statements used to populate the table with fresh data earlier in this section. In the third INSERT statement, we add an additional element to the overflow column and specify overflow as the AttributeName . We'll use the SQL statements in Listing 7.43 to execute our query; the results are shown in Listing 7.44. Listing 7.43 The SQL Statement to Execute
SELECT 1 as Tag, NULL as parent, CompanyID as [Parent!1!CompanyID], CompanyName as [Parent!1!CompanyName], Employee as [Parent!1!Employee], Overflow as [Parent!1!overflow!xmltext] overflow is the AttributeName; xmltext directive accompanies FROM Company FOR XML EXPLICIT Listing 7.44 Results of Executing the Listing 7.43 SQL Statement
<Parent CompanyID="C1" CompanyName="Fly by Night" Employee="Larry"> <overflow a1="attr1">safe</overflow> </Parent> <Parent CompanyID="C2" CompanyName="Limo to Nowhere" Employee="Moe"> <overflow a2="attr2">sane</overflow> </Parent> <Parent CompanyID="C3" CompanyName="Key Lime Pie Faces" Employee="Curly"> <overflow a3="attr3" CompanyID="C"> <state>clean</state> </overflow> </Parent> Well, that completes Chapter 7. Next, in Chapter 8, we move on to writing data to tables utilizing XML documents along with additional methods of generating documents. |