XML

Overview

XML is the lingua franca of application development—a common syntax that underlies Web services, Microsoft ADO.NET, and a slew of cross-platform programming initiatives. At times, the sheer number of XML extensions and grammars can be overwhelming. Common XML tasks don't just include parsing an XML file, but also validating it against a schema, applying an XSL transform to create a new document or HTML page, and searching intelligently with XPath. All of these topics are covered in this chapter.

The Microsoft .NET Framework includes a rich complement of classes for manipulating XML documents in the System.Xml group of namespaces. These namespaces, outlined in the following list, contain the classes we concentrate on in this chapter.

Many of the examples in this chapter require a sample XML document. The sample we will use is called orders.xml. It contains a simple list of ordered items along with information about the ordering client, and it's shown here:

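<?xml version="1.0"?>
<Order id="2003-04-12-4996">
  <Client id="CMPSO33UL">
    <Name>CompuStation</Name>
  </Client>
  <Items>
    <Item id="2003">
      <Name>Calculator</Name>
      <Price>24.99</Price>
    </Item>
    <Item id="4311">
      <Name>Laser Printer</Name>
      <Price>400.75</Price>
    </Item>
  </Items>
</Order>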

  Note

Before using the examples in this chapter, you should import the System.Xml namespace.

Load an XML Document into Memory

Problem

You need to load an XML document into memory, perhaps so you can browse its nodes, change its structure, or perform other operations.

Solution

Use the XmlDocument class, which provides a Load method for retrieving XML information and a Save method for storing it.

Discussion

.NET provides a slew of XML objects. The ones you use depend in part upon your programming task. The XmlDocument class provides an in-memory representation of XML. It allows you to deal with XML data in your application as XML. The XmlDocument class also allows you to browse through the nodes in any direction, insert and remove nodes, and change the structure on the fly. These tasks are not as easy with the simpler XmlTextWriter and XmlTextReader classes, which are explained in recipe 6.6.

To use the XmlDocument class, simply create a new instance of the class, and call the Load method with a filename, Stream, TextReader, or XmlReader object. You can even supply a URL that points to an XML document. The XmlDocument instance will be populated with the tree of elements, or nodes. The jumping-off point for accessing these nodes is the root element, which is provided through the XmlDocument.DocumentElement property. DocumentElement is an XmlElement object that can contain one or more nested XmlNode objects, which in turn can contain more XmlNode objects, and so on. An XmlNode is the basic ingredient of an XML file and can be an element, an attribute, a comment, or contained text. Figure 6-1 shows part of the hierarchy created by XmlDocument for the orders.xml file.

Figure 6-1: A partial tree of the orders.xml document loaded into an XmlDocument.

When dealing with an XmlNode or a class that derives from it (such as XmlElement or XmlAttribute), you can use the following basic properties:

The following code loads the orders.xml document into memory and displays some information from the node tree.

Public Module XmlDocumentTest

    Public Sub Main()
        ' Load the document.
        Dim Doc As New XmlDocument
        Doc.Load("orders.xml")

        ' Display some information from the document.
        Dim Node As XmlNode
        Node = Doc.DocumentElement
        Console.WriteLine("This is order " & Node.Attributes(0).Value)

        For Each Node In Doc.DocumentElement.ChildNodes
            Select Case Node.Name
                Case "Client"
                    Console.WriteLine("Prepared for " & _
                      Node.ChildNodes(0).ChildNodes(0).Value)
                Case "Items"
                    Console.WriteLine("Contains " & _
                      Node.ChildNodes.Count.ToString() & " items")
            End Select
        Next

        Console.ReadLine()
    End Sub

End Module

The output is shown here:

This is order 2003-04-12-4996
Prepared for CompuStation
Contains 2 items
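Load also accepts a TextReader, as noted above. The following is a minimal sketch of that overload, assuming an inline XML string (a cut-down stand-in for orders.xml, not part of the original sample) wrapped in a StringReader so that it runs without any file on disk. The module name is hypothetical.

```vbnet
Imports System
Imports System.IO
Imports System.Xml

Public Module LoadFromReaderSketch

    Public Sub Main()
        ' Inline sample data (an assumption made for this sketch).
        Dim Xml As String = "<Order id='2003-04-12-4996'>" & _
            "<Client id='CMPSO33UL'><Name>CompuStation</Name></Client>" & _
            "</Order>"

        ' Load from a TextReader instead of a filename.
        Dim Doc As New XmlDocument
        Doc.Load(New StringReader(Xml))

        ' DocumentElement is the jumping-off point for the node tree.
        Dim Root As XmlElement = Doc.DocumentElement
        Console.WriteLine("Root element: " & Root.Name)
        Console.WriteLine("Order id: " & Root.GetAttribute("id"))
        Console.WriteLine("First child: " & Root.FirstChild.Name)
    End Sub

End Module
```

The same pattern should apply to the Stream and XmlReader overloads; only the argument passed to Load changes.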

Process All Nodes in a Document

Problem

You want to iterate through all nodes in an XML tree and display or modify the related information.

Solution

Create a generic procedure for processing the node, and call it recursively.

Discussion

The XmlDocument stores a tree of XmlNode objects. You can walk through this tree structure recursively to process every node.

For example, consider the following code, which displays information about every node in a document. A depth parameter tracks how many layers deep the nesting is and uses it to format the output with a variable-sized indent.

Public Module XmlOutputTest

    Public Sub Main()
        ' Load the document.
        Dim Doc As New XmlDocument
        Doc.Load("orders.xml")

        ' Start the node walk at the root node (depth = 0).
        DisplayNode(Doc.DocumentElement, 0)

        Console.ReadLine()
    End Sub

    Private Sub DisplayNode(ByVal node As XmlNode, ByVal depth As Integer)
        ' Define the indent level.
        Dim Indent As New String(" "c, depth * 4)

        ' Display the node type.
        Console.WriteLine(Indent & node.NodeType.ToString() & _
          ": <" & node.Name & ">")

        ' Display the node content, if applicable.
        If node.Value <> String.Empty Then
            Console.WriteLine(Indent & "Value: " & node.Value)
        End If

        ' Display all nested nodes.
        Dim Child As XmlNode
        For Each Child In node.ChildNodes
            DisplayNode(Child, depth + 1)
        Next
    End Sub

End Module

When using the orders.xml document, the output is as follows:

Element: <Order>
    Element: <Client>
        Element: <Name>
            Text: <#text>
            Value: CompuStation
    Element: <Items>
        Element: <Item>
            Element: <Name>
                Text: <#text>
                Value: Calculator
            Element: <Price>
                Text: <#text>
                Value: 24.99
        Element: <Item>
            Element: <Name>
                Text: <#text>
                Value: Laser Printer
            Element: <Price>
                Text: <#text>
                Value: 400.75

An alternative solution to this problem is to use the XmlTextReader, which always steps through nodes one at a time, in order.
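As a rough illustration of that alternative, the sketch below walks the nodes with XmlTextReader.Read and uses the reader's Depth property for indentation. The inline XML string and the module name are assumptions made so the sketch is self-contained; to read the real file you would pass "orders.xml" to the XmlTextReader constructor instead.

```vbnet
Imports System
Imports System.IO
Imports System.Xml

Public Module XmlReaderWalkSketch

    Public Sub Main()
        ' Inline sample data (an assumption made for this sketch).
        Dim Xml As String = "<Order id='2003-04-12-4996'>" & _
            "<Client id='CMPSO33UL'><Name>CompuStation</Name></Client>" & _
            "</Order>"

        Dim Reader As New XmlTextReader(New StringReader(Xml))

        ' Read() advances one node at a time, in document order.
        While Reader.Read()
            Dim Indent As New String(" "c, Reader.Depth * 4)

            Select Case Reader.NodeType
                Case XmlNodeType.Element
                    Console.WriteLine(Indent & "Element: <" & Reader.Name & ">")
                Case XmlNodeType.Text
                    Console.WriteLine(Indent & "Value: " & Reader.Value)
            End Select
        End While

        Reader.Close()
    End Sub

End Module
```

Because the reader is forward-only, no recursion is needed, but you also can't move back to a node once it has been passed.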

Insert Nodes in an XML Document

Problem

You need to modify an XML document by inserting new data.

Solution

Create the node using the appropriate XmlDocument method (such as CreateElement, CreateAttribute, CreateNode, and so on). Then insert it using the appropriate XmlNode method (such as InsertAfter, InsertBefore, or AppendChild).

Discussion

Inserting a node is a two-step process. You must first create the node, and then you insert it in the appropriate location. Optionally, you can then call XmlDocument.Save to persist changes to a file.

To create a node, you use one of the XmlDocument methods that starts with the word Create, depending on the type of node. This ensures that the node will have the same namespace as the rest of the document. Next you must find a suitable related node and use one of its insertion methods to add the new node to the tree. The following example demonstrates this technique to add a new item:

Public Module XmlInsertTest

    Public Sub Main()
        ' Load the document.
        Dim Doc As New XmlDocument
        Doc.Load("orders.xml")

        ' Create a new element.
        Dim ItemNode As XmlNode
        ItemNode = Doc.CreateElement("Item")

        ' Add the attribute.
        Dim Attribute As XmlAttribute
        Attribute = Doc.CreateAttribute("id")
        Attribute.Value = "4312"
        ItemNode.Attributes.Append(Attribute)

        ' Create and add the sub-elements for this node.
        Dim NameNode, PriceNode As XmlNode
        NameNode = Doc.CreateElement("Name")
        PriceNode = Doc.CreateElement("Price")
        ItemNode.AppendChild(NameNode)
        ItemNode.AppendChild(PriceNode)

        ' Add the text data.
        NameNode.AppendChild(Doc.CreateTextNode("Stapler"))
        PriceNode.AppendChild(Doc.CreateTextNode("12.20"))

        ' Add the new element.
        ' In this case, we add it as a child at the end of the item list.
        Doc.DocumentElement.ChildNodes(1).AppendChild(ItemNode)

        ' Save the document.
        Doc.Save("orders.xml")
        Console.WriteLine("Changes saved.")

        Console.ReadLine()
    End Sub

End Module

The new document looks like this:

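<?xml version="1.0"?>
<Order id="2003-04-12-4996">
  <Client id="CMPSO33UL">
    <Name>CompuStation</Name>
  </Client>
  <Items>
    <Item id="2003">
      <Name>Calculator</Name>
      <Price>24.99</Price>
    </Item>
    <Item id="4311">
      <Name>Laser Printer</Name>
      <Price>400.75</Price>
    </Item>
    <Item id="4312">
      <Name>Stapler</Name>
      <Price>12.20</Price>
    </Item>
  </Items>
</Order>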

Alternatively, you might be able to use CloneNode, which creates an exact copy of a node, to simplify the task of adding similar data. CloneNode accepts a Boolean depth parameter. If you supply True, CloneNode will duplicate the entire branch, with all nested nodes. Here's the equivalent code using CloneNode:

' Load the document.
Dim Doc As New XmlDocument
Doc.Load("orders.xml")

' Create a new element based on an existing product.
Dim ItemNode As XmlNode
ItemNode = Doc.DocumentElement.ChildNodes(1).LastChild.CloneNode(True)

' Modify the node data.
ItemNode.Attributes(0).Value = "4312"
ItemNode.ChildNodes(0).ChildNodes(0).Value = "Stapler"
ItemNode.ChildNodes(1).ChildNodes(0).Value = "12.20"

' Add the new element.
Doc.DocumentElement.ChildNodes(1).AppendChild(ItemNode)

' Save the document.
Doc.Save("orders.xml")

Notice that in this case, certain assumptions are being made about the existing nodes (for example, that the first child in the item node is always the name, and the second child is always the price). If this assumption isn't guaranteed to be true, you might need to examine the node name programmatically.
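One way to examine node names programmatically is sketched below. It clones the last Item as before, but looks up the attribute by name and checks each child's Name property instead of relying on position. The inline XML string and the module name are assumptions made to keep the sketch self-contained, and the result is written to the console rather than saved.

```vbnet
Imports System
Imports System.Xml

Public Module CloneByNameSketch

    Public Sub Main()
        ' Inline sample data (an assumption made for this sketch).
        Dim Doc As New XmlDocument
        Doc.LoadXml("<Order id='2003-04-12-4996'><Items>" & _
            "<Item id='2003'><Name>Calculator</Name>" & _
            "<Price>24.99</Price></Item>" & _
            "</Items></Order>")

        ' Find the Items element by name rather than by index.
        Dim ItemsNode As XmlNode = Doc.DocumentElement.Item("Items")

        ' Copy the last item, including all nested nodes.
        Dim ItemNode As XmlNode = ItemsNode.LastChild.CloneNode(True)

        ' Look up the attribute by name.
        ItemNode.Attributes("id").Value = "4312"

        ' Check each child's name before changing its content.
        Dim Child As XmlNode
        For Each Child In ItemNode.ChildNodes
            Select Case Child.Name
                Case "Name"
                    Child.InnerText = "Stapler"
                Case "Price"
                    Child.InnerText = "12.20"
            End Select
        Next

        ItemsNode.AppendChild(ItemNode)
        Console.WriteLine(Doc.OuterXml)
    End Sub

End Module
```

This version keeps working if the Name and Price elements swap places, at the cost of a little more code.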

Find Specific Elements by Name

Problem

You need to retrieve a specific node from an XmlDocument, and you know its name but not its position.

Solution

Use the XmlDocument.GetElementsByTagName method.

Discussion

The XmlDocument class provides a convenient GetElementsByTagName method that searches an entire document for nodes that have the indicated element name. It returns the results as a collection of XmlNode objects.

This code demonstrates how you could use GetElementsByTagName to calculate the total price of an order:

Public Module XmlSearchTest

    Public Sub Main()
        ' Load the document.
        Dim Doc As New XmlDocument
        Doc.Load("orders.xml")

        ' Retrieve all prices.
        Dim PriceNodes As XmlNodeList
        PriceNodes = Doc.GetElementsByTagName("Price")

        Dim PriceNode As XmlNode
        Dim Price As Decimal
        For Each PriceNode In PriceNodes
            Price += Decimal.Parse(PriceNode.ChildNodes(0).Value)
        Next

        Console.WriteLine("Total order costs: " & Price.ToString())

        Console.ReadLine()
    End Sub

End Module

If your elements include an attribute of type ID, you can also use a method called GetElementById to retrieve an element that has a matching ID value. However, neither method allows you the flexibility to search portions of an XML document—for that flexibility, you need XPath, as described in recipe 6.5.

Find Elements with an XPath Search

Problem

You need to search an XML document or a portion of an XML document for nodes that match certain criteria.

Solution

Use an XPath expression with the SelectNodes or SelectSingleNode method.

Discussion

The XmlNode class defines two methods that perform XPath searches: SelectNodes and SelectSingleNode. These methods operate on all contained child nodes. Because the XmlDocument inherits from XmlNode, you can call XmlDocument.SelectNodes to search an entire document.

Basic XPath syntax uses a pathlike notation. For example, the path /Order/Items/Item indicates an Item element that is nested inside an Items element, which, in turn, is nested in a root Order element. This is an absolute path. The following example uses an XPath absolute path to find the name of every item in an order.

Public Module XPathSearchTest

    Public Sub Main()
        ' Load the document.
        Dim Doc As New XmlDocument
        Doc.Load("orders.xml")

        ' Retrieve the name of every item.
        ' This could not be accomplished as easily with the
        ' GetElementsByTagName() method, because Name elements are
        ' used in Item elements and Client elements.
        Dim Nodes As XmlNodeList
        Nodes = Doc.SelectNodes("/Order/Items/Item/Name")

        Dim Node As XmlNode
        For Each Node In Nodes
            Console.WriteLine(Node.InnerText)
        Next

        Console.ReadLine()
    End Sub

End Module

XPath provides a rich and powerful search syntax, and it's impossible to explain all of the variations you can use in a short recipe. However, Table 6-1 outlines some of the key ingredients in more advanced XPath expressions and includes examples that show how they would work with the orders.xml document.

Table 6-1: XPath Expression Syntax

Expression
    Meaning

/
    Starts an absolute path that selects from the root node.
    /Order/Items/Item selects all Item elements that are children of an Items element, which is itself a child of the root Order element.

//
    Starts a relative path that selects nodes anywhere.
    //Item/Name selects all of the Name elements that are children of an Item element, regardless of where they appear in the document.

@
    Selects an attribute of a node.
    /Order/@id selects the attribute named id from the root Order element.

*
    Selects any element in the path.
    /Order/* selects both Items and Client nodes because both are contained by a root Order element.

|
    Combines multiple paths.
    /Order/Items/Item/Name|/Order/Client/Name selects the Name nodes used to describe a Client and the Name nodes used to describe an Item.

.
    Indicates the current (default) node.

..
    Indicates the parent node.
    //Name/.. selects any element that is parent to a Name, which includes the Client and Item elements.

[ ]
    Defines selection criteria that can test a contained node or attribute value.
    /Order[@id="2003-04-12-4996"] selects the Order elements with the indicated attribute value.
    /Order/Items/Item[Price > 50] selects products above $50 in price.
    /Order/Items/Item[Price > 50 and Name="Laser Printer"] selects products that match two criteria.

starts-with
    This function retrieves elements based on what text a contained element starts with.
    /Order/Items/Item[starts-with(Name, "C")] finds all Item elements that have a Name element that starts with the letter C.

position
    This function retrieves elements based on position.
    /Order/Items/Item[position()=2] selects the second Item element.

count
    This function counts elements. You specify the name of the child element to count, or an asterisk (*) for all children.
    /Order/Items/Item[count(Price) = 1] retrieves Item elements that have exactly one nested Price element.

  Note

XPath expressions and all element and attribute names that you use inside them are always case sensitive.
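To make a few of the expressions from Table 6-1 concrete, here is a short sketch that runs them with SelectNodes and SelectSingleNode. The XML is an inline copy of the orders.xml data (embedded as a string only so the sketch runs on its own), and the module name is hypothetical.

```vbnet
Imports System
Imports System.Xml

Public Module XPathPredicateSketch

    Public Sub Main()
        ' Inline copy of the sample data (an assumption made for this sketch).
        Dim Doc As New XmlDocument
        Doc.LoadXml("<Order id='2003-04-12-4996'>" & _
            "<Client id='CMPSO33UL'><Name>CompuStation</Name></Client>" & _
            "<Items>" & _
            "<Item id='2003'><Name>Calculator</Name>" & _
            "<Price>24.99</Price></Item>" & _
            "<Item id='4311'><Name>Laser Printer</Name>" & _
            "<Price>400.75</Price></Item>" & _
            "</Items></Order>")

        ' A predicate in [ ] filters the Item elements by price.
        Dim Node As XmlNode
        For Each Node In Doc.SelectNodes("/Order/Items/Item[Price > 50]/Name")
            Console.WriteLine("Over $50: " & Node.InnerText)
        Next

        ' The @ symbol selects an attribute instead of an element.
        Dim IdNode As XmlNode = Doc.SelectSingleNode("/Order/@id")
        Console.WriteLine("Order id: " & IdNode.Value)

        ' A relative path searches only the portion below the current node.
        Dim ItemsNode As XmlNode = Doc.SelectSingleNode("/Order/Items")
        Console.WriteLine("Item count: " & _
            ItemsNode.SelectNodes("Item").Count.ToString())
    End Sub

End Module
```

With this data, the sketch reports Laser Printer as the only item over $50, the order id 2003-04-12-4996, and an item count of 2.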

Load an XML Document into a Class

Problem

You want to use an XML document to persist information, but interact with the data using a custom object in your code.

Solution

Use the XmlDocument or XmlTextReader class to read XML data, and transfer it into an object. Use the XmlDocument or XmlTextWriter class to persist the XML data.

Discussion

It's common to want to work with full-fledged objects in your code and use XML only as a file format for persisting data. To support this design, you can create a class with Save and Load methods. The Save method commits the current data in the object to an XML format, whereas the Load method reads the XML document and uses its data to populate the object.

For example, the data in orders.xml would require three classes to represent the Order, Item, and Client entities. You might create the Item and Client classes as follows:

Public Class Item

    Private _ID As String
    Private _Name As String
    Private _Price As Decimal

    Public Property ID() As String
        Get
            Return _ID
        End Get
        Set(ByVal Value As String)
            _ID = Value
        End Set
    End Property

    Public Property Name() As String
        Get
            Return _Name
        End Get
        Set(ByVal Value As String)
            _Name = Value
        End Set
    End Property

    Public Property Price() As Decimal
        Get
            Return _Price
        End Get
        Set(ByVal Value As Decimal)
            _Price = Value
        End Set
    End Property

    Public Sub New(ByVal id As String, ByVal name As String, _
      ByVal price As Decimal)
        Me.ID = id
        Me.Name = name
        Me.Price = price
    End Sub

End Class

Public Class Client

    Private _ID As String
    Private _Name As String

    Public Property ID() As String
        Get
            Return _ID
        End Get
        Set(ByVal Value As String)
            _ID = Value
        End Set
    End Property

    Public Property Name() As String
        Get
            Return _Name
        End Get
        Set(ByVal Value As String)
            _Name = Value
        End Set
    End Property

    Public Sub New(ByVal id As String, ByVal name As String)
        Me.ID = id
        Me.Name = name
    End Sub

End Class

The Order class would then contain a single Client, and a collection of Item objects. It would also add the Save and Load methods that transfer the data to and from the XML file. Here's an example that supports loading only:

Public Class Order

    Private _ID As String
    Private _Client As Client
    Private _Items() As Item

    Public Property ID() As String
        Get
            Return _ID
        End Get
        Set(ByVal Value As String)
            _ID = Value
        End Set
    End Property

    Public Property Client() As Client
        Get
            Return _Client
        End Get
        Set(ByVal Value As Client)
            _Client = Value
        End Set
    End Property

    Public Property Items() As Item()
        Get
            Return _Items
        End Get
        Set(ByVal Value As Item())
            _Items = Value
        End Set
    End Property

    Public Sub New(ByVal id As String, ByVal client As Client, _
      ByVal items As Item())
        Me.ID = id
        Me.Client = client
        Me.Items = items
    End Sub

    Public Sub New(ByVal xmlFilePath As String)
        Me.Load(xmlFilePath)
    End Sub

    Public Sub Load(ByVal xmlFilePath As String)
        Dim Doc As New XmlDocument
        Doc.Load(xmlFilePath)

        ' Find the Order node.
        Dim Node As XmlNode
        Node = Doc.GetElementsByTagName("Order")(0)
        Me.ID = Node.Attributes(0).Value

        ' Find the Client node.
        Node = Doc.GetElementsByTagName("Client")(0)
        Me.Client = New Client(Node.Attributes(0).Value, Node.InnerText)

        ' Find the Item nodes.
        Dim Nodes As XmlNodeList
        Nodes = Doc.GetElementsByTagName("Item")

        Dim Items As New ArrayList
        For Each Node In Nodes
            Items.Add(New Item(Node.Attributes(0).Value, _
              Node.ChildNodes(0).InnerText, _
              Decimal.Parse(Node.ChildNodes(1).InnerText)))
        Next

        ' Convert the collection of items into a strongly typed array.
        Me.Items = CType(Items.ToArray(GetType(Item)), Item())
    End Sub

    Public Sub Save(ByVal xmlFilePath As String)
        ' (Save code omitted.)
    End Sub

End Class

  Note

To improve this design, you might want to substitute the array of Item objects with a strongly typed collection, as described in recipe 3.16.

The client can then use the following code to inspect products, without having to interact with the underlying XML format at all:

Dim XmlOrder As New Order("orders.xml")

' Display the prices of all items.
Dim Item As Item
For Each Item In XmlOrder.Items
    Console.WriteLine(Item.Name & ": " & Item.Price.ToString())
Next

There are countless variations of this design. For example, you might create a class that writes a file directly to disk. Or, you might add another layer of abstraction using streams, so that the client could save the serialization data to disk, transmit it to another component, or even add encryption with a CryptoStream wrapper. Alternatively, you could use the XmlSerializer class to automate the work for you, as described in recipe 6.7.
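The Save method is omitted from the Order class above. As one possible shape for it, the sketch below writes the same document structure with XmlTextWriter. It is not the original Save code: to stay self-contained it uses hard-coded values and parallel arrays in place of the Order, Client, and Item objects, and writes to a StringWriter instead of a file. All names in it are hypothetical.

```vbnet
Imports System
Imports System.IO
Imports System.Xml

Public Module OrderSaveSketch

    Public Sub Main()
        ' Sample values (assumptions) standing in for the Order object's data.
        Dim ItemIDs() As String = {"2003", "4311"}
        Dim ItemNames() As String = {"Calculator", "Laser Printer"}
        Dim ItemPrices() As Decimal = {24.99D, 400.75D}

        ' Write to memory here; a Save method would target a file instead.
        Dim Output As New StringWriter
        Dim Writer As New XmlTextWriter(Output)
        Writer.Formatting = Formatting.Indented

        Writer.WriteStartDocument()
        Writer.WriteStartElement("Order")
        Writer.WriteAttributeString("id", "2003-04-12-4996")

        ' Write the Client element with its id attribute and Name child.
        Writer.WriteStartElement("Client")
        Writer.WriteAttributeString("id", "CMPSO33UL")
        Writer.WriteElementString("Name", "CompuStation")
        Writer.WriteEndElement()

        ' Write one Item element per entry.
        Writer.WriteStartElement("Items")
        Dim i As Integer
        For i = 0 To ItemIDs.Length - 1
            Writer.WriteStartElement("Item")
            Writer.WriteAttributeString("id", ItemIDs(i))
            Writer.WriteElementString("Name", ItemNames(i))
            ' XmlConvert formats the number independently of the culture.
            Writer.WriteElementString("Price", _
                XmlConvert.ToString(ItemPrices(i)))
            Writer.WriteEndElement()
        Next
        Writer.WriteEndElement()

        Writer.WriteEndElement()
        Writer.WriteEndDocument()
        Writer.Flush()

        Console.WriteLine(Output.ToString())
    End Sub

End Module
```

In a real Save method, the loop would iterate over Me.Items, and the XmlTextWriter would be created with the file path and an encoding rather than a StringWriter.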

Use XML Serialization with Custom Objects

Problem

You want to use an XML document as a serialization format and load the data into an object for manipulation in your code, preferably with as little code as possible.

Solution

Use XmlSerializer to transfer data from your object to XML, and vice versa.

Discussion

The XmlSerializer class allows you to convert objects to XML data, and vice versa. This process is used natively by Web services and provides a customizable serialization mechanism that won't require a single line of custom code. The XmlSerializer class is even intelligent enough to correctly create arrays when it finds nested elements.

The only requirements for using XmlSerializer are as follows:

To use serialization, you must first mark up your data objects with attributes that indicate the desired XML mapping. These attributes are found in the System.Xml.Serialization namespace and include the following:

For example, the following code shows the classes needed to represent the orders.xml items. In this case, the only attribute that was needed was XmlAttribute, which maps the ID property to an attribute named id. To use the code as written, you must import the System.Xml.Serialization namespace.

Public Class Order

    Private _ID As String
    Private _Client As Client
    Private _Items() As Item

    <XmlAttribute("id")> _
    Public Property ID() As String
        Get
            Return _ID
        End Get
        Set(ByVal Value As String)
            _ID = Value
        End Set
    End Property

    Public Property Client() As Client
        Get
            Return _Client
        End Get
        Set(ByVal Value As Client)
            _Client = Value
        End Set
    End Property

    Public Property Items() As Item()
        Get
            Return _Items
        End Get
        Set(ByVal Value As Item())
            _Items = Value
        End Set
    End Property

    Public Sub New(ByVal id As String, ByVal client As Client, _
      ByVal items As Item())
        Me.ID = id
        Me.Client = client
        Me.Items = items
    End Sub

    Public Sub New()
        ' (XML serialization requires the default constructor.)
    End Sub

End Class

Public Class Item

    Private _ID As String
    Private _Name As String
    Private _Price As Decimal

    <XmlAttribute("id")> _
    Public Property ID() As String
        Get
            Return _ID
        End Get
        Set(ByVal Value As String)
            _ID = Value
        End Set
    End Property

    Public Property Name() As String
        Get
            Return _Name
        End Get
        Set(ByVal Value As String)
            _Name = Value
        End Set
    End Property

    Public Property Price() As Decimal
        Get
            Return _Price
        End Get
        Set(ByVal Value As Decimal)
            _Price = Value
        End Set
    End Property

    Public Sub New(ByVal id As String, ByVal name As String, _
      ByVal price As Decimal)
        Me.ID = id
        Me.Name = name
        Me.Price = price
    End Sub

    Public Sub New()
        ' (XML serialization requires the default constructor.)
    End Sub

End Class

Public Class Client

    Private _ID As String
    Private _Name As String

    <XmlAttribute("id")> _
    Public Property ID() As String
        Get
            Return _ID
        End Get
        Set(ByVal Value As String)
            _ID = Value
        End Set
    End Property

    Public Property Name() As String
        Get
            Return _Name
        End Get
        Set(ByVal Value As String)
            _Name = Value
        End Set
    End Property

    Public Sub New(ByVal id As String, ByVal name As String)
        Me.ID = id
        Me.Name = name
    End Sub

    Public Sub New()
        ' (XML serialization requires the default constructor.)
    End Sub

End Class

Here's the code needed to create a new Order object, serialize the results to an XML document, deserialize the document back to an object, and display some basic order information.

' Create the order.
Dim Client As New Client("CMPSO33UL", "CompuStation")
Dim Item1 As New Item("2003", "Calculator", Convert.ToDecimal(24.99))
Dim Item2 As New Item("4311", "Laser Printer", Convert.ToDecimal(400.75))
Dim Items() As Item = {Item1, Item2}
Dim Order As New Order("2003-04-12-4996", Client, Items)

' Serialize the order to a file.
Dim Serializer As New System.Xml.Serialization.XmlSerializer(GetType(Order))
Dim fs As New FileStream("orders.xml", FileMode.Create)
Serializer.Serialize(fs, Order)
fs.Close()

' Deserialize the order from the file.
fs = New FileStream("orders.xml", FileMode.Open)
Order = CType(Serializer.Deserialize(fs), Order)
fs.Close()

' Display the prices of all items.
Dim Item As Item
For Each Item In Order.Items
    Console.WriteLine(Item.Name & ": " & Item.Price.ToString())
Next

  Note

This approach isn't necessarily better than that presented in recipe 6.6. It does require less code and can prevent some types of error. However, it also forces you to give up a layer of abstraction (the custom reading and writing code) that can be used to perform validation, manage multiple versions of the same XML document, or map XML documents to .NET objects that don't match exactly. The approach you use depends on the needs of your application.
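XmlSerializer can also work entirely in memory, which is convenient for experimenting with attribute mappings. The sketch below is a minimal round trip through a string, assuming a hypothetical Product class that uses public fields instead of properties to keep the listing short (XmlSerializer handles both).

```vbnet
Imports System
Imports System.IO
Imports System.Xml.Serialization

' Hypothetical class used only for this sketch.
Public Class Product
    ' Map the ID field to an attribute named id.
    <XmlAttribute("id")> _
    Public ID As String
    Public Name As String
    Public Price As Decimal
End Class

Public Module SerializeToStringSketch

    Public Sub Main()
        Dim Item As New Product
        Item.ID = "2003"
        Item.Name = "Calculator"
        Item.Price = 24.99D

        ' Serialize to a StringWriter instead of a FileStream.
        Dim Serializer As New XmlSerializer(GetType(Product))
        Dim Output As New StringWriter
        Serializer.Serialize(Output, Item)

        Dim Xml As String = Output.ToString()
        Console.WriteLine(Xml)

        ' Deserialize the same text back into a new object.
        Dim Copy As Product = _
            CType(Serializer.Deserialize(New StringReader(Xml)), Product)
        Console.WriteLine(Copy.Name & ": " & Copy.Price.ToString())
    End Sub

End Module
```

Printing the intermediate XML makes it easy to confirm that a property has been mapped to an attribute or an element as intended.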

Perform an XSL Transform

Problem

You want to transform an XML document into another document using an XSLT stylesheet.

Solution

Use the Transform method of the System.Xml.Xsl.XslTransform class.

Discussion

XSLT (or XSL transforms) is an XML-based language designed to transform one XML document into another document. XSLT can be used to create a new XML document with the same data but arranged in a different structure, or to select a subset of the data in a document. It can also be used to create a different type of structured document. XSLT is commonly used in this manner to format an XML document into an HTML page.

XSLT is a rich language, and creating XSL transforms is beyond the scope of this book. However, you can learn how to create simple XSLT documents by looking at a basic example. Here's a stylesheet that could be used to transform orders.xml into an HTML summary page:

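<?xml version="1.0" encoding="UTF-8" ?>
<xsl:stylesheet xmlns:xsl="http://www.w3.org/1999/XSL/Transform"
    version="1.0">

  <xsl:template match="Order">
    <html><body>
    <p>Order <b><xsl:value-of select="Client/@id"/></b>
    for <xsl:value-of select="Client/Name"/></p>
    <table border="1">
      <tr><td>ID</td><td>Name</td><td>Price</td></tr>
      <xsl:apply-templates select="Items/Item"/>
    </table>
    </body></html>
  </xsl:template>

  <xsl:template match="Items/Item">
    <tr>
      <td><xsl:value-of select="@id"/></td>
      <td><xsl:value-of select="Name"/></td>
      <td><xsl:value-of select="Price"/></td>
    </tr>
  </xsl:template>

</xsl:stylesheet>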

Essentially, every XSL stylesheet consists of a set of templates. Each template matches some set of elements in the source document and then describes the contribution that the matched element will make to the resulting document. In order to match the template, the XSLT document uses XPath expressions, as described in recipe 6.5.

The orders.xslt stylesheet contains two template elements (as children of the root stylesheet element). The first template matches the root Order element. When it finds it, it outputs the tags necessary to start an HTML table with appropriate column headings and inserts some data about the client using the value-of command, which outputs the text result of an XPath expression. In this case, the XPath expressions (Client/@id and Client/Name) match the id attribute and the Name element.

Next, the apply-templates command is used to branch off and perform processing of any contained Item elements. This is required because there might be multiple Item elements. Each Item element is matched using the XPath expression Items/Item. The root Order node isn't specified because Order is the current node. Finally, the initial template writes the tags necessary to end the HTML document.

To apply this XSLT stylesheet in .NET, use the XslTransform class, as shown in the following code. In this case, the code uses the overloaded version of the Transform method that saves the result document directly to disk, although you could receive it as a stream and process it inside your application instead.

Public Module TransformTest

    Public Sub Main()
        Dim Transform As New System.Xml.Xsl.XslTransform

        ' Load the XSL stylesheet.
        Transform.Load("orders.xslt")

        ' Transform orders.xml into orders.html using orders.xslt.
        Transform.Transform("orders.xml", "orders.html")

        Console.WriteLine("File 'orders.html' written successfully.")
        Console.ReadLine()
    End Sub

End Module

The final result of this process is the HTML file shown in the following listing. Figure 6-2 shows how this HTML is displayed in a browser.

Figure 6-2: The stylesheet output for orders.xml

 

Order CMPSO33UL for CompuStation

ID Name Price
2003 Calculator 24.99
4311 Laser Printer 400.75

Validate an XML Document Against a Schema

Problem

You want to ensure that an XML document conforms to an XML schema.

Solution

Use XmlValidatingReader and handle the ValidationEventHandler event.

Discussion

An XML schema defines the rules that a given type of XML document must follow. The schema includes rules that define

XML schema documents are beyond the scope of this chapter, but much can be learned from a simple example. Essentially, an XSD document lists the elements that can occur using element tags. The type attribute indicates the data type. Here's an example for the product name:

 

The basic schema data types are defined at http://www.w3.org/TR/xmlschema-2 . They map closely to .NET data types and include string, int, long, decimal, float, dateTime, boolean, and base64Binary, to name a few of the most frequently used types.

Elements that consist of more than one subelement are called complex types. You can nest them together using a sequence tag if order is important, or an all tag if it's not. (A choice tag, by contrast, allows only one of the listed elements to appear.) Here's how you might model the Client element:

 

By default, a listed element can occur exactly one time in a document. You can configure this behavior by specifying the maxOccurs and minOccurs attributes:

 

Here's the complete schema for the orders.xml file:

The XmlValidatingReader class enforces all of these schema rules, and it also checks that the XML document is well formed (which means there are no illegal characters, all opening tags have a corresponding closing tag, and so on). To check a document, you read through it one node at a time by calling the XmlValidatingReader.Read method. If an error is found, XmlValidatingReader raises a ValidationEventHandler event with information about the error. If you wish, you can handle this event and continue processing the document to find more errors. If you don't handle this event, an XmlSchemaException will be raised when the first validation error is encountered, and processing will be aborted. To test only if a document is well formed, you can use the XmlValidatingReader without a schema.

The next example shows a utility class that displays all errors in an XML document when the ValidateXml method is called. Errors are displayed in a Console window, and a final Boolean variable is returned to indicate the success or failure of the entire validation operation. Remember that you'll need to import the System.Xml.Schema namespace in order to use this class.

Public Class ConsoleValidator

    ' Set to True if at least one error exists.
    Private Failed As Boolean

    Public Function ValidateXml(ByVal XmlFilename As String, _
      ByVal schemaFilename As String) As Boolean
        ' Create the validator.
        Dim r As New XmlTextReader(XmlFilename)
        Dim Validator As New XmlValidatingReader(r)
        Validator.ValidationType = ValidationType.Schema

        ' Load the schema file into the validator.
        Dim Schemas As New XmlSchemaCollection
        Schemas.Add(Nothing, schemaFilename)
        Validator.Schemas.Add(Schemas)

        ' Set the validation event handler.
        AddHandler Validator.ValidationEventHandler, _
          AddressOf Me.ValidationEventHandler

        Failed = False
        Try
            ' Read all XML data.
            While Validator.Read()
            End While
        Catch Err As XmlException
            ' This happens if the XML document includes illegal characters
            ' or tags that aren't properly nested or closed.
            Console.WriteLine("A critical XML error has occurred.")
            Failed = True
        End Try

        Validator.Close()

        Return Not Failed
    End Function

    Private Sub ValidationEventHandler(ByVal sender As Object, _
      ByVal args As System.Xml.Schema.ValidationEventArgs)
        Failed = True

        ' Display the validation error.
        Console.WriteLine("Validation error: " & args.Message)
    End Sub

End Class

Here's how you would use the class:

Dim ConsoleValidator As New ConsoleValidator
Console.WriteLine("Validating XML file orders.xml with orders.xsd.")

Dim Success As Boolean
Success = ConsoleValidator.ValidateXml("orders.xml", "orders.xsd")

If the document is valid, no messages will appear, and the Success variable will be set to True. But consider what happens if you use a document that breaks schema rules, like the orders_wrong.xml file shown here:

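<?xml version="1.0"?>
<Order id="2003-04-12-4996">
  <Client id="CMPSO33UL">
    <Namely>CompuStation</Namely>
  </Client>
  <Items>
    <Item id="2003">
      <Name>Calculator</Name>
      <Price>twenty-four</Price>
    </Item>
    <Item id="4311">
      <Price>400.75</Price>
      <Name>Laser Printer</Name>
    </Item>
  </Items>
</Order>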

If you attempt to validate this document, the output will indicate each error, and the Success variable will be set to False:

Validation error: Element 'Client' has invalid child element 'Namely'. Expected 'Name'.
Validation error: The 'Namely' element is not declared.
Validation error: The 'Price' element has an invalid value according to its data type.
Validation error: Element 'Item' has invalid child element 'Price'. Expected 'Name'.

If you want to validate an XML document and then process it, you can use XmlValidatingReader to scan a document as it's read into an in-memory XmlDocument. Here's how it works:

Dim Doc As New XmlDocument()
Dim r As New XmlTextReader("orders.xml")
Dim Validator As New XmlValidatingReader(r)

' Load the schema into the validator.
Validator.ValidationType = ValidationType.Schema
Dim Schemas As New XmlSchemaCollection()
Schemas.Add(Nothing, "......orders.xsd")
Validator.Schemas.Add(Schemas)

' Load the document and validate it at the same time.
' Don't handle the ValidationEventHandler event. Instead, allow any errors
' to be thrown as an XmlSchemaException.
Try
    Doc.Load(Validator)
    ' (Validation succeeded if you reach here.)
Catch Err As XmlSchemaException
    ' (Validation failed if you reach here.)
End Try
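If you only need the well-formedness check mentioned earlier, a plain reader is enough. The sketch below deliberately swaps in XmlTextReader rather than XmlValidatingReader: it reads every node and treats an XmlException as a sign that the document is not well formed. The IsWellFormed helper, the module name, and the inline test strings are all hypothetical.

```vbnet
Imports System
Imports System.IO
Imports System.Xml

Public Module WellFormedSketch

    ' Hypothetical helper: returns True if the text parses without error.
    Public Function IsWellFormed(ByVal xml As String) As Boolean
        Dim Reader As New XmlTextReader(New StringReader(xml))
        Try
            ' Read every node; the parser throws on the first syntax error.
            While Reader.Read()
            End While
            Return True
        Catch Err As XmlException
            Return False
        Finally
            Reader.Close()
        End Try
    End Function

    Public Sub Main()
        ' The second string has mismatched tags.
        Console.WriteLine(IsWellFormed("<Order><Client /></Order>"))
        Console.WriteLine(IsWellFormed("<Order><Client></Order>"))
    End Sub

End Module
```

The first call should report True and the second False. No schema rules are checked here, so a well-formed document can still be invalid.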

  Note

Microsoft Visual Studio .NET includes a visual schema designer that allows you to create schema files at design-time using graphical elements. You can also use the command-line utility xsd.exe to quickly create a schema from an XML document, which you can use as a starting point.

Store Binary Data with a Base64 Transform

Problem

You need to store binary data in an XML file.

Solution

Use Convert.ToBase64String to create a string representation of the data that will not contain any illegal characters.

Discussion

XML documents can't contain extended characters, or special characters such as the greater than (>) or less than (<) symbols, which are used to denote elements. However, you can convert binary data into a string representation that is XML-legal by using a Base64 transform.

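In Base64 encoding, each sequence of three bytes is converted to a sequence of four characters. Each Base64-encoded character has one of 64 possible values, drawn from the set {A-Z, a-z, 0-9, +, /}. The equal sign (=) is used only as padding at the end of the encoded text when the data length isn't a multiple of three.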
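The three-to-four expansion is easy to see with a tiny input. This sketch (the module name is hypothetical) encodes three bytes and then a single byte with Convert.ToBase64String.

```vbnet
Imports System

Public Module Base64SizeSketch

    Public Sub Main()
        ' Three bytes (the ASCII codes for "Man") become four characters.
        Dim ThreeBytes() As Byte = {77, 97, 110}
        Console.WriteLine(Convert.ToBase64String(ThreeBytes))   ' TWFu

        ' One byte still yields four characters; "=" pads the final group.
        Dim OneByte() As Byte = {77}
        Console.WriteLine(Convert.ToBase64String(OneByte))      ' TQ==
    End Sub

End Module
```

The encoded text is therefore roughly a third larger than the original binary data.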

Here's an example that creates a new node in the orders.xml for Base64-encoded image data. In order to use this code as written, you must import the System.IO namespace.

Public Module StoreBase64Data

    Public Sub Main()
        ' Load the document.
        Dim Doc As New XmlDocument
        Doc.Load("orders.xml")

        ' Create a new element.
        Dim LogoNode As XmlNode
        LogoNode = Doc.CreateElement("Logo")

        ' Retrieve the picture data.
        ' (A Visual Basic array is declared by its upper bound, so subtract
        ' one to get exactly fs.Length elements.)
        Dim fs As New FileStream("logo.bmp", FileMode.Open)
        Dim LogoBytes(Convert.ToInt32(fs.Length) - 1) As Byte
        fs.Read(LogoBytes, 0, LogoBytes.Length)
        fs.Close()

        ' Encode the picture data and add it as text.
        Dim EncodedText As String = Convert.ToBase64String(LogoBytes)
        LogoNode.AppendChild(Doc.CreateTextNode(EncodedText))

        ' Add the new element.
        Doc.DocumentElement.ChildNodes(0).AppendChild(LogoNode)

        ' Save the document.
        Doc.Save("orders_pic.xml")
        Console.WriteLine("File 'orders_pic.xml' written successfully.")

        Console.ReadLine()
    End Sub

End Module

Here's the resulting (slightly abbreviated) XML document:

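<?xml version="1.0"?>
<Order id="2003-04-12-4996">
  <Client id="CMPSO33UL">
    <Name>CompuStation</Name>
    <Logo>R0lGODlh0wAfALMPAAAAAIAAAACAAICAAAAAgIAAgACAgICAgMDAwP8AAD...</Logo>
  </Client>
  ...
</Order>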

You can use Convert.FromBase64String to retrieve the image data from the XML document.
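A minimal sketch of that reverse step follows. It assumes a small inline document with a Logo element (built in memory from four sample bytes instead of a real image) and a hypothetical module name. Reading orders_pic.xml would work the same way after Doc.Load.

```vbnet
Imports System
Imports System.Xml

Public Module ReadBase64Sketch

    Public Sub Main()
        ' Sample binary data (an assumption); note the bytes for "<" and ">".
        Dim Original() As Byte = {0, 60, 62, 255}

        ' Build a small document that stores the data as Base64 text.
        Dim Doc As New XmlDocument
        Doc.LoadXml("<Client><Logo>" & _
            Convert.ToBase64String(Original) & "</Logo></Client>")

        ' Find the element and decode its text back into bytes.
        Dim LogoNode As XmlNode = Doc.SelectSingleNode("/Client/Logo")
        Dim Decoded() As Byte = Convert.FromBase64String(LogoNode.InnerText)

        Console.WriteLine("Decoded " & Decoded.Length.ToString() & " bytes.")
    End Sub

End Module
```

To recreate the image file, you could write the decoded byte array to disk with a FileStream.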

  Note

Visual Studio .NET uses a Base64 transform to store binary information that's added to a form at design time in the corresponding XML resources file.
