Introducing Microsoft® LINQ

Now consider XML nodes from a different point of view: every node set can be thought of as a sequence of nodes and queried with LINQ, just like any other sequence of type IEnumerable<T>. Starting from this premise, every concept we have already seen applied to other sequences in the various LINQ implementations (such as LINQ to Objects, LINQ to Entities, and so forth) can also be applied to XML nodes, because LINQ to XML exposes every collection of nodes as an IEnumerable<T> instance.

For example, we can use the standard query extension methods, already described in Chapter 4, to query XML nodes, too. There are also custom extension methods, defined specifically for sequences of type IEnumerable<X*> (that is, sequences of LINQ to XML types such as XNode, XContainer, or XElement) and declared in the System.Xml.Linq.Extensions class. In this section, we will cover all these methods.

Attribute, Attributes

Each instance of XElement supports a set of methods to access its attributes, as shown here:

public XAttribute Attribute(XName name);
public IEnumerable<XAttribute> Attributes();
public IEnumerable<XAttribute> Attributes(XName name);

As you can see, the first method returns the single XAttribute instance with the given name, if it exists. If it does not exist, the method returns null. The second method returns a sequence of type IEnumerable<XAttribute>, which is useful for LINQ queries and contains all the attributes of an XElement instance. The last method returns a sequence of type IEnumerable<XAttribute> that contains zero or one items, because the attributes of an element are a collection of uniquely named nodes; an element with multiple occurrences of the same attribute name cannot exist.
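To make the three overloads concrete, here is a minimal sketch. The AttributeDemo class, its helper methods, and the sample customer element are hypothetical names introduced only for illustration; the calls to Attribute and Attributes are the ones described above.

```csharp
using System;
using System.Linq;
using System.Xml.Linq;

public static class AttributeDemo
{
    // Hypothetical sample element, modeled on the customer elements used in this chapter.
    public static XElement BuildCustomer()
    {
        return new XElement("customer",
            new XAttribute("name", "Paolo"),
            new XAttribute("city", "Brescia"),
            new XAttribute("country", "Italy"));
    }

    // Attribute(name) returns null for a missing attribute. The explicit
    // conversion from XAttribute to string passes that null through
    // instead of throwing, unlike reading the Value property.
    public static string ReadOrNull(XElement element, XName name)
    {
        return (string)element.Attribute(name);
    }

    public static void Main()
    {
        XElement customer = BuildCustomer();

        Console.WriteLine(customer.Attribute("city").Value);      // Brescia
        Console.WriteLine(customer.Attribute("phone") == null);   // True
        Console.WriteLine(ReadOrNull(customer, "phone") ?? "(none)");

        // Attributes() yields every attribute of the element.
        foreach (XAttribute a in customer.Attributes())
        {
            Console.WriteLine("{0} = {1}", a.Name, a.Value);
        }

        // Attributes(name) yields zero or one item.
        Console.WriteLine(customer.Attributes("phone").Count());  // 0
        Console.WriteLine(customer.Attributes("name").Count());   // 1
    }
}
```

The cast is a convenient way to read optional attributes, because it avoids a separate null check before accessing Value.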

Element, Elements

Every XContainer instance provides methods to select a single element by name or to select sequences of elements, optionally filtered by name (of type XName). Here are their signatures:

public XElement Element(XName name);
public IEnumerable<XElement> Elements();
public IEnumerable<XElement> Elements(XName name);

The Element method iterates over the child nodes of the current XContainer and returns the first XElement whose name matches the XName argument provided. Because the argument is of type XName, you have to provide a valid node name, including its XML namespace URI when you are looking for a qualified element, as shown in Listing 6-22.

Listing 6-22: A sample LINQ to XML query based on the Element method

XNamespace ns = "http://schemas.devleap.com/Customers";
XElement xmlCustomers = new XElement(ns + "customers",
    from c in customers
    where c.Country == Countries.Italy
    select new XElement(ns + "customer",
        new XAttribute("name", c.Name),
        new XAttribute("city", c.City),
        new XAttribute("country", c.Country)));

XElement element = xmlCustomers.Element(ns + "customer");

To get all the customers, we can use the Elements method, as shown in Listing 6-23.

Listing 6-23: Another sample LINQ to XML query based on the Elements method

var elements = xmlCustomers.Elements();
foreach (XElement e in elements) {
    Console.WriteLine(e);
}

Here is the result:

<customer name="Paolo" city="Brescia" country="Italy" />
<customer name="Marco" city="Torino" country="Italy" />

The last overload of the Elements method simply filters child elements by name. When the current XContainer has more than one child element, neither Element nor Elements lets you get a single XElement child without providing a name to filter by. However, you can leverage the First extension method of LINQ to Objects to achieve this goal. Here is an example:

XElement firstElement = xmlCustomers.Elements().First();
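The following sketch illustrates two details that are easy to miss: Element returns null when no child matches (for example, when the namespace is omitted from the name), and First throws on an empty sequence, whereas FirstOrDefault returns null. The ElementDemo class and the order element name are hypothetical, chosen only for this illustration.

```csharp
using System;
using System.Linq;
using System.Xml.Linq;

public static class ElementDemo
{
    public static readonly XNamespace Ns = "http://schemas.devleap.com/Customers";

    // Hypothetical sample data: two namespace-qualified customer elements.
    public static XElement BuildCustomers()
    {
        return new XElement(Ns + "customers",
            new XElement(Ns + "customer", new XAttribute("name", "Paolo")),
            new XElement(Ns + "customer", new XAttribute("name", "Marco")));
    }

    public static void Main()
    {
        XElement customers = BuildCustomers();

        // An unqualified name does not match a namespace-qualified element.
        Console.WriteLine(customers.Element("customer") == null);        // True

        // The qualified name matches, and Element returns the first match.
        XElement first = customers.Element(Ns + "customer");
        Console.WriteLine((string)first.Attribute("name"));              // Paolo

        // First() would throw InvalidOperationException on an empty
        // sequence; FirstOrDefault() returns null instead.
        XElement missing = customers.Elements(Ns + "order").FirstOrDefault();
        Console.WriteLine(missing == null);                              // True

        Console.WriteLine(customers.Elements(Ns + "customer").Count());  // 2
    }
}
```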

Let’s try to leverage what we have just learned with LINQ queries. Imagine that you need to transform a source document into a new schema. Listing 6-24 shows the source document.

Listing 6-24: Source XML with a list of customers

<?xml version="1.0" encoding="utf-8"?>
<customers>
  <customer name="Paolo" city="Brescia" country="Italy" />
  <customer name="Marco" city="Torino" country="Italy" />
  <customer name="James" city="Dallas" country="USA" />
  <customer name="Frank" city="Seattle" country="USA" />
</customers>

Listing 6-25 shows the desired output, where we changed the namespace of the elements, turned the name and city attributes into child elements, and kept only the customers whose country is Italy.

Listing 6-25: Destination XML with a list of customers transformed

<?xml version="1.0" encoding="utf-8"?>
<c:customers xmlns:c="http://schemas.devleap.com/Customers">
  <c:customer>
    <c:name>Paolo</c:name>
    <c:city>Brescia</c:city>
  </c:customer>
  <c:customer>
    <c:name>Marco</c:name>
    <c:city>Torino</c:city>
  </c:customer>
</c:customers>

We could use XSLT code to transform the source into the output. Listing 6-26 provides really simple XSLT to do that.

Listing 6-26: XSLT to transform XML from Listing 6-24 to Listing 6-25

<?xml version="1.0" encoding="UTF-8" ?>
<xsl:stylesheet version="1.0"
    xmlns:xsl="http://www.w3.org/1999/XSL/Transform"
    xmlns:c="http://schemas.devleap.com/Customers">
  <xsl:template match="customers">
    <c:customers>
      <xsl:for-each select="customer[@country = 'Italy']">
        <c:customer>
          <c:name><xsl:value-of select="@name"/></c:name>
          <c:city><xsl:value-of select="@city"/></c:city>
        </c:customer>
      </xsl:for-each>
    </c:customers>
  </xsl:template>
</xsl:stylesheet>

Nevertheless, if we are already in .NET code, we can avoid exiting from our code context and instead use a simple LINQ query like the one in Listing 6-27.

Listing 6-27: A functional construction used to transform XML from Listing 6-24 to Listing 6-25

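XNamespace ns = "http://schemas.devleap.com/Customers";

// The name and city elements receive the attribute's Value: an XAttribute
// passed as content would be copied onto the new element as an attribute
// instead of being written as its text.
XElement destinationXmlCustomers = new XElement(ns + "customers",
    new XAttribute(XNamespace.Xmlns + "c", ns),
    from c in sourceXmlCustomers.Elements("customer")
    where c.Attribute("country").Value == "Italy"
    select new XElement(ns + "customer",
        new XElement(ns + "name", c.Attribute("name").Value),
        new XElement(ns + "city", c.Attribute("city").Value)));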

We personally like and appreciate XSLT features and their strong syntax, but using them requires learning another query language. We understand that many developers are not familiar with XSLT syntax and will probably prefer the LINQ solution, which is easier for a .NET developer to write and is also type-checked by the compiler. Finally, you can consider the Visual Basic 9.0 version of this code, shown in Listing 6-28.

Listing 6-28: A Visual Basic 9.0 XML literal used to transform XML from Listing 6-24 to Listing 6-25

Dim destinationXmlCustomers = _
    <c:customers xmlns:c="http://schemas.devleap.com/Customers">
        <%= From c In sourceXmlCustomers.<customers>.<customer> _
            Where (c.@country = "Italy") _
            Select _
            <c:customer xmlns:c="http://schemas.devleap.com/Customers">
                <c:name><%= c.@name %></c:name>
                <c:city><%= c.@city %></c:city>
            </c:customer> %>
    </c:customers>

This approach is probably the quickest to write and the easiest to understand, because you can think directly about the output XML. We can make it even easier by using global XML namespaces. It is important to notice the syntax used to select elements and attributes from the source XML document. We use a special Visual Basic 9.0 syntax that you already saw in Chapter 3. The syntax recalls XPath node selection. As you can see, we select all the element nodes named customer, which are children of the customers element within sourceXmlCustomers, by using the following syntax:

sourceXmlCustomers.<customers>.<customer>

The Visual Basic 9.0 compiler, as with XML literals, converts the syntax into a standard LINQ to XML invocation of Elements methods. In the same way, the syntax used to select attributes named name and city (c.@name and c.@city) recalls XPath attribute selection rules and is converted into calls of the Attribute method of the XElement type.

Sometimes XML schemas support optional elements or optional attributes. When we define transformations using LINQ to XML, we work at a higher level and use object instances rather than nodes. In cases where we define an XElement, using functional construction, and give it a null value as its content, the result is an empty closed element, like the one shown in the following example:

// Where c.City == null
XElement city = new XElement("customer",
    new XAttribute("id", c.IdCustomer),
    new XElement("city", c.City));

The result is an empty tag: <city />, as shown here:

<customer ><city /></customer>

In cases where we need to omit the element declaration when its value is null, we can use the conditional operator, as shown in the following sample:

// Where c.City == null
XElement city = new XElement("customer",
    new XAttribute("id", c.IdCustomer),
    c.City != null ? new XElement("city", c.City) : null);

Whenever we add null content to an XContainer, it is skipped without throwing any kind of exception.
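A short, self-contained sketch of this behavior follows. The OptionalContentDemo class and its BuildCustomer helper are hypothetical; they stand in for the customer objects used in the preceding samples.

```csharp
using System;
using System.Xml.Linq;

public static class OptionalContentDemo
{
    // Hypothetical helper: city may be null, in which case the
    // city element is left out of the result entirely.
    public static XElement BuildCustomer(int id, string city)
    {
        return new XElement("customer",
            new XAttribute("id", id),
            city != null ? new XElement("city", city) : null);
    }

    public static void Main()
    {
        // DisableFormatting keeps each element on a single line.
        Console.WriteLine(
            BuildCustomer(1, "Brescia").ToString(SaveOptions.DisableFormatting));
        // <customer id="1"><city>Brescia</city></customer>

        Console.WriteLine(
            BuildCustomer(2, null).ToString(SaveOptions.DisableFormatting));
        // <customer id="2" />
    }
}
```

Because null content is simply ignored, the conditional expression can sit directly in the argument list without any extra guard around the constructor call.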

XNode Selection Methods

The XNode class provides some methods that are useful for selecting elements and nodes related to the current node itself. For instance, the ElementsBeforeSelf and ElementsAfterSelf methods both return a sequence of type IEnumerable<XElement> that contains the elements before or after the current node, respectively. They both provide an overload with a parameter of type XName to filter elements by name.

In addition, NodesBeforeSelf and NodesAfterSelf methods return a sequence of type IEnumerable<XNode> that contains all the nodes, regardless of their node type, before or after the current one.
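As an illustration, the sketch below selects the siblings of one customer element. The SiblingDemo class, its helpers, and the sample data (which includes a comment node to show the difference between elements and nodes) are hypothetical.

```csharp
using System;
using System.Collections.Generic;
using System.Linq;
using System.Xml.Linq;

public static class SiblingDemo
{
    // Hypothetical sample data: four customers preceded by a comment node.
    public static XElement LoadCustomers()
    {
        return XElement.Parse(
            "<customers>" +
            "<!-- imported list -->" +
            "<customer name='Paolo' />" +
            "<customer name='Marco' />" +
            "<customer name='James' />" +
            "<customer name='Frank' />" +
            "</customers>");
    }

    public static XElement FindByName(XElement customers, string name)
    {
        return customers.Elements("customer")
            .First(c => (string)c.Attribute("name") == name);
    }

    // Joins the name attributes of a sequence of customer elements.
    public static string Names(IEnumerable<XElement> elements)
    {
        return string.Join(", ",
            elements.Select(e => (string)e.Attribute("name")).ToArray());
    }

    public static void Main()
    {
        XElement james = FindByName(LoadCustomers(), "James");

        Console.WriteLine(Names(james.ElementsBeforeSelf()));  // Paolo, Marco
        Console.WriteLine(Names(james.ElementsAfterSelf()));   // Frank

        // NodesBeforeSelf also counts the comment node.
        Console.WriteLine(james.NodesBeforeSelf().Count());    // 3
    }
}
```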

Similarities Between XPath Axes and Extension Methods

The System.Xml.Linq.Extensions class defines extension methods that recall the XPath axes. The first two that we will consider are Ancestors and Descendants, which return an IEnumerable<XElement> sequence of elements for each node in the source sequence. Descendants returns all the elements nested inside the current node, regardless of their depth in the graph. Ancestors is complementary to Descendants and returns all the elements that enclose the current node, from its parent up to the root. Both are shown here:

public static IEnumerable<XElement> Ancestors<T>(
    this IEnumerable<T> source) where T: XNode;
public static IEnumerable<XElement> Ancestors<T>(
    this IEnumerable<T> source, XName name) where T: XNode;
public static IEnumerable<XElement> Descendants<T>(
    this IEnumerable<T> source) where T: XContainer;
public static IEnumerable<XElement> Descendants<T>(
    this IEnumerable<T> source, XName name) where T: XContainer;

These methods are useful for querying an XML source to find a particular element below or above the current one, regardless of its depth in the graph. Consider the XML document in Listing 6-29.

Listing 6-29: An XML instance to search with LINQ to XML

<?xml version="1.0" encoding="ibm850"?>
<customers>
  <customer>
    <name>Paolo</name>
    <city>Brescia</city>
    <country>Italy</country>
  </customer>
  <customer>
    <name>Marco</name>
    <city>Torino</city>
    <country>Italy</country>
  </customer>
</customers>

The following line of code returns 8 as the number of descendant elements of an XML document like the one in Listing 6-29:

Console.WriteLine(xmlCustomers.Descendants().Count());

The descendant elements are as follows: two <customer /> elements, two <name /> elements, two <city /> elements, and two <country /> elements.

Two other extension methods that work like the previous ones are AncestorsAndSelf and DescendantsAndSelf. They both act like the previously seen methods but also return the current element. As with the XPath axes, taking the union of the results of Ancestors and DescendantsAndSelf (or of AncestorsAndSelf and Descendants) yields the current element together with every element that encloses it and every element nested inside it.

If you need to select all the descendant nodes rather than only the elements, you can use methods such as DescendantNodes of XContainer or DescendantNodesAndSelf of XElement, which return all descendant nodes regardless of their node types; DescendantNodesAndSelf also includes the node itself. There is also a Nodes extension method, which returns all child nodes of each XContainer in a sequence, again regardless of their node types.
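The following sketch runs these axis methods against the content of Listing 6-29. The AxesDemo class and its helpers are hypothetical, and the document is parsed from a string without insignificant white space, so the node counts include only elements and their text.

```csharp
using System;
using System.Collections.Generic;
using System.Linq;
using System.Xml.Linq;

public static class AxesDemo
{
    // Same content as Listing 6-29, parsed into its root element.
    public static XElement LoadCustomers()
    {
        return XElement.Parse(
            "<customers>" +
            "<customer><name>Paolo</name><city>Brescia</city>" +
            "<country>Italy</country></customer>" +
            "<customer><name>Marco</name><city>Torino</city>" +
            "<country>Italy</country></customer>" +
            "</customers>");
    }

    // Joins the local names of a sequence of elements.
    public static string Names(IEnumerable<XElement> elements)
    {
        return string.Join(", ",
            elements.Select(e => e.Name.LocalName).ToArray());
    }

    public static void Main()
    {
        XElement customers = LoadCustomers();

        Console.WriteLine(customers.Descendants().Count());         // 8
        Console.WriteLine(customers.Descendants("city").Count());   // 2

        // 8 elements plus the 6 text nodes they contain.
        Console.WriteLine(customers.DescendantNodes().Count());     // 14

        // Ancestors walks from the parent up to the root.
        XElement firstCity = customers.Descendants("city").First();
        Console.WriteLine(Names(firstCity.Ancestors()));
        // customer, customers
        Console.WriteLine(Names(firstCity.AncestorsAndSelf()));
        // city, customer, customers

        // The extension-method form applies to a whole sequence.
        Console.WriteLine(
            customers.Elements("customer").Descendants("name").Count()); // 2
    }
}
```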

InDocumentOrder

One last extension method that needs to be explained is the InDocumentOrder method. It orders an IEnumerable<XNode> sequence of nodes related to the same XDocument using the previously seen XNodeDocumentOrderComparer class, which bases its behavior on the CompareDocumentOrder method. This extension method is very useful whenever you want to select nodes ordered on the basis of their order of occurrence in a document.

In the following example, you can see how to use it:

foreach (XNode a in xmlCustomers.DescendantsAndSelf().InDocumentOrder()) {
    Console.WriteLine(a);
}

The result of this sample code is the full list of elements declared within our xmlCustomers document, ordered by declaration.
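The effect of InDocumentOrder is easier to see on a sequence that is not already sorted. The sketch below, built around a hypothetical DocumentOrderDemo class and a small sample document, reverses a sequence of elements and then restores document order.

```csharp
using System;
using System.Collections.Generic;
using System.Linq;
using System.Xml.Linq;

public static class DocumentOrderDemo
{
    // Hypothetical sample data: one customer with two child elements.
    public static XElement LoadCustomers()
    {
        return XElement.Parse(
            "<customers><customer>" +
            "<name>Paolo</name><city>Brescia</city>" +
            "</customer></customers>");
    }

    // Joins the local names of a sequence of elements.
    public static string Names(IEnumerable<XElement> elements)
    {
        return string.Join(", ",
            elements.Select(e => e.Name.LocalName).ToArray());
    }

    public static void Main()
    {
        XElement customers = LoadCustomers();

        // Deliberately scramble the order of the descendant elements.
        List<XElement> reversed = customers.Descendants().Reverse().ToList();
        Console.WriteLine(Names(reversed));                    // city, name, customer

        // InDocumentOrder sorts them back by position in the document.
        Console.WriteLine(Names(reversed.InDocumentOrder()));  // customer, name, city
    }
}
```

This is mostly useful after operations such as Union or Concat, which can combine nodes from several queries in an order unrelated to the document.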
