Beginning XML Databases (Wrox Beginning Guides)
| ||
| ||
|
The XQuery FLWOR statement is a form of a for loop. In programming parlance, a for loop allows repetitive processing of all the items in a collection of items. In other words, a for loop allows the same processing to be executed against every item in a collection.
FLWOR stands for For, Let, Where, Order By, and Return. What this means is that the for loop allows you to loop through multiple collection elements from one element to another. The where clause allows filtering of items to be used in the collection. So, you dont have to process all items in the collection. The Order By clause sorts the resulting collection of items sent back by the return clause. The return clause dictates specific data items that should be returned from a for loop.
This is a doc function in XPath 2.0, which finds all cities with a population of over 10 million people:
doc("demographics.xml")/demographics/region/country/city[population>10000000]
An XQuery equivalent of the preceding doc function would be as follows (you have seen this command previously):
for $i in doc("demographics.xml")/demographics/region/country/city where $i/population>10000000 order by $i/name return $i/name
FLWOR: The Basic for Loop and Return Clause
Thats a bit of a complex for loop to begin with so lets retrogress somewhat and go through each part of a FLWOR expression in detail. Heres a very simple FLWOR expression comprising just a loop, an expression determining values in the collection looped through, and the collection returned:
xquery version "1.0"; <HTML><HEAD><TITLE>Demographics</TITLE></HEAD><BODY> <OL>{ for $n in //language[@name='Swazi'] return <LI>Population is { string($n/male) } males</LI> }</OL></BODY></HTML>
The boldfaced parts in the preceding script are XQuery code; everything else is HTML code used for beautification. The result is shown in Figure 11-6.
At this point I will stop using XQuery commands embedded into HTML scripts for the sake of simplicity with Saxon. Any subsequent use of HTML embedding later in this chapter is merely to beautify, and is not for any specific technical reason. This following script removes the HTML tags:
for $n in //language[@name='Swazi'] return $n/male
Figure 11-7 shows the Saxon shell with the result of executing the preceding script.
The return clause is used to return either whatever the for loop has found in its entirety, or a subset part of the for loop results. In the following example, the for loop finds all country nodes (including all child elements) within all regions . The return clause filters out the for loop results, returning only language nodes within all countries found by the for loop:
for $n in //region/country return $n//year/languages/language
Figure 11-8 shows a partial result, containing language nodes, including child elements within each language element.
I can even add a root node and other nodes to the result shown in Figure 11-8, both within and outside of the return clause:
<lotsoflanguages> { for $n in //region/country return <country> { $n//year/languages/language } </country> } </lotsoflanguages>
Curly braces are used to embed the XQuery code within XML tags.
Figure 11-9 shows a partial result with a root node added called lotsoflanguages.
FLWOR: Adding a where Clause
Now I will add the where clause to the previous for loop. As you see in the text that follows, all I do is move the predicate from the XPath expression to the where clause. The script that follows is the same example used in Figure 11-6, except in this case the for clause predicate has been shifted into a where clause:
xquery version "1.0"; <HTML><HEAD><TITLE>Demographics</TITLE></HEAD><BODY> <OL>{ for $n in //language where $n/@name='Swazi' return <LI>Population is { string($n/male) } males</LI> }</OL></BODY></HTML>
As you can see in Figure 11-10, using the where clause does not change the result from that of Figure 11-6.
Embedded HTML has been used for Figure 11-10 because the result is identical to that of Figure 11-6, which is an HTML-coded example.
As a final note, the where clause can use operators such as and and or in order to test multiple conditions, as in the following pseudocode examples:
where $x=1 and $y=2
and:
where $x=1 or $y=2
FLWOR: Adding an Order By Clause
The order by clause will sort the resulting collection of items sent back by the return clause. The order by will sort after the completion of the read performed by the for loop, not during the read I/O process. In other words, the order by clause resorts the result, but does not change the order in which the for loop accesses collection items.
Lets use a previously used example from Figure 11-3, which already has an order by clause:
for $n in //city[population>0] order by $n/number(population) descending return <city> { concat(string($n/name), string($n/population)) } </city>
The preceding order by clause is descending from the highest population to the lowest . If nothing is specified then the default will be an ascending sort, but ascending can be specified if you want. Then execute the preceding script using this command:
Query -s demographics.xml fig1111.xquery
The result appears in Figure 11-11, which shows the result sorted in order of language, unlike Figure 11-9, which was not sorted at all.
You can also sort on multiple values. The following cities are sorted in ascending alphabetical order with decreasing populations:
for $n in //region/country/city order by round(number($n/population div 1000000)) descending, $n/name ascending return <city> { concat( string(round(number($n/population div 1000000))) ,string(' ') ,string($n/name) ) } </city>
In this example, city populations are divided into millions. As you can see in Figure 11-12, dividing populations into millions ensures that population values are repeated, thus showing sorting of city names within those repeated population numbers .
FLWOR: Declaring Variables with the Let Clause
The let clause will allow you to create a variable and assign a value to it:
let $maxLimit := 20000000 let $highDensity := //city[population > $maxLimit] return $highDensity/name
Obviously a variable can be used later on in the XQuery FLWOR command but the result shown in Figure 11-13 describes all cities with a population of over 20,000,000 inhabitants.
FLWOR: Embedded for Loops and Communication
Multiple for loops can be embedded, one within another, simply by returning the result of one for loop, up into a calling for loop. Obviously, the result returned by an embedded for loop must be a properly formatted XML tree (including a root node). One important point to note is that you can pass the current value of the calling for loop into any child for loops, effectively allowing a one-to-many join. The following example finds all countries within each respective region:
<regions> { for $n in //region return <region name="{$n/name}"> { for $m in $n/country return <country name="{$m/name}"></country> } </region> } </regions>
The result of the preceding script is shown in Figure 11-14.
The following example is an ugly kind of cross-join that I have seen mentioned in many places. Cross-joins are not really efficient, and are rarely used except in data warehouses for very large reports on even bigger amounts of data. XML document sizes are unlikely to get to data warehouse sizes any time soon. However, you can follow one for loop by another (without embedding), as follows:
for $n in ("A", "B") for $m in ("i", "ii", "iii", "iv") return concat($n, $m)
Syntactically you can achieve the same thing with the following XQuery code:
for $n in ("A", "B", "C"), $m in ("i", "ii", "iii", "iv") return concat($n, $m)
A for loop does not have to read a collection from an XML document; it can also read literal values, as shown in the preceding code.
Figure 11-15 shows the result returned by the preceding two queries.
A let clause can also use literal values as a for loop can. In the following example, the let clause iterates between two values:
let $n := (1 to 3) return <counter> { $n } </counter>
The result looks like this:
<?xml version="1.0" encoding="UTF-8"?> <counter>1 2 3</counter>
| ||
| ||
|