Referring to Join Output Column Names in Programs

2017-11-03 09:05:05

12.4.1 Problem

You need to process the result of a join query from within a program, but the column names in the result set aren't unique.

12.4.2 Solution

Use column aliases to assign unique names to each column, or refer to the columns by position.

12.4.3 Discussion

Joins often retrieve columns from similar tables, and it's not unusual for columns selected from different tables to have the same names. Consider again the three-way join between the shirt, tie, and pants tables that was used in Recipe 12.2:

The query uses the table names to qualify each instance of item in the output column list to clarify which table each item comes from. But the column names in the output are not distinct, because MySQL doesn't include table names in the column headings. If you're processing the result of the join from within a program and fetching rows into a data structure that references column values by name, non-unique column names can cause some values to become inaccessible. The following Perl script fragment illustrates the difficulty:

$stmt = qq{ SELECT shirt.item, tie.item, pants.item FROM shirt, tie, pants }; $sth = $dbh->prepare ($stmt); $sth->execute ( ); # Determine the number of columns in result set rows two ways: # - Check the NUM_OF_FIELDS statement handle attribute # - Fetch a row into a hash and see how many keys the hash contains $count1 = $sth->{NUM_OF_FIELDS}; $ref = $sth->fetchrow_hashref ( ); $count2 = keys (%{$ref}); print "The statement is: $stmt "; print "According to NUM_OF_FIELDS, the result set has $count1 columns "; print "The column names are: " . join (",", sort (@{$sth->{NAME}})) . " "; print "According to the row hash size, the result set has $count2 columns "; print "The column names are: " . join (",", sort (keys (%{$ref}))) . " ";

The script issues the wardrobe-selection query, then determines the number of columns in the result, first by checking the NUM_OF_FIELDS attribute, then by fetching a row into a hash and counting the number of hash keys. Executing this script results in the following output:

According to NUM_OF_FIELDS, the result set has 3 columns The column names are: item,item,item According to the row hash size, the result set has 1 columns The column names are: item

There is a problem herethe column counts don't match. The second count is 1 because the non-unique column names cause multiple column values to be mapped onto the same hash element. As a result of these hash key collisions, some of the values are lost. To solve this problem, make the column names unique by supplying aliases. For example, the query can be rewritten from:

SELECT shirt.item, tie.item, pants.item FROM shirt, tie, pants

to:

SELECT shirt.item AS shirt, tie.item AS tie, pants.item AS pants FROM shirt, tie, pants

If you make that change and rerun the script, its output becomes:

According to NUM_OF_FIELDS, the result set has 3 columns The column names are: pants,shirt,tie According to the row hash size, the result set has 3 columns The column names are: pants,shirt,tie

Now the column counts are the same; no values are lost when fetching into a hash.

Another way to address the problem that doesn't require renaming the columns is to fetch the row into something other than a hash. For example, you can fetch the row into an array and refer to the shirt, tie, and pants items as the first through third elements of the array:

while (my @val = $sth->fetchrow_array ( )) { print "shirt: $val[0], tie: $val[1], pants: $val[2] "; }

The name-clash problem may have different solutions in other languages. For example, the problem doesn't occur in quite the same way in Python scripts. If you retrieve a row using a dictionary (Python's closest analog to a Perl hash), the MySQLdb module notices clashing column names and places them in the dictionary using a key consisting of the column name with the table name prepended. Thus, for the following query, the dictionary keys would be item, tie.item, and pants.item:

SELECT shirt.item, tie.item, pants.item FROM shirt, tie, pants

That means column values won't get lost, but it's still necessary to be aware of non-unique names. If you try to refer to column values using just the column names, you won't get the results you expect for those names that are reported with a leading table name. If you use aliases to make each column name unique, the dictionary entries will have the names that you assign.

Категории