Applying Table Structure Information
9.9.1 Problem
It's all well and good to be able to obtain table structure information, but what can you use it for?
9.9.2 Solution
Lots of things: displaying lists of table columns, creating web form elements, producing ALTER TABLE statements for modifying ENUM or SET columns, and more.
9.9.3 Discussion
This section describes some applications for the table structure information that MySQL provides.
9.9.4 Displaying Column Lists
Probably the simplest use of table information is to present a list of the table's columns. This is common in web-based or GUI applications that allow users to construct queries interactively by selecting a table column from a list and entering a value against which to compare column values. The various versions of the get_column_names_with_show( ) or get_column_names_with_meta( ) functions shown earlier in the chapter can serve as the basis for such list displays.
9.9.5 Interactive Record Editing
Knowledge of a table's structure can be very useful for interactive record-editing applications. Suppose you have an application that retrieves a record from the database, displays a form containing the record's content so a user can edit it, and then updates the record in the database after the user modifies the form and submits it. You can use table structure information for validating column values. For example, if a column is an ENUM, you can find out the valid enumeration values and check the value submitted by the user against them to determine whether or not it's legal. If the column is an integer type, you check the submitted value to make sure that it consists entirely of digits, possibly preceded by a sign character. If the column contains dates, look for a legal date format.
But what if the user leaves a field empty? If the field corresponds to, say, a CHAR column in the table, do you set the column value to NULL or to the empty string? This too is a question that can be answered by checking the table's structure. Determine whether or not the column can contain NULL values. If it can, set the column to NULL; otherwise, set it to the empty string.
9.9.6 Mapping Column Types onto Web Page Elements
Some column types such as ENUM and SET correspond naturally to elements of web forms:
- An ENUM has a fixed set of values from which you choose a single value. This is analogous to a group of radio buttons, a pop-up menu, or a single-pick scrolling list.
- A SET column is similar, except that you can select multiple values; this corresponds to a group of checkboxes or a multiple-pick scrolling list.
If you access the information for these types of columns using SHOW COLUMNS, you can easily determine the legal values for a column and map them onto the appropriate form element automatically. This allows you to present users with a list of applicable values from which selections can be made easily without any typing. Earlier in this chapter we saw how to get ENUM and SET column metadata. The methods developed there are used in Chapter 18, which discusses form generation in more detail.
9.9.7 Adding Elements to ENUM or SET Column Definitions
It's really a pain to add a new element to an ENUM or SET column definition when you use ALTER TABLE, because you have to list not only the new element, but all the existing elements as well. One approach for doing this using mysqldump and an editor is described in Recipe 8.3. Another way to accomplish this task is to write your own program that does most of the work for you by using column metadata. Let's develop a Python script add_element.py that generates the appropriate ALTER TABLE statement automatically when given a table name, an ENUM or SET column name, and the new element value. Suppose you want to add "hot pink" to the colors column of the item table. The current structure of the column looks like this:
mysql> SHOW COLUMNS FROM item LIKE 'colors'G *************************** 1. row *************************** Field: colors Type: set('chartreuse','mauve','lime green','puce') Null: YES Key: Default: puce Extra:
add_element.py will use that information to figure out the correct ALTER TABLE statement and write it out:
% ./add_element.py item colors "hot pink" ALTER TABLE item MODIFY colors set('chartreuse','mauve','lime green','puce','hot pink') NULL DEFAULT 'puce';
By having add_element.py produce the statement as its output, you have the choice of shoving it into mysql for immediate execution or saving the output into a file:
% ./add_element.py item colors "hot pink" | mysql cookbook % ./add_element.py item colors "hot pink" > stmt.sql
You might choose the latter course if you want the new element somewhere other than at the end of the list of values, which is where add_element.py will put it. In this case, edit stmt.sql to place the element where you want it, then execute the statement:
% vi stmt.sql % mysql cookbook < stmt.sql
The first part of the add_element.py script imports the requisite modules and checks the command-line arguments. This is fairly straightforward:
#! /usr/bin/python # add_element.py - show ALTER TABLE statement for ENUM or SET column # (assumes cookbook database) import sys sys.path.insert (0, "/usr/local/apache/lib/python") import re import MySQLdb import Cookbook if len (sys.argv) != 4: print "Usage: add_element.py tbl_name col_name new_element" sys.exit (1) tbl_name = sys.argv[1] col_name = sys.argv[2] elt_val = sys.argv[3]
After connecting to the MySQL server (code not shown), we need to run a SHOW COLUMNS query to retrieve information about the designated column. The following code does this, checking to make sure that the column really exists in the table:
cursor = conn.cursor ( ) # escape SQL pattern characters in column name to match it literally esc_col_name = re.sub (r'([%_])', r'\1', col_name) # this is *not* a use of placeholders cursor.execute ("SHOW COLUMNS FROM %s LIKE '%s'" % (tbl_name, esc_col_name)) info = cursor.fetchone ( ) cursor.close if info == None: print "Could not retrieve information for table %s, column %s" % (tbl_name, col_name) sys.exit (1)
At this point, if the SHOW COLUMNS statement succeeded, the information produced by it is available as a tuple stored in the info variable. We'll need to use several elements from this tuple. The most important is the column type value, which provides the enum(...) or set(...) string containing the column's current definition. We can use this to verify that the column really is an ENUM or SET, then add the new element to the string just before the closing parenthesis. For the colors column, we want to change this:
set('chartreuse','mauve','lime green','puce')
To this:
set('chartreuse','mauve','lime green','puce','hot pink')
It's also necessary to check whether column values can be NULL and what the default value is so that the program can add the appropriate information to the ALTER TABLE statement. The code that does all this is as follows:
# get column type string; make sure it begins with ENUM or SET type = info[1] if not re.match ('(enum|set)', type): print "table %s, column %s is not an ENUM or SET" % (tbl_name, col_name) sys.exit(1) # add quotes, insert comma and new element just before closing paren elt_val = conn.literal (elt_val) type = re.sub (')$', ',' + elt_val + ')', type) # determine whether column can contain NULL values if info[2] == "YES": nullable = "NULL" else: nullable = "NOT NULL"; # construct DEFAULT clause (add surrounding quotes unless # value is NULL) default = "DEFAULT " + conn.literal (info[4]) print "ALTER TABLE %s MODIFY %s %s %s %s;" % (tbl_name, col_name, type, nullable, default)
That's it. You now have a working ENUM- or SET-altering program. Still, add_element.py is fairly basic and could be improved in various ways:
- Make sure that the element value you're adding to the column isn't already there.
- Allow add_element.py to take more than one argument after the column name and add all of them to the column definition at the same time.
- Add an option to indicate that the named element should be deleted rather than added.
- Add an option that tells the script to execute the ALTER TABLE statement immediately rather than displaying it.
- If you have a version of MySQL older than 3.22.16, it won't understand the MODIFY col_name syntax used by add_element.py. You may want to edit the script to use CHANGE col_name syntax instead. The following two statements are equivalent:
ALTER TABLE tbl_name MODIFY col_name col_definition; ALTER TABLE tbl_name CHANGE col_name col_name col_definition;
add_element.py uses MODIFY because it's less confusing than CHANGE.
9.9.8 Retrieving Dates in Non-ISO Format
MySQL stores dates in ISO 8601 format (CCYY-MM-DD), but it's often desirable or necessary to rewrite date values, such as when you need to transfer table data into another program that expects dates in another format. You can write a script that retrieves and prints table rows, using column metadata to detect DATE, DATETIME, and TIMESTAMP columns, and reformat them with DATE_FORMAT( ) into whatever date format you want. (For an example, see Recipe 10.34, which describes a short script named iso_to_us.pl that uses this technique to rewrite ISO dates into U.S. format.)
9.9.9 Converting Character Columns Between Fixed-Length and Variable-Length Types
CHAR columns have a fixed length, whereas VARCHAR columns are variable length. In general, tables that use CHAR columns can be processed more quickly but take up more space than tables that use VARCHAR columns. To make it easier to convert tables to use CHAR or VARCHAR columns, you can use the information provided by SHOW COLUMNS to generate an ALTER TABLE statement that performs the requisite column conversions. Here is a Python function alter_to_char( ) that creates a statement for changing all the VARCHAR columns to CHAR:
def alter_to_char (conn, tbl_name): cursor = conn.cursor ( ) cursor.execute ("SHOW COLUMNS FROM " + tbl_name) rows = cursor.fetchall ( ) cursor.close ( ) str = "" for info in rows: col_name = info[0] type = info[1] if re.match ('varchar', type): # it's a VARCHAR column type = re.sub ("var", "", type) # convert to CHAR # determine whether column can contain NULL values if info[2] == "YES": nullable = "NULL" else: nullable = "NOT NULL"; # construct DEFAULT clause (add surrounding quotes unless # value is NULL) default = "DEFAULT " + conn.literal (info[4]) # add MODIFY clause to string if str != "": str = str + ", " str = str + "MODIFY %s %s %s %s" % (col_name, type, nullable, default) cursor.close ( ) if str == "": return None return "ALTER TABLE " + tbl_name + " " + str
Suppose you have a table that looks like this:
CREATE TABLE chartbl ( c1 VARCHAR(10), c2 VARCHAR(10) BINARY, c3 VARCHAR(10) NOT NULL DEFAULT 'abc'def' );
If you pass the name of that table to the alter_to_varchar( ) function, the statement that it returns looks like this:
ALTER TABLE chartbl MODIFY c1 char(10) NULL DEFAULT NULL, MODIFY c2 char(10) binary NULL DEFAULT NULL, MODIFY c3 char(10) NOT NULL DEFAULT 'abc'def'
A function to convert columns in the other direction (from CHAR to VARCHAR) would be similar. Here is an example, this time in Perl:
sub alter_to_varchar { my ($dbh, $tbl_name) = @_; my ($sth, $str); $sth = $dbh->prepare ("SHOW COLUMNS FROM $tbl_name"); $sth->execute ( ); while (my @row = $sth->fetchrow_array ( )) { if ($row[1] =~ /^char/) # it's a CHAR column { $row[1] = "var" . $row[1]; $str .= ", " if $str; $str .= "MODIFY $row[0] $row[1]"; $str .= ($row[2] eq "YES" ? "" : " NOT") . " NULL"; $str .= " DEFAULT " . $dbh->quote ($row[4]); } } $str = "ALTER TABLE $tbl_name $str" if $str; return ($str); }
For completeness, the function generates an ALTER TABLE statement that explicitly converts all CHAR columns to VARCHAR. In practice, it's necessary to convert only one such column. MySQL notices the change of a column from fixed-length to variable-length format, and automatically converts any other fixed-length columns that have a variable-length equivalent.
9.9.10 Selecting All Except Certain Columns
Sometimes you want to retrieve "almost all" the columns from a table. Suppose you have an image table that contains a BLOB column named data used for storing images that might be very large, and other columns that characterize the BLOB column, such as its ID, a description, and so forth. It's easy to write a SELECT * query that retrieves all the columns, but if all you need is the descriptive information about the images and not the images themselves, it's inefficient to drag the BLOB values over the connection along with the other columns. Instead, you want to select everything in the record except the data column.
Unfortunately, there is no way to say directly in SQL, "select all columns except this one." You must explicitly name all the columns except data. On the other hand, it's easy to construct that kind of query by using table structure information. Extract the list of column names, delete the one to be excluded, then construct a SELECT query from those columns that remain. The following example shows how to do this in PHP, using the get_column_names_with_show( ) function developed earlier in the chapter to obtain the column names from a table:
$names = get_column_names_with_show ($conn_id, $tbl_name); $query = ""; # construct list of columns to select: all but "data" reset ($names); while (list ($index, $name) = each ($names)) { if ($name == "data") continue; if ($query != "") # put commas between column names $query .= ","; $query .= $name; } $query = "SELECT $query FROM $tbl_name";
The equivalent Perl code for constructing the query is a bit shorter (and correspondingly more cryptic):
my @names = get_column_names_with_show ($dbh, $tbl_name); my $query = "SELECT " . join (",", grep (!/^data$/, @names)) . " FROM $tbl_name";
Whichever language you use, the result is a query that you can use to select all columns but data. It will be more efficient than SELECT * because it won't pull the BLOB values over the network. Of course, this process does involve an extra round trip to the server to execute the statement that retrieves the column names, so you should consider the context in which you plan to use the SELECT query. If you're just going to retrieve a single record, it might be more efficient simply to select the entire row than to incur the overhead of the extra round trip. But if you're selecting many rows, the reduction in network traffic achieved by skipping the BLOB columns will be worth the overhead of the additional query for getting table structure.