Finding Values Associated with Minimum and Maximum Values
7.6.1 Problem
You want to know the values for other columns in the row containing the minimum or maximum value.
7.6.2 Solution
Use two queries and a SQL variable. Or use the "MAX-CONCAT trick." Or use a join.
7.6.3 Discussion
MIN( ) and MAX( ) find the endpoints of a range of values, but sometimes when finding a minimum or maximum value, you're also interested in other values from the row in which the value occurs. For example, you can find the largest state population like this:
mysql> SELECT MAX(pop) FROM states;
+----------+
| MAX(pop) |
+----------+
| 29760021 |
+----------+
But that doesn't show you which state has this population. The obvious way to try to get that information is like this:
mysql> SELECT name, MAX(pop) FROM states WHERE pop = MAX(pop);
ERROR 1111 at line 1: Invalid use of group function
Probably everyone attempts something like that sooner or later, but it doesn't work, because aggregate functions like MIN( ) and MAX( ) cannot be used in WHERE clauses. The intent of the statement is to determine which record has the maximum population value, then display the associated state name. The problem is that while you and I know perfectly well what we'd mean by writing such a thing, it makes no sense at all to MySQL. The query fails because MySQL uses the WHERE clause to determine which records to select, but it knows the value of an aggregate function only after selecting the records from which the function's value is determined! So, in a sense, the statement is self-contradictory.

You could solve this problem using a subselect, except that MySQL won't have those until Version 4.1. Meanwhile, you can use a two-stage approach involving one query that selects the maximum population into a SQL variable, and another that refers to the variable in its WHERE clause:
mysql> SELECT @max := MAX(pop) FROM states;
mysql> SELECT @max AS 'highest population', name FROM states WHERE pop = @max;
+--------------------+------------+
| highest population | name       |
+--------------------+------------+
|           29760021 | California |
+--------------------+------------+
This technique works even when the minimum or maximum value isn't stored in the row itself, but is only derived from it. If you want to know the length of the shortest verse in the King James Version, that's easy to find:
mysql> SELECT MIN(LENGTH(vtext)) FROM kjv;
+--------------------+
| MIN(LENGTH(vtext)) |
+--------------------+
|                 11 |
+--------------------+
If you want to ask "What verse is that?", do this instead:
mysql> SELECT @min := MIN(LENGTH(vtext)) FROM kjv;
mysql> SELECT bname, cnum, vnum, vtext FROM kjv WHERE LENGTH(vtext) = @min;
+-------+------+------+-------------+
| bname | cnum | vnum | vtext       |
+-------+------+------+-------------+
| John  |   11 |   35 | Jesus wept. |
+-------+------+------+-------------+
Another technique you can use for finding values associated with minima or maxima is found in the MySQL Reference Manual, where it's called the "MAX-CONCAT trick." It's pretty gruesome, but can be useful if your version of MySQL precedes the introduction of SQL variables. The technique involves appending a column to the summary column using CONCAT( ), finding the maximum of the resulting values using MAX( ), and extracting the non-summarized part of the value from the result.

For example, to find the name of the state with the largest population, you can select the maximum combined value of the pop and name columns, then extract the name part from it. It's easiest to see how this works by proceeding in stages. First, determine the maximum population value to find out how wide it is:
mysql> SELECT MAX(pop) FROM states;
+----------+
| MAX(pop) |
+----------+
| 29760021 |
+----------+
That's eight characters. It's important to know this, because each column within the combined population-plus-name values should occur at a fixed position so that the state name can be extracted reliably later. (By padding the pop column to a length of eight, the name values will all begin at the ninth character.)
However, we must be careful how we pad the populations. The values produced by CONCAT( ) are strings, so the population-plus-name values will be treated as such by MAX( ) for sorting purposes. If we left justify the pop values by padding them on the right with RPAD( ), we'll get combined values like the following:
mysql> SELECT CONCAT(RPAD(pop,8,' '),name) FROM states;
+------------------------------+
| CONCAT(RPAD(pop,8,' '),name) |
+------------------------------+
| 4040587 Alabama              |
| 550043  Alaska               |
| 3665228 Arizona              |
| 2350725 Arkansas             |
...
Those values will sort lexically. That's okay for finding the largest of a set of string values with MAX( ). But pop values are numbers, so we want the values in numeric order. To make the lexical ordering correspond to the numeric ordering, we must right justify the population values by padding on the left with LPAD( ):
mysql> SELECT CONCAT(LPAD(pop,8,' '),name) FROM states;
+------------------------------+
| CONCAT(LPAD(pop,8,' '),name) |
+------------------------------+
|  4040587Alabama              |
|   550043Alaska               |
|  3665228Arizona              |
|  2350725Arkansas             |
...
Next, use the CONCAT( ) expression with MAX( ) to find the value with the largest population part:
mysql> SELECT MAX(CONCAT(LPAD(pop,8,' '),name)) FROM states;
+-----------------------------------+
| MAX(CONCAT(LPAD(pop,8,' '),name)) |
+-----------------------------------+
| 29760021California                |
+-----------------------------------+
To obtain the final result (the state name associated with the maximum population), extract from the maximum combined value the substring that begins with the ninth character:
mysql> SELECT SUBSTRING(MAX(CONCAT(LPAD(pop,8,' '),name)),9) FROM states;
+------------------------------------------------+
| SUBSTRING(MAX(CONCAT(LPAD(pop,8,' '),name)),9) |
+------------------------------------------------+
| California                                     |
+------------------------------------------------+
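To see why the direction of padding matters, you can run the full expression both ways. The following is only a minimal sketch, not part of the original recipe: it assumes a hypothetical scratch table named pad_demo, loaded with five of the population values shown in the listings above, so that it can be tried without the full states table.

```sql
-- Hypothetical scratch table (pad_demo is an illustrative name, not from
-- the recipe), holding five of the rows shown in the listings above.
CREATE TEMPORARY TABLE pad_demo (name VARCHAR(30), pop INT);
INSERT INTO pad_demo (name, pop) VALUES
  ('Alabama',     4040587),
  ('Alaska',       550043),
  ('Arizona',     3665228),
  ('Arkansas',    2350725),
  ('California', 29760021);

-- Right-padding: strings are compared from the leading digit, so a value
-- beginning with '5' outranks one beginning with '2', regardless of length.
SELECT SUBSTRING(MAX(CONCAT(RPAD(pop,8,' '),name)),9) AS rpad_result
FROM pad_demo;

-- Left-padding: digits line up by place value, so the lexical maximum
-- is also the numeric maximum.
SELECT SUBSTRING(MAX(CONCAT(LPAD(pop,8,' '),name)),9) AS lpad_result
FROM pad_demo;
```

With these five rows, the RPAD( ) version should name Alaska (its combined value starts with 5, the highest leading digit), whereas the LPAD( ) version should name California, the state that actually has the largest population.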
Clearly, using a SQL variable to hold an intermediate result is much easier. In this case, it's also more efficient because it avoids the overhead of concatenating column values for comparison and decomposing the result for display.
Yet another way to select other columns from rows containing a minimum or maximum value is to use a join. Select the value into another table, then join it to the original table to select the row that matches the value. To find the record for the state with the highest population, use a join like this:
mysql> CREATE TEMPORARY TABLE t
    -> SELECT MAX(pop) as maxpop FROM states;
mysql> SELECT states.* FROM states, t WHERE states.pop = t.maxpop;
+------------+--------+------------+----------+
| name       | abbrev | statehood  | pop      |
+------------+--------+------------+----------+
| California | CA     | 1850-09-09 | 29760021 |
+------------+--------+------------+----------+
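On a server that does support subselects (MySQL 4.1 or later, as noted earlier in this section), the lookup can be written as a single statement. The following is a sketch under that assumption; it relies on the same states and kjv tables used throughout this recipe and has not been tuned for any particular server version.

```sql
-- Assumes subquery support (MySQL 4.1 or later).
-- The inner SELECT computes the maximum; the outer WHERE then compares
-- each row's pop value against that single result.
SELECT name, pop FROM states
WHERE pop = (SELECT MAX(pop) FROM states);

-- The same shape works for a derived value, such as the shortest verse.
SELECT bname, cnum, vnum, vtext FROM kjv
WHERE LENGTH(vtext) = (SELECT MIN(LENGTH(vtext)) FROM kjv);
```

Like the SQL variable and join approaches, this form returns every row that ties for the minimum or maximum, whereas the MAX-CONCAT trick yields only a single value.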
7.6.4 See Also
For more information about joins, see Chapter 12.