Hack 88. Test Two Values from a Subquery
Testing against a single-column subquery is straightforward. Testing against two columns is harder, but there are still plenty of options.
Suppose you have a table of customers and their orders, as shown in Table 10-15.
Customer | Item | Price |
---|---|---|
Brian | Table | 100 |
Robert | Chair | 20 |
Robert | Carpet | 200 |
Janette | Statue | 300 |
You want to produce a list of every customer and their biggest order, as shown in Table 10-16.
Customer | Item | Price |
---|---|---|
Brian | Table | 100 |
Robert | Carpet | 200 |
Janette | Statue | 300 |
Spotting the highest price per customer is easy. You can use:
mysql> SELECT Customer, MAX(Price) -> FROM custItem -> GROUP BY Customer; +----------+------------+ | Customer | MAX(Price) | +----------+------------+ | Brian | 100 | | Janette | 300 | | Robert | 200 | +----------+------------+
If you want to get back the item or items associated with that price, you can test the (Customer, Price) pair:
mysql> SELECT Customer,Item,Price -> FROM custItem -> WHERE (Customer,Price) IN ( -> SELECT Customer,MAX(Price) FROM custItem -> GROUP BY Customer -> ); +----------+--------+-------+ | Customer | Item | Price | +----------+--------+-------+ | Brian | Table | 100 | | Robert | Carpet | 200 | | Janette | Statue | 300 | +----------+--------+-------+
That works well in MySQL, PostgreSQL, and Oracle, but not in SQL Server.
You can use a correlated subquery in any database:
mysql> SELECT Customer,Item,Price -> FROM custItem cout -> WHERE Price IN ( -> SELECT MAX(Price) FROM custItem cin -> WHERE cin.Customer = cout.Customer -> ); +----------+--------+-------+ | Customer | Item | Price | +----------+--------+-------+ | Brian | Table | 100 | | Robert | Carpet | 200 | | Janette | Statue | 300 | +----------+--------+-------+
You need to refer to the custItem in the outer query from within the subquery. This type of query can be hard for a database system to optimize.
You could use a JOIN of the original table with the maximum prices. It should be possible for the optimizer to make a good job of this:
mysql> SELECT x.Customer,x.Item,x.Price -> FROM custItem x JOIN ( -> SELECT Customer, MAX(Price) AS Price -> FROM custItem -> GROUP BY Customer) y -> ON (x.Customer = y.Customer AND x.Price = y.Price); +----------+--------+-------+ | Customer | Item | Price | +----------+--------+-------+ | Brian | Table | 100 | | Robert | Carpet | 200 | | Janette | Statue | 300 | +----------+--------+-------+
Instead, you can make a subquery return as many columns as you want, as long as you concatenate all the column data together. When you do the concatenation, you should also take care that the meaning of the different columns does not become confused (perhaps by using a separator in the concatenation). As long as the custItem example has no customer names that have numbers in them, no confusion is likely, and thus the query becomes:
mysql> SELECT Customer,Item,Price -> FROM custItem -> WHERE CONCAT(Customer,Price) IN ( -> SELECT CONCAT(Customer,MAX(Price)) FROM custItem -> GROUP BY Customer -> ); +----------+--------+-------+ | Customer | Item | Price | +----------+--------+-------+ | Brian | Table | 100 | | Robert | Carpet | 200 | | Janette | Statue | 300 | +----------+--------+-------+