Producing Master-Detail Lists and Summaries

12.9.1 Problem

Two related tables have a master-detail relationship and you want to produce a list that shows each master record with its detail records, or a list that summarizes the detail records for each master record.

12.9.2 Solution

The solution to this problem involves a join, but the type of join depends on the question you want answered. To produce a list containing only master records for which some detail record exists, use a regular join based on the primary key in the master table. To produce a list that includes entries for all master records, even those that have no detail records, use a LEFT JOIN.

12.9.3 Discussion

It's often useful to produce a list from two related tables. For tables that have a master-detail or parent-child relationship, a given record in one table might be matched by several records in the other. This section shows some questions of this type that you can ask (and answer), using the artist and painting tables from earlier in the chapter.

One form of master-detail question for these tables is, "Which artist painted each painting?" This is a simple join that matches each painting record to its corresponding artist record based on the artist ID values:

mysql> SELECT artist.name, painting.title
    -> FROM artist, painting WHERE artist.a_id = painting.a_id
    -> ORDER BY 1, 2;
+----------+-------------------+
| name     | title             |
+----------+-------------------+
| Da Vinci | The Last Supper   |
| Da Vinci | The Mona Lisa     |
| Renoir   | Les Deux Soeurs   |
| Van Gogh | Starry Night      |
| Van Gogh | The Potato Eaters |
| Van Gogh | The Rocks         |
+----------+-------------------+

That type of join suffices, as long as you want to list only master records that have detail records. However, another form of master-detail question you can ask is, "Which paintings did each artist paint?" That question is similar, but not quite identical. It will have a different answer if there are artists listed in the artist table that are not represented in the painting table, and the question requires a different query to produce the proper answer. In that case, the join output should include records in one table that have no match in the other. That's a form of "find the non-matching records" problem (Recipe 12.6), so to list each artist record, whether or not there are any painting records for it, use a LEFT JOIN:

mysql> SELECT artist.name, painting.title
    -> FROM artist LEFT JOIN painting ON artist.a_id = painting.a_id
    -> ORDER BY 1, 2;
+----------+-------------------+
| name     | title             |
+----------+-------------------+
| Da Vinci | The Last Supper   |
| Da Vinci | The Mona Lisa     |
| Monet    | NULL              |
| Picasso  | NULL              |
| Renoir   | Les Deux Soeurs   |
| Van Gogh | Starry Night      |
| Van Gogh | The Potato Eaters |
| Van Gogh | The Rocks         |
+----------+-------------------+

The rows in the result that have NULL in the title column correspond to artists that are listed in the artist table for whom you have no paintings.
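The difference between the two joins can also be checked outside the mysql client. The following is a minimal sketch, not the chapter's own setup: it uses Python's built-in sqlite3 module rather than MySQL, and the table definitions, the build_db helper, and the artist ID values given to Monet and Picasso are assumptions made for illustration. The join syntax is the same as in the statements shown above, except that the regular join is written with an explicit INNER JOIN.

```python
import sqlite3

def build_db():
    """Create in-memory artist and painting tables (illustrative schema)."""
    conn = sqlite3.connect(":memory:")
    conn.executescript("""
        CREATE TABLE artist (a_id INTEGER PRIMARY KEY, name TEXT NOT NULL);
        CREATE TABLE painting (a_id INTEGER NOT NULL, title TEXT NOT NULL);
        -- IDs 2 and 4 are assumed; the others match the per-ID counts
        -- shown later in this section.
        INSERT INTO artist VALUES
            (1, 'Da Vinci'), (2, 'Monet'), (3, 'Van Gogh'),
            (4, 'Picasso'), (5, 'Renoir');
        INSERT INTO painting VALUES
            (1, 'The Last Supper'), (1, 'The Mona Lisa'),
            (3, 'Starry Night'), (3, 'The Potato Eaters'), (3, 'The Rocks'),
            (5, 'Les Deux Soeurs');
    """)
    return conn

conn = build_db()

# Regular (inner) join: only artists that have at least one painting.
inner_rows = conn.execute("""
    SELECT artist.name, painting.title
    FROM artist INNER JOIN painting ON artist.a_id = painting.a_id
    ORDER BY 1, 2
""").fetchall()

# LEFT JOIN: every artist, with NULL titles for unmatched artists.
left_rows = conn.execute("""
    SELECT artist.name, painting.title
    FROM artist LEFT JOIN painting ON artist.a_id = painting.a_id
    ORDER BY 1, 2
""").fetchall()

# Artists that appear only in the LEFT JOIN result have no paintings;
# their title comes back as None (SQL NULL).
unmatched = [name for name, title in left_rows if title is None]
print(unmatched)
```

The inner join returns six rows and the LEFT JOIN returns eight; the two extra rows are the ones whose title is NULL.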

The same principles apply when producing summaries using master and detail tables. For example, to summarize your art collection by number of paintings per painter, you might ask, "How many paintings are there per artist in the painting table?" To find the answer based on artist ID, you can count the paintings with this query:

mysql> SELECT a_id, COUNT(a_id) AS count FROM painting GROUP BY a_id;
+------+-------+
| a_id | count |
+------+-------+
|    1 |     2 |
|    3 |     3 |
|    5 |     1 |
+------+-------+

Of course, that output is essentially meaningless unless you have all the artist ID numbers memorized. To display the artists by name rather than ID, join the painting table to the artist table:

mysql> SELECT artist.name AS painter, COUNT(painting.a_id) AS count
    -> FROM artist, painting
    -> WHERE artist.a_id = painting.a_id
    -> GROUP BY artist.name;
+----------+-------+
| painter  | count |
+----------+-------+
| Da Vinci |     2 |
| Renoir   |     1 |
| Van Gogh |     3 |
+----------+-------+

On the other hand, you might ask, "How many paintings did each artist paint?" This is the same question as the previous one (and the same query answers it), as long as every artist in the artist table has at least one corresponding painting table record. But if you have artists in the artist table that are not yet represented by any paintings in your collection, they will not appear in the query output. To produce a count-per-artist summary that includes even artists with no paintings in the painting table, use a LEFT JOIN:

mysql> SELECT artist.name AS painter, COUNT(painting.a_id) AS count
    -> FROM artist LEFT JOIN painting ON artist.a_id = painting.a_id
    -> GROUP BY artist.name;
+----------+-------+
| painter  | count |
+----------+-------+
| Da Vinci |     2 |
| Monet    |     0 |
| Picasso  |     0 |
| Renoir   |     1 |
| Van Gogh |     3 |
+----------+-------+

Beware of a subtle error that is easy to make when writing that kind of query. Suppose you write it slightly differently, like so:

mysql> SELECT artist.name AS painter, COUNT(*) AS count
    -> FROM artist LEFT JOIN painting ON artist.a_id = painting.a_id
    -> GROUP BY artist.name;
+----------+-------+
| painter  | count |
+----------+-------+
| Da Vinci |     2 |
| Monet    |     1 |
| Picasso  |     1 |
| Renoir   |     1 |
| Van Gogh |     3 |
+----------+-------+

Now every artist appears to have at least one painting. Why the difference? The cause of the problem is that the query uses COUNT(*) rather than COUNT(painting.a_id). For each row in the left table that has no match, LEFT JOIN generates a single output row with all the columns from the right table set to NULL. In the example, the right table is painting. The query that uses COUNT(painting.a_id) works correctly, because COUNT(expr) counts only non-NULL values. The query that uses COUNT(*) works incorrectly because it counts rows regardless of their contents, including the NULL-filled rows generated for artists that have no paintings.
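To see the two counts side by side, both can be selected in one statement. This is a minimal sketch under stated assumptions: it runs against SQLite through Python's built-in sqlite3 module rather than against MySQL, and the table definitions and the artist IDs for Monet and Picasso are illustrative. COUNT(*) and COUNT(expr) treat NULL the same way in both database systems.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
    -- Illustrative schema; IDs 2 and 4 are assumed values.
    CREATE TABLE artist (a_id INTEGER PRIMARY KEY, name TEXT NOT NULL);
    CREATE TABLE painting (a_id INTEGER NOT NULL, title TEXT NOT NULL);
    INSERT INTO artist VALUES
        (1, 'Da Vinci'), (2, 'Monet'), (3, 'Van Gogh'),
        (4, 'Picasso'), (5, 'Renoir');
    INSERT INTO painting VALUES
        (1, 'The Last Supper'), (1, 'The Mona Lisa'),
        (3, 'Starry Night'), (3, 'The Potato Eaters'), (3, 'The Rocks'),
        (5, 'Les Deux Soeurs');
""")

# COUNT(*) counts joined rows; COUNT(painting.a_id) counts only rows in
# which the painting side of the join is not NULL.
rows = conn.execute("""
    SELECT artist.name,
           COUNT(*)             AS row_count,
           COUNT(painting.a_id) AS painting_count
    FROM artist LEFT JOIN painting ON artist.a_id = painting.a_id
    GROUP BY artist.name
    ORDER BY artist.name
""").fetchall()

for name, row_count, painting_count in rows:
    print(f"{name:<10}{row_count:>3}{painting_count:>3}")
```

The two columns agree for every artist that has paintings and differ only for Monet and Picasso, where the single NULL-filled row is counted by COUNT(*) but not by COUNT(painting.a_id).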

LEFT JOIN is suitable for other types of summaries as well. To produce additional columns showing the total and average values of the paintings for each artist in the artist table, use this query:

mysql> SELECT artist.name AS painter,
    -> COUNT(painting.a_id) AS 'number of paintings',
    -> SUM(painting.price) AS 'total price',
    -> AVG(painting.price) AS 'average price'
    -> FROM artist LEFT JOIN painting ON artist.a_id = painting.a_id
    -> GROUP BY artist.name;
+----------+---------------------+-------------+---------------+
| painter  | number of paintings | total price | average price |
+----------+---------------------+-------------+---------------+
| Da Vinci |                   2 |         121 |       60.5000 |
| Monet    |                   0 |           0 |          NULL |
| Picasso  |                   0 |           0 |          NULL |
| Renoir   |                   1 |          64 |       64.0000 |
| Van Gogh |                   3 |         148 |       49.3333 |
+----------+---------------------+-------------+---------------+

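Note that COUNT( ) and SUM( ) are zero for artists that are not represented, but AVG( ) is NULL. That's because AVG( ) is computed as the sum divided by the count; if the count is zero, the value is undefined. (The zero shown for SUM( ) reflects older MySQL behavior. Current MySQL versions return NULL from SUM( ) when there are no non-NULL values to add, in which case the same IFNULL( ) technique applies to it.) To display an average value of zero in that case, modify the query to test the value of AVG( ) with IFNULL( ):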

mysql> SELECT artist.name AS painter,
    -> COUNT(painting.a_id) AS 'number of paintings',
    -> SUM(painting.price) AS 'total price',
    -> IFNULL(AVG(painting.price),0) AS 'average price'
    -> FROM artist LEFT JOIN painting ON artist.a_id = painting.a_id
    -> GROUP BY artist.name;
+----------+---------------------+-------------+---------------+
| painter  | number of paintings | total price | average price |
+----------+---------------------+-------------+---------------+
| Da Vinci |                   2 |         121 |       60.5000 |
| Monet    |                   0 |           0 |             0 |
| Picasso  |                   0 |           0 |             0 |
| Renoir   |                   1 |          64 |       64.0000 |
| Van Gogh |                   3 |         148 |       49.3333 |
+----------+---------------------+-------------+---------------+
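The effect of IFNULL( ) on the aggregates can be reproduced with a small self-contained script. This is a sketch under stated assumptions, not the chapter's setup: it uses SQLite through Python's built-in sqlite3 module, the schema is illustrative, and the individual prices are assumed values chosen only so that the per-artist totals match the output above (121, 64, and 148). SQLite returns NULL from SUM( ) when a group has no non-NULL values, so the sketch wraps SUM( ) as well as AVG( ).

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
    -- Illustrative schema; artist IDs 2 and 4 and all individual
    -- prices are assumed values.
    CREATE TABLE artist (a_id INTEGER PRIMARY KEY, name TEXT NOT NULL);
    CREATE TABLE painting (a_id INTEGER NOT NULL, title TEXT NOT NULL,
                           price INTEGER);
    INSERT INTO artist VALUES
        (1, 'Da Vinci'), (2, 'Monet'), (3, 'Van Gogh'),
        (4, 'Picasso'), (5, 'Renoir');
    INSERT INTO painting VALUES
        (1, 'The Last Supper', 34), (1, 'The Mona Lisa', 87),
        (3, 'Starry Night', 48), (3, 'The Potato Eaters', 67),
        (3, 'The Rocks', 33),
        (5, 'Les Deux Soeurs', 64);
""")

# Select each aggregate both raw and wrapped in IFNULL() so the
# NULL-to-zero substitution is visible for artists with no paintings.
summary = conn.execute("""
    SELECT artist.name,
           COUNT(painting.a_id),
           SUM(painting.price),
           AVG(painting.price),
           IFNULL(SUM(painting.price), 0),
           IFNULL(AVG(painting.price), 0)
    FROM artist LEFT JOIN painting ON artist.a_id = painting.a_id
    GROUP BY artist.name
    ORDER BY artist.name
""").fetchall()

for row in summary:
    print(row)
```

For Monet and Picasso the raw SUM( ) and AVG( ) columns come back as None (SQL NULL), while the IFNULL( ) columns are 0. For the other artists the wrapped and raw values are identical, because IFNULL( ) returns its first argument whenever that argument is not NULL.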
