Hack 29. Other Ways to COUNT
When you want to count rows you can choose one column as the argument to COUNT. But you don't have to choose any column at all.
Suppose you need to know how many comics you have in your table of performers (see Table 5-7).
name | status | appearance | medium |
---|---|---|---|
Abbott | Foil |
NULL |
TV |
Bing | Foil | Suave | Cinema |
Costello | Comic | Stout | TV |
Groucho | Comic | Mustachioed | Cinema |
Hardy | Foil | Stout | Cinema |
Hope | Comic |
NULL |
Cinema |
Laurel | Comic | Thin | Cinema |
Belushi | Comic | Stout | Cinema |
It makes a difference which column you count. Notice that the count skips the NULL values:
mysql> SELECT COUNT(name), COUNT(status), COUNT(appearance), COUNT(medium) -> FROM performer WHERE status='Comic'; +-------------+---------------+-------------------+---------------+ | COUNT(name) | COUNT(status) | COUNT(appearance) | COUNT(medium) | +-------------+---------------+-------------------+---------------+ | 5 | 5 | 4 | 5 | +-------------+---------------+-------------------+---------------+
You can use COUNT(1) to count all the rows:
mysql> SELECT COUNT(1) FROM performers WHERE status='Comic'; +-----------+ | COUNT(1) | +-----------+ | 5 | +-----------+
This neatly avoids you having to choose a particular field to count. Of course, you could use COUNT(*), which means the same thing.
5.6.1. Counting with a Condition
Instead of using COUNT, you can use SUM on a condition. In place of the SELECT/WHERE statement you can use SUM with CASE:
mysql> SELECT SUM(CASE WHEN status='Comic' THEN 1 END) FROM performer; +------------------------------------------+ | SUM(CASE WHEN status='Comic' THEN 1 END) | +------------------------------------------+ | 5 | +------------------------------------------+
In MySQL, you can do the same thing with fewer words. In MySQL, the value TRUE is represented by 1 and FALSE by 0. So in MySQL, you can simply write:
mysql> SELECT SUM(status='Comic') FROM performer; +---------------------+ | SUM(status='Comic') | +---------------------+ | 5 | +---------------------+
|
The advantage of using a SUM rather than a filter is that you can count a bunch of different conditions in one statement. For instance, you can count comics and foils in the same query:
mysql> SELECT SUM(status='Comic') AS Comics -> ,SUM(status='Foil') AS Foils -> ,SUM(1) AS Total -> FROM performer; +--------+-------+-------+ | Comics | Foils | Total | +--------+-------+-------+ | 5 | 3 | 8 | +--------+-------+-------+
Alternatively, you can show the percentage of performers who are comics alongside the percentage of performers who are stout:
mysql> SELECT FLOOR(100*SUM(status='Comic')/SUM(1)) AS "Comics %" -> ,FLOOR(100*SUM(appearance='stout')/SUM(1)) AS "Stout %" -> FROM performer; +----------+---------+ | Comics % | Stout % | +----------+---------+ | 62 | 37 | +----------+---------+
You also can break this down to see the percentage of comics who are stout against the percentage of performers who are stout:
mysql> SELECT -> FLOOR(100*SUM(status='Comic' AND appearance='stout')/ -> SUM(status='Comic')) AS "Stout as % of Comic" -> ,FLOOR(100*SUM(appearance='stout')/SUM(1)) -> AS "Stout as % of Performer" -> FROM performer; +---------------------+-------------------------+ | Stout as % of Comic | Stout as % of Performer | +---------------------+-------------------------+ | 40 | 37 | +---------------------+-------------------------+
You can break this down further with a GROUP BY. If you want to see how the propensity to stoutness among comics varies between TV and cinema you can use:
mysql> SELECT -> medium -> ,FLOOR(100*SUM(status='Comic' AND appearance='stout')/ -> SUM(status='Comic')) -> AS "Stout as % of Comic" -> ,FLOOR(100*SUM(appearance='stout')/SUM(1)) -> AS "Stout as % of Performer" -> FROM performer -> GROUP BY medium; +--------+---------------------+-------------------------+ | medium | Stout as % of Comic | Stout as % of Performer | +--------+---------------------+-------------------------+ | Cinema | 25 | 33 | | TV | 100 | 50 | +--------+---------------------+-------------------------+
5.6.1.1. Access
Microsoft Access uses the value 1 for TRUE and 0 for FALSE, so with Access you must remember to negate your answer:
SELECT -SUM(status='Comic') FROM performers