Choosing Appropriate LIMIT Values

3.19.1 Problem

LIMIT doesn't seem to do what you want it to.

3.19.2 Solution

Be sure you understand what question you're asking. It may be that LIMIT is exposing some interesting subtleties in your data that you have not considered or are not aware of.

3.19.3 Discussion

LIMIT n is useful in conjunction with ORDER BY for selecting smallest or largest values from a result set. But does that actually give you the rows with the n smallest or largest values? Not necessarily! It does if your rows contain unique values, but not if there are duplicates. You may find it necessary to run a preliminary query first to help you choose the proper LIMIT value.

To see why this is, consider the following dataset, which shows the American League pitchers who won 15 or more games during the 2001 baseball season:

mysql> SELECT name, wins FROM al_winner -> ORDER BY wins DESC, name; +----------------+------+ | name | wins | +----------------+------+ | Mulder, Mark | 21 | | Clemens, Roger | 20 | | Moyer, Jamie | 20 | | Garcia, Freddy | 18 | | Hudson, Tim | 18 | | Abbott, Paul | 17 | | Mays, Joe | 17 | | Mussina, Mike | 17 | | Sabathia, C.C. | 17 | | Zito, Barry | 17 | | Buehrle, Mark | 16 | | Milton, Eric | 15 | | Pettitte, Andy | 15 | | Radke, Brad | 15 | | Sele, Aaron | 15 | +----------------+------+

If you want to know who won the most games, adding LIMIT 1 to the preceding query will give you the correct answer, because the maximum value is 21 and there is only one pitcher with that value (Mark Mulder). But what if you want the four highest game winners? The proper queries depend on what you mean by that, which can have various interpretations:

For this dataset, each method yields a different result. The moral is that the way you use LIMIT may require some thought about what you really want to know.

Категории