Controlling Case Sensitivity in String Comparisons

4.10.1 Problem

A string comparison is case sensitive when you don't want it to be, or vice versa.

4.10.2 Solution

Alter the case sensitivity of the strings.

4.10.3 Discussion

The examples in previous sections were performed without regard to lettercase. But sometimes you need to make sure a string operation is case sensitive that would not otherwise be, or vice versa. This section describes how to do that for ordinary comparisons. Recipe 4.11 covers case sensitivity in pattern-matching operations.

String comparisons in MySQL are not case sensitive by default:

mysql> SELECT name, name = 'lead', name = 'LEAD' FROM metal; +----------+---------------+---------------+ | name | name = 'lead' | name = 'LEAD' | +----------+---------------+---------------+ | copper | 0 | 0 | | gold | 0 | 0 | | iron | 0 | 0 | | lead | 1 | 1 | | mercury | 0 | 0 | | platinum | 0 | 0 | | silver | 0 | 0 | | tin | 0 | 0 | +----------+---------------+---------------+

The lack of case sensitivity also applies to relative ordering comparisons:

mysql> SELECT name, name < 'lead', name < 'LEAD' FROM metal; +----------+---------------+---------------+ | name | name < 'lead' | name < 'LEAD' | +----------+---------------+---------------+ | copper | 1 | 1 | | gold | 1 | 1 | | iron | 1 | 1 | | lead | 0 | 0 | | mercury | 0 | 0 | | platinum | 0 | 0 | | silver | 0 | 0 | | tin | 0 | 0 | +----------+---------------+---------------+

If you're familiar with the ASCII collating order, you know that lowercase letters have higher ASCII codes than uppercase letters, so the results in the second comparison column of the preceding query may surprise you. Those results reflect that string ordering is done by default without regard for lettercase, so A and a both are considered lexically less than B.

String comparisons are case sensitive only if at least one of the operands is a binary string. To control case sensitivity in string comparisons, use the following techniques:

The same principles can be applied to string comparison functions. For example, STRCMP( ) takes two string arguments and returns -1, 0, or 1, depending on whether the first string is lexically less than, equal to, or greater than the second. Up through MySQL 4.0.0, STRCMP( ) is case sensitive; it always treats its arguments as binary strings, regardless of their actual type:

mysql> SELECT STRCMP('Abc','abc'), STRCMP('abc','abc'), STRCMP('abc','Abc'); +---------------------+---------------------+---------------------+ | STRCMP('Abc','abc') | STRCMP('abc','abc') | STRCMP('abc','Abc') | +---------------------+---------------------+---------------------+ | -1 | 0 | 1 | +---------------------+---------------------+---------------------+

However, as of MySQL 4.0.1, STRCMP( ) is not case sensitive:

mysql> SELECT STRCMP('Abc','abc'), STRCMP('abc','abc'), STRCMP('abc','Abc'); +---------------------+---------------------+---------------------+ | STRCMP('Abc','abc') | STRCMP('abc','abc') | STRCMP('abc','Abc') | +---------------------+---------------------+---------------------+ | 0 | 0 | 0 | +---------------------+---------------------+---------------------+

To preserve the pre-4.0.1 behavior, make one of the arguments a binary string:

mysql> SELECT STRCMP(BINARY 'Abc','abc'), STRCMP(BINARY 'abc','Abc'); +----------------------------+----------------------------+ | STRCMP(BINARY 'Abc','abc') | STRCMP(BINARY 'abc','Abc') | +----------------------------+----------------------------+ | -1 | 1 | +----------------------------+----------------------------+

By the way, take special note of the fact that zero and nonzero return values from STRCMP( ) indicate equality and inequality. This differs from the = comparison operator, which returns zero and nonzero for inequality and equality.

To avoid surprises in string comparisons, know the general rules that determine whether or not a string is binary:

In summary, comparisons are case sensitive if they involve a binary literal string or string expression, or a CHAR BINARY, VARCHAR BINARY, or BLOB column. Comparisons are not case sensitive if they involve only non-binary literal strings or string expressions, or CHAR, VARCHAR, ENUM, SET, or TEXT columns.

ENUM and SET columns are not case sensitive. Furthermore, because they are stored internally as numbers, you cannot declare them case sensitive in the table definition by adding the BINARY keyword. However, you can still use the BINARY keyword before ENUM or SET values in comparisons to produce a case sensitive operation.

Case Sensitivity and String Comparison Speed

In general, case-sensitive comparisons involving binary strings are slightly faster than non-case-sensitive comparisons, because MySQL need not take lettercase into account during the comparison.

If you find that you've declared a column using a type that is not suitable for the kind of comparisons for which you typically use it, use ALTER TABLE to change the type. Suppose you have a table in which you store news articles:

CREATE TABLE news ( id INT UNSIGNED NOT NULL AUTO_INCREMENT, article BLOB NOT NULL, PRIMARY KEY (id) );

Here the article column is declared as a BLOB, which is a case-sensitive type. Should you wish to convert the column so that it is not case sensitive, you can change the type from BLOB to TEXT using either of these ALTER TABLE statements:

ALTER TABLE news MODIFY article TEXT NOT NULL; ALTER TABLE news CHANGE article article TEXT NOT NULL;

Prior to MySQL 3.22.16, ALTER TABLE ... MODIFY is unavailable, in which case you can use only ALTER TABLE ... CHANGE. See Chapter 8 for more information.

Категории