SQL Performance Tuning

   

The GROUP BY clause can explicitly appear in a SELECT statement, as in this example:

SELECT column1 FROM Table1 GROUP BY column1

Grouping also happens implicitly if there is a HAVING clause or a set function, as in this example:

SELECT COUNT(*) FROM Table1 HAVING COUNT(*) = 5

In standard SQLand with InterBase and Microsofta GROUP BY column may be followed by a COLLATE clause:

SELECT column1 FROM Table1 GROUP BY column1, column2 COLLATE SQL_Latin1_General

In a nonstandard SQL extension supported by Ingres, Microsoft, MySQL, Oracle, and Sybase, a GROUP BY column may contain an expression:

SELECT LOWER(column1) FROM Table1 GROUP BY LOWER(column1)

And in ANSI SQL:1999, IBM, Microsoft, and Oracle, a GROUP BY clause may contain CUBE or ROLLUP to indicate another level of grouping, like this:

SELECT column1, column2 FROM Table1 GROUP BY CUBE (column1, column2)

The SQL Standard says that the correct name for an aggregate function is " set function ," and the required set functions are AVG, COUNT, MAX, MIN, and SUM. In contrast, DBMS vendors prefer the term "aggregate function," and some provide extra built-in functionsfor example, for standard deviation (STDEV) and/or variance (VAR)in addition to the standard set functions. Some DBMSs also allow users to create their own aggregate functions. For example:

SELECT AVG(column1), STDEV(column1), UDF1(column1) FROM Table1 WHERE column1 > 55 GROUP BY column1 ORDER BY column1

Most of the features we've mentioned are supported by most DBMSs. Table 4-1 shows the SQL Standard requirements and the level of support the Big Eight have for GROUP BY.

Table 4-1. ANSI/DBMS GROUP BY Support

  Basic GROUP BY Expressions CREATE VIEW COLLATE Clause CUBE/ ROLLUP Max Columns Max Bytes
ANSI SQL Yes No Yes Yes Yes N/S N/S
IBM Yes No Yes No Yes >=20 254
Informix Yes No Yes No Yes >=20 >=2000
Ingres Yes Yes Yes No No >=20 >=2000
InterBase Yes No No Yes No >=20 >=2000
Microsoft Yes Yes Yes Yes Yes >=20 >=2000
MySQL Yes Yes No No No >=20 >=2000
Oracle Yes Yes Yes No Yes >=20 1969
Sybase Yes Yes Yes No No >=20 >=2000

Notes on Table 4-1:

  • Basic GROUP BY column

    This column is "Yes" if the DBMS supports basic GROUP BY syntax like:

    SELECT column1, MIN(column2) FROM Table1 GROUP BY column1

  • Expressions column

    This column is "Yes" if the DBMS supports expressions in GROUP BY, like this:

    SELECT UPPER(column1) FROM Table1 GROUP BY UPPER(column1)

  • CREATE VIEW column

    This column is "Yes" if the DBMS lets you put GROUP BY in a CREATE VIEW statement, like this:

    CREATE VIEW View1 AS SELECT column1, COUNT(column1) FROM Table1 GROUP BY column1

  • COLLATE Clause column

    This column is "Yes" if the DBMS supports ANSI SQL-style COLLATE clauses, or Oracle-style NLSSORT() function calls, or a CAST to a different character set with a different collation in GROUP BY, like this:

    SELECT column1, MIN(column2) FROM Table1 GROUP BY column1 COLLATE SQL_Latin1_General

  • CUBE/ROLLUP column

    This column is "Yes" if the DBMS supports CUBE and ROLLUP for summarizing, like this:

    SELECT column1, column2 FROM Table1 GROUP BY CUBE (column1, column2)

  • Max Columns column

    Shows how many columns may be listed in the GROUP BY clause.

    For Sybase, our tests showed it was possible to group at least 20 columns. This differs from Sybase's response to JDBC's getMaxColumnsInGroupBy call, which returns 16 .

  • Max Bytes column

    Shows the maximum allowed length, in bytes, of grouped values.

   

Категории