SQL Performance Tuning

   

To this point, we've talked only about physical representations, as if all tables are base tables. Does it matter if a table is a view? To find out, we tested three well-known propositions .

Portability

MySQL doesn't support views. The gains shown in this section are for only seven DBMSs.

Proposition one

A query of a view takes longer to prepare because it requires an extra lookup in the system catalog.

Calculated Fields

When you introduce redundancy, you risk error (because two occurrences of the value will be out of sync). The solution is to add a trigger, which will ensure that any change to A will also cause a change to B . So the question isIs the calculation cost greater than the maintenance cost? We've already noted that SELECTs are more common than UPDATEs. But arithmetic is cheap. Is there any sort of calculation that takes so long that it's worth some trouble to avoid it? Yesa calculation that involves multiple rows (and therefore involves multiple I/Os). The top three uses of calculated fields are:

  • To avoid the one-column join

    For example, you might have "Error 50" and the short explanation is "Lamination Drier Overheated." But you must always explain Error 50 when you displaythat is, it's a lookup of a code.

  • To avoid the double join

    For example, you have a hierarchy A B C and you frequently join A C without needing B (by "hierarchy," we mean many-to-one table relationships here).

  • To avoid the transaction join/join-and-sum

    For example, with a general ledger you have a running totalyou don't want to get the balance by looking at all the withdrawals and deposits since day one.

Simpler calculations aren't worth the trouble.

It turns out this proposition is true. To test it, we created a view on a simple table:

CREATE VIEW View1 AS SELECT * FROM Table1

Then we timed the relative speed of Query #1 as opposed to Query #2:

Query #1: SELECT * FROM View1 Query #2: SELECT * FROM Table1 GAIN: 7/7

Proposition two

It is faster to execute a prepared query on a materialized view twice, than to materialize the view twice. (A materialized view is a view whose rows take up space.)

It turns out this proposition is true too. To test it, we created a materializable view on a simple table:

CREATE VIEW View1 AS SELECT SUM(column1) AS sum_column1 FROM Table1

Then we timed the relative speed of Query #1 as opposed to Query #2:

Query #1: SELECT SUM(column1) FROM Table1 Query #2: SELECT * FROM View1 GAIN: 2/7

Note: We ran each query twice so that the prepared plan might still be in cache.

Proposition three

It is madness to use a view of a join or a union, unless you intend to use all columns .

It turns out this proposition is also true, provided that a primary-key/foreign-key relationship exists between the tables. To test it, we created these two tables, as well as a view on a join of Table1 and Table2 :

CREATE TABLE Table1 ( column1 INT PRIMARY KEY, column2 INT) CREATE TABLE Table2 ( column1 INT REFERENCES Table1, column2 INT) CREATE VIEW View1 AS SELECT Table1.column1 AS column1, Table2.column2 AS column2 FROM Table1, Table2 WHERE Table2.column1 = Table1.column1

Then we timed the relative speed of Query #1 as opposed to Query #2:

Query #1: SELECT DISTINCT column1 FROM View1 Query #2: SELECT DISTINCT column1 FROM Table2 GAIN: 7/7

Not a single DBMS detected what we were doing and threw away the join clause. Each one performed the useless join and so it's trueYou should avoid views of joins if you only need columns from one side of the join.

The Bottom Line: Views

A query of a view takes longer to prepare than a query of a table.

It is faster to execute a prepared query on a materialized view twice, than to materialize the view twice.

Don't query a view of a join or a union unless you intend to use all columns.

   

Категории