About the Upcoming Examples
For every significant tuning principle in the following chapters, we have provided at least one benchmarked example to illustrate the performance gains that can be obtained. However, you should be aware of the following:
- Any example is just thatan example. Your real-life performance might not show the same improvements that we obtained in our tests, and indeed you might find that some of the techniques shown do not work for you at all. Differences in data volumes and distributions, the MySQL version, and the storage engine you are usingas well as many other factorsmight result in significantly different outcomes. Nevertheless, the principles we outline are fairly general-purpose and should work for a wide range of applications and data types.
- All of our examples were done using MySQL 5.0 with either the InnoDB or MyISAM storage engine (with the InnoDB engine being our default). Many of the optimizations involved (index merges, for instance) appeared only in 5.0, and you will certainly see different results if you use a different storage engine such as HEAP or BDB.
- We looked only at "standard" SQL that is common to all of the storage engines. We felt that specialized operationssuch as full text search or spatial querieswere beyond the scope of this book, since our intention is to provide a foundation in SQL tuning with respect to stored program development only.
We used a Perl program (Mytrace.pl) to perform our tests. This program can take a normal SQL file, such as you might submit to the MySQL command-line client, and it generates several varieties of performance reports that we used to display the execution plans and the performance characteristics of our examples.
We could have used the MySQL command line to do our tests, but we decided to develop this utility for a number of reasons:
- The EXPLAIN output is a bit awkward. When the output is printed one line per row, the output can become garbled when wrapped to the column length. If the output is printed one line per column (with the G option), then the output appears very verbose. Either way, the output is hard to read. There is also no way to select which columns to display in the output.
- It's rather difficult to obtain the changed values from the SHOW STATUS statement that can reveal useful metrics such as logical or physical reads.
- For benchmarking purposes, we wanted to do things like averaging statistics over a number of executions, measuring statistics only on a second or subsequent execution so as to avoid discrepancies caused by caching of data.
- The utility was capable of generating comma-separated output that we could easily load into Excel to generate charts and perform analyses.
Mytrace.pl provides modified formats for EXPLAIN output and these formats are used throughout the next few chapters. We think you'll find this format easier to read and understand. For instance, whereas in the MySQL command-line client you might generate EXPLAIN output that looks like this:
mysql> EXPLAIN EXTENDED SELECT COUNT(*) FROM ta_5000 where sales_id in (select sales_id from tb_5000)G *************************** 1. row *************************** id: 1 select_type: PRIMARY table: ta_5000 type: ALL possible_keys: NULL key: NULL key_len: NULL ref: NULL rows: 5131 Extra: Using where *************************** 2. row *************************** id: 2 select_type: DEPENDENT SUBQUERY table: tb_5000 type: ALL possible_keys: NULL key: NULL key_len: NULL ref: NULL rows: 4808 Extra: Using where 2 rows in set, 1 warning (0.01 sec)
we would show the EXPLAIN in a more truncated format, as follows:
Short Explain ------------- 1 PRIMARY select(ALL) on ta_5000 using no key Using where 2 DEPENDENT SUBQUERY select(index_subquery) on tb_5000 using i_tb_5000 Using index
or in a more extended format like this:
Explain plan ------------ ID=1 Table=a Select type=SIMPLE Access type=ALL Rows=5158 Key= (Possible= ) Ref= Extra= ID=1 Table=b Select type=SIMPLE Access type=ref Rows=1 Key=i_tb_5000 (Possible=i_tb_5000 ) Ref=sqltune.a.SALES_ID Extra=Using index
The output also includes timings for each stage of statement execution and details of any SHOW STATUS variables that changed during execution:
Phase Elapsed (s) Parse 0.0001 Exec 1.3808 Fetch 0.0001 ----------------- Total 1.3810 Statistic Value ------------------------------------------------------------- Bytes_received 99 Bytes_sent 4862 Com_select 1 Handler_read_first 1 Handler_read_key 5003 Handler_read_rnd_next 5001 Innodb_buffer_pool_pages_data 57 Innodb_buffer_pool_pages_misc 7 Innodb_buffer_pool_read_requests 15217 Innodb_buffer_pool_reads 57 Innodb_data_read 933888 Innodb_data_reads 57 Innodb_pages_read 57 Innodb_rows_read 10000 Questions 2 Select_scan 1 Table_locks_immediate 2 Uptime 3
You may find the Mytrace.pl utility useful. You can find documentation and download instructions for this utility at this book's web site.