Conclusion
In this chapter, we looked at some more advanced SQL tuning scenarios.
We first looked at simple subqueries using the IN and EXISTS operators. As with joins and simple single-table queries, the most important factor in improving subquery performance is to create indexes that allow the subqueries to execute quickly. We also saw that when an appropriate index is not available, rewriting the subquery as a join can significantly improve performance.
The anti-join is a type of SQL operation that returns all rows from a table that do not have a matching row in a second table. These can be performed using NOT IN, NOT EXISTS, or LEFT JOIN operations. As with other subqueries, creating an index to support the subquery is the most important optimization. If no index exists to support the anti-join, then a NOT IN subquery will be more efficient than a NOT EXISTS or a LEFT JOIN.
We can also place subqueries in the FROM clausethese are sometimes referred to as inline views, unnamed views, or derived tables. Generally speaking, we should avoid this practice because the resulting "derived" tables will have no indexes and will perform poorly if they are joined to another table or if there are associated selection criteria in the WHERE clause. Named views are a much better option, since MySQL can "merge" the view definition into the calling query, which will allow the use of indexes if appropriate. However, views created with the TEMPTABLE option, or views that cannot take advantage of the MERGE algorithm (such as GROUP BY views), will exhibit similar performance to derived table queries.
When our SQL has an ORDER BY or GROUP BY condition, MySQL might need to sort the resulting data. We can tell if there has been a sort by the Using filesort tag in the Extra column of the EXPLAIN statement output. Large sorts can have a diabolical effect on our query performance, although we can improve performance by increasing the amount of memory available to the sort (by increasing SORT_BUFFER_SIZE). Alternately, we can create an index on the columns to be sorted. MySQL can then use that index to avoid the sort and thus improve performance.
We can achieve substantial improvements in performance by inserting multiple rows with each INSERT statement. If we are using a transactional storage engine such as InnoDB, we can improve the performance of any DML operations by reducing the frequency with which we commit data. However, we should never modify commit frequency at the expense of transactional integrity.
Most of our stored programs will perform only as well as the SQL that they contain. In the next chapter we will look at how to go the "last mile" by tuning the stored program code itself.