MySQL Database Design and Tuning
A very large percentage of MySQL AB's customer base combines MySQL database software with the Apache web server and the PHP scripting language to produce inexpensive yet powerful web-based solutions. This section provides suggestions on how to enhance performance in these configurations. Because Microsoft Internet Information Services (IIS) also sees heavy usage for web-based MySQL applications, this chapter also reviews performance-related tuning for that web server.
This section begins with server-independent topology, communication, and PHP configuration proposals before moving on to Apache-based and then IIS-centric suggestions. Optimal web server design and PHP techniques merit their own dedicated books. Consequently, this section simply attempts to provide easily implemented performance tips for these technologies; administrators who want to employ more sophisticated or specialized tuning strategies are encouraged to first make these simple alterations before embarking on that path.
In keeping with this book's attempt to serve as broad-based a clientele as possible, this section does not spend much time exploring some of the more advanced (and potentially costly) load-balancing and caching products on the market, nor does it focus on purely web server performance-related issues.
Choosing the Right Server Topology
When designing your web and database server configuration, the first decision you face is where to run each of these important processes. For web servers that see relatively low hit rates, it might be sufficient to run both processes on the same hardware. However, even these small implementations benefit from a multi-CPU server; any division of labor can only help these CPU-intensive web and database server applications.
This single-computer topology will not scale for larger, more heavily trafficked sites. In these cases, it's imperative that you place the web and database servers on their own, dedicated machines. To reduce unnecessary overhead, you should also review the operating system configuration suggestions found earlier in this chapter.
Multiserver Configurations
When designing a multiserver topology, administrators often wonder how to determine the ratio of web to database servers that will yield optimal performance. The exact ratio is highly dependent on the usage patterns and site-specific stress levels that will be placed on the web and database servers, as well as the capacity of these computers. However, with these caveats in mind, it is still possible to provide some general guidelines on how to calculate the correct proportion of computing technologies.
An example of a well-designed, thorough study can be found at http://office.microsoft.com/en-us/assistance/HA011607741033.aspx. Although this capacity planning guide does not focus on MySQL, it does illustrate a solid methodology as well as the number of variables that should be taken into consideration when embarking on this kind of research. In this case, the research shows that the optimal ratio of web to database servers is roughly 4:1. That is, for every four web servers, you should deploy one database server. Although total throughput can, of course, be increased by adding web and database servers, the ratio of these servers should remain at approximately 4:1. For example, if you want to utilize 12 web servers, you will need 3 database servers to preserve the 4:1 ratio.
Realistically, if your web application environment boasts more than 10 web or database servers, it's probably astute to look at either a commercial or open source load-balancing solution. You might also implement caching or other techniques to equitably divide the workload. Chapters 16 and 17 ("Optimal Replication" and "Optimal Clustering," respectively) review MySQL technologies designed to help in these kinds of situations.
Fast Interserver Communication
Regardless of your choice of web and database server topology, it's critical that the communication between these computers be as fast as possible. This means using the most rapid networking technologies that you can afford and taking any necessary steps to remove communication impediments. For example, if your web and database servers are resident in the same facility, they should communicate on a speedy, dedicated local area network (LAN) using the fastest network cards possible. Broader wide area networks (WANs) should be avoided if feasible. Unless the link between these computers is insecure, time-consuming encryption technology such as Secure Sockets Layer (SSL) should be avoided.
General PHP Suggestions
Among its many uses, PHP has proven to be an enormously popular method of connecting web servers to MySQL databases. This section examines techniques you can use to improve PHP performance.
Correctly Building PHP
If you are building your PHP installation from source code, rather than downloading a binary distribution, you have several ways to improve performance when you compile your PHP instance, including the following:
Correctly Configuring PHP
If your server-side PHP applications are likely to consume significant amounts of processor time or memory, it's essential that you configure your PHP initialization file correctly. In particular, make sure that the max_execution_time configuration parameter is set to a large enough number of seconds to accommodate your longest-running scripts. Otherwise, you (or an unhappy user) will eventually see a message that looks something like this:
[usr3292@DBSERVER3 apps]$ PHP Fatal error: Maximum execution time of 3 seconds exceeded in /apps/mysql/user_logger.php on line 41
Also, remember to allow for enough memory utilization by providing an adequate value for memory_limit. The following message indicates a PHP-based memory problem:
PHP Fatal error: Allowed memory size of 256000 bytes exhausted (tried to allocate 32000 bytes) in Unknown on line 0
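The two limits just described can be checked before users run into them. The following Python script is a minimal sketch, not a definitive tool: it parses a simplified php.ini fragment (real files can contain constructs this parser does not handle) and warns when max_execution_time or memory_limit falls below a threshold. The threshold values, the sample fragment, and the function names are assumptions chosen for illustration; the sample values mirror the 3-second and 256000-byte limits in the error messages above.

```python
import configparser

# Hypothetical thresholds for illustration only; pick values that suit your workload.
MIN_EXECUTION_SECONDS = 30
MIN_MEMORY_BYTES = 8 * 1024 * 1024

# Simplified php.ini fragment (an assumption, not a complete configuration file).
SAMPLE_INI = """
[PHP]
max_execution_time = 3
memory_limit = 256000
"""


def shorthand_to_bytes(value):
    """Convert a PHP-style size such as '8M', '512K', or '256000' to bytes."""
    value = value.strip()
    units = {"K": 1024, "M": 1024 ** 2, "G": 1024 ** 3}
    suffix = value[-1:].upper()
    if suffix in units:
        return int(value[:-1]) * units[suffix]
    return int(value)


def check_php_limits(ini_text):
    """Return a list of warnings for limits that fall below the thresholds."""
    parser = configparser.ConfigParser(interpolation=None, strict=False)
    parser.read_string(ini_text)
    php = parser["PHP"]
    warnings = []

    # In PHP, a max_execution_time of 0 means "no time limit".
    seconds = int(php.get("max_execution_time", "0"))
    if 0 < seconds < MIN_EXECUTION_SECONDS:
        warnings.append(f"max_execution_time is only {seconds} seconds")

    # In PHP, a memory_limit of -1 means "no memory limit".
    memory = shorthand_to_bytes(php.get("memory_limit", "-1"))
    if 0 <= memory < MIN_MEMORY_BYTES:
        warnings.append(f"memory_limit is only {memory} bytes")

    return warnings


if __name__ == "__main__":
    for warning in check_php_limits(SAMPLE_INI):
        print("WARNING:", warning)
```

Run against the sample fragment, the script reports both settings as too low; a fragment with generous limits produces no warnings.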
Finally, if you have elected to use shared memory for session handling (see the earlier section about building PHP), make sure to add session.save_handler=mm to your PHP initialization file.
Caching PHP
A standard, unmodified PHP installation requires that before a given script can be run, the server must incur the overhead of accessing, loading, parsing, and, finally, compiling the script. These steps are taken because it's possible that a script might change between invocations. In practice, however, scripts rarely change between invocations, and these costs can add up, especially for sites that see heavy traffic accessing a relatively small number of scripts.
To address this bottleneck, a number of commercial vendors and open source projects (including Zend, ionCube, and APC) have released products that cache PHP scripts. By caching scripts in shared memory, these offerings help reduce the load on the web server, allowing better throughput and higher client service levels. A collection of independent and vendor-provided studies has estimated that PHP performance can improve by more than 40% when this caching technology is employed. If your environment sees heavy PHP workloads, it's worth your time to investigate the current state of the market for these tools.
Your PHP applications might also benefit from server-side cursors or prepared statements; both topics are discussed in Chapter 9, "Developing High-Speed Applications."
Apache/PHP Suggestions
With a market share approaching 70% (see http://news.netcraft.com/archives/web_server_survey.html for the latest statistics), Apache dominates the web server category. This section provides several ideas on how you can get more responsiveness out of your selection of this technology, especially when PHP is involved.
Before beginning, keep in mind that the right server topology (that is, the optimal balance of web and database servers) can go a long way toward reducing or even eliminating Apache/PHP performance problems. If you haven't yet done so, it's a good idea to read that section of this chapter before starting your web server performance improvement odyssey.
Understanding Your Potential Bottlenecks
Your ability to tune an Apache/PHP environment is highly dependent on your unique processing profile. For sites that see heavy downloading of static HTML pages, the network is typically the bottleneck. Large, static, graphically heavy pages can consume vast amounts of bandwidth even though your web and database servers might be underutilized; users might report a sluggish server when, in fact, the problem lies entirely with the network. Reducing the number of graphics or taking any other steps to make these pages smaller should help take some of the burden off the network. HTTP compression (supported by most modern browsers) can also help.
On the other hand, a dynamically generated, PHP-intensive environment can strain your CPU while your network goes virtually unused. For these situations, your best bet might be to make more use of static web pages, saving dynamic PHP page generation for only those situations in which it is absolutely necessary. Perhaps these pages can be generated on other servers or during off-hours. Adding extra processors can also be useful.
Clearly, before you can remedy any Apache/PHP performance problems, you must understand where the problems lie. Fortunately, as you saw in Chapter 2, numerous tools are available that you can use to spot the bottleneck and then take action.
Upgrading to a Recent Version
It's logical to expect a continual stream of performance enhancements given the speed at which open source software improves. Apache is no exception to this rule. Before starting on a time-consuming optimization project, consider upgrading to a more recent version such as the 2.0 series.
Build Suggestions
If you are building your Apache instance from source code instead of downloading a binary, make sure that you incorporate only the modules that are relevant and necessary for your site. Many administrators make the mistake of including too many of the dozens of potential Apache modules in their instance. If your compilation creates a static executable, these extra modules will make the executable larger.
Configuring Your Apache/PHP Environment
After you've set up the correct ratio of web and database servers and then downloaded and (optionally) built Apache and PHP, it's time to configure your environment for optimal performance. Table 14.2 lists a number of important settings and parameters that impact the speed of your Apache/PHP instance.
Other Apache parameters of interest to administrators are StartServers/StartThreads, MinSpareServers/MinSpareThreads, and MaxSpareServers/MaxSpareThreads.
Internet Information Services (IIS)/PHP Tips
Although the combination of Apache and PHP represents the most frequently seen web server and scripting technology combination, there are sites running PHP in conjunction with IIS. This section describes several steps that administrators for these environments can take to increase overall performance.
Improving Common Gateway Interface (CGI) Performance
Typically, PHP on IIS makes use of the Common Gateway Interface (CGI). However, several traits of CGI (including its continual process creation) have been identified as performance bottlenecks. Fortunately, the FastCGI open source solution (available from PHP version 4.05) addresses many of these issues, leading to dramatic improvements in response time. As of version 4.3.0, standard PHP downloads (available from www.php.net) include the libraries for FastCGI. For the inquisitive, the FastCGI source code is also available from the same website. If you run heavy PHP volumes on IIS, it's a good idea to explore the benefits of incorporating FastCGI into your environment.
Tuning IIS
Microsoft offers a collection of helpful tools to control IIS performance. Although not directly correlated to PHP response, it's still worthwhile to briefly describe these features. Because the correct settings for these parameters are so dependent on your site-specific hardware, processing, and network bandwidth profiles, this book does not make blanket recommendations. Instead, use these tools as part of an overall performance enhancement strategy, and remember to change only one variable at a time.
To begin, administrators can place a restriction on the amount of information flowing between clients and this web server (bandwidth throttling) as well as specify an upper limit to the number of active connections (see Figure 14.1).
Figure 14.1. Setting limits on bandwidth and website connections.
Other important performance-related settings control how much time will pass before a session is terminated (connection timeout) as well as whether a user can invoke multiple, sequential requests within the same session (HTTP Keep-Alive). See Figure 14.2.
Figure 14.2. Setting connection timeout, enabling HTTP Keep-Alive.
Earlier in this chapter, you saw how HTTP compression can help boost performance by reducing network traffic. IIS administrators can elect to compress both static and dynamic content (see Figure 14.3). One comprehensive study of the challenges and benefits of HTTP compression can be found at http://www-106.ibm.com/developerworks/web/library/wa-httpcomp/.
Figure 14.3. Enabling HTTP compression.
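To get a rough feel for how much network traffic compression might save on a given page, you can measure it offline. The following Python script is a minimal sketch that gzip-compresses a page body (gzip is one of the encodings commonly used for HTTP compression) and reports the sizes before and after. The sample page and the function name are assumptions for illustration; repetitive markup like this compresses far better than typical real-world content, so measure your own pages before drawing conclusions.

```python
import gzip


def compression_savings(body, level=6):
    """Return (original_size, compressed_size) in bytes for a page body."""
    compressed = gzip.compress(body, compresslevel=level)
    return len(body), len(compressed)


# Hypothetical page: a table with many near-identical rows.
SAMPLE_PAGE = (
    "<html><body><table>"
    + "<tr><td class='row'>MySQL</td><td>PHP</td></tr>" * 500
    + "</table></body></html>"
).encode("utf-8")

if __name__ == "__main__":
    original, compressed = compression_savings(SAMPLE_PAGE)
    saved = 100.0 * (original - compressed) / original
    print(f"original: {original} bytes")
    print(f"compressed: {compressed} bytes")
    print(f"saved: {saved:.1f}%")
```

Compression trades CPU time for bandwidth, so it pays off most on sites where the network, rather than the processor, is the bottleneck.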