Load Balancing and Failover
A couple of different pieces need to be load balanced and set up for failover in MySQL Cluster:
- Automatic failover and load balancing within the data nodes This failover and load balancing happen automatically, totally transparently to your application. Generally, the application does not need to concern itself with this aspect of failover.
- Load balancing and failover in front of the MySQL nodes You have to set up this type of load balancing manually. This is normally a fairly simple task, thanks to the nature of MySQL Cluster, compared to something such as MySQL replication. It is possible to read/write from any MySQL server, so failover is as simple as connecting to another MySQL server and resuming whatever task you were doing. Load balancing is also normally quite simple because the query load is already partially balanced (by the data nodes), so, depending on the types of queries you are doing, the MySQL servers normally have less load already.
There are many possibilities for setting up load balancing and failover for the MySQL servers ("SQL nodes"). The following sections discuss some different commonly used options and their pros and cons.
JDBC Driver Support
The official MySQL JDBC, Connector/J, supports automatic failover and basic load balancing capabilities, starting from MySQL Cluster version 3.1. The load balancing algorithm is fairly basic round-robin support that chooses the server to connect to. In the event that it cannot connect or gets disconnected from a server, it automatically chooses another server in the list and uses that to do queries. This failover results in losing only the transaction you are currently running, so you have to retry just that transaction.
To set up the JDBC driver for this type of failover, you need to do two additional things compared to a normal setup. First, in the connection URL, you need to specify a list of all the possible MySQL servers. The list of servers would look like this:
jdbc:mysql://mysqld1,mysqld2,mysqld3,mysqld4/dbname
Second, you need to add some additional parameters to the URL. (It is also possible to set them through the use of the JDBC Properties object instead of through the direct URL.) The minimum required URL parameters that you need to set are autoReconnect=true, roundRobinLoadBalance=true, and failOverReadOnly=false. These settings cause the JDBC driver to connect to a new server in the event that the current one fails, they cause it to choose the servers in a round-robin fashion, and finally, they allow the client to read and write to the failed over server. There is also a special configuration option you can use that sets all three of these settings. You set this bundle by using useConfigs=clusterBase.
You may consider some additional parameters as well, such as queriesBeforeRetryMaster and secondsBeforeRetryMaster. These settings cause the JDBC driver to attempt to reconnect to the original server after a set number of queries or based on time. This can be useful for allowing a server to automatically resume being used after it returns to service.
Your final JDBC URL might look similar to this:
jdbc:mysql://host1,host2/db?useConfigs=clusterBase&secondsBeforeRetryMaster=60
Now when a MySQL server fails, only your current transaction will be aborted. You can then catch the exception and retry the transaction. The parameters you have set cause the JDBC driver to transparently connect to a new server, and when you issue the queries for the transaction, they execute there. The JDBC driver attempts to fail back to the old server every 60 seconds until it succeeds.
The advantage of this setup is that, compared to other solutions, it is relatively easy to do and doesn't require any additional hardware or software.
The obvious big disadvantage of this setup is that it only works for Java, and not for any other programming languages. Another drawback is that it isn't the most sophisticated of load balancing systems, and it may not work adequately in some scenarios.
Round-Robin DNS
It is possible to set up a very primitive load balancing setup by using round-robin DNS. With this method, you basically set up a DNS name, which can resolve to any of the MySQL servers operating in your cluster. Then you need to set up a very short time-to-live (TTL) for the DNS requests. After that, you tell your application to connect to the hostname, and it then gets a different IP address each time you request to connect. It is possible to remove a node from the DNS setup when it goes down or when you want to remove it from service.
One big drawback of this method is that it doesn't gracefully handle node failures automatically. In the event of a sudden MySQL server failure, the DNS server continues to give out the IP address for the now-down server. You can handle this in two ways:
- Your application can try to connect and, if it fails, request a new DNS entry, and then it can attempt to connect againrepeating as necessary until it can finally connect. The drawback with this method is that it can take a while for the connection attempt to fail, and it can fail quite a few times if you have multiple MySQL servers down.
- You can have some sort of monitoring system that detects a MySQL server shutdown and removes it from the DNS setup. This requires additional software, and if you are going to go this route, you should look into the software solutions mentioned later in this chapter, in the section "Software Load Balancing Solutions," because they can do this detection and failure in a more graceful way than round-robin DNS.
A second problem with round-robin DNS is that the load balancing is fairly simple, and it is not dynamic. DNS always does a round-robin request serving, which in some cases is less than ideal. Some clients also ignore very small TTLs, so if you have only a small number of client machines, you might find that by chance, they all end up hitting the same SQL node for a period of time.
Hardware Load Balancing Solutions
It is possible to use a hardware load balancer with MySQL Cluster as a front end for the SQL nodes. Cisco products and products such as Big-IP are examples of hardware solutions to this problem. Normally, these hardware solutions are quite customizable and reliable. When using them with MySQL Cluster, you need to make sure they are set up to bind connections at the protocol session level. That means they should balance a single connection to the same server for the duration of the connection. The method for doing this setup depends entirely on the hardware solution in use. You should consult your hardware vendor if you have any problems making this binding.
The advantage of hardware solutions is that, generally, you can customize them to do most of what you want. For example, most hardware solutions can automatically detect whether the MySQL server isn't responding, and they can then remove it automatically from the possible connections and add it back again when the server comes back up. In addition, they can possibly do more sophisticated load balancing solutions than just round-robin.
Generally, the biggest drawback of a hardware solution is cost. Typically, hardware solutions run in the tens of thousands of dollars, which may be more than the cost of your entire cluster. In addition, technically, a hardware load balancer can become a single point of failure. To minimize this, you need to set up redundant hardware load balancers, which increases the cost and complexity of the solution.
Software Load Balancing Solutions
A number of software solutions can be used for load balancing. The most common products that can do this are Linux Virtual Server (LVS), Linux-HA, and products based on these (such as Ultra Monkey). There are many other solutions for other operating systems that should be adequate as well. A brief description of what these particular systems do should help provide insight into any other system that you might want to use:
- LVS LVS(www.linuxvirtualserver.org) allows you to set up multiple Linux servers, all with a single virtual IP address. Behind the scenes, these servers can do load balancing. LVS can use many different algorithms, such as least connections, weighted round-robin, and many others. It allows for connections to be tied to a single real server, to ensure transactional integrity behind the scenes.
- Linux-HA The Linux-HA project (www.linux-ha.com) is designed to provide a flexible high-availability framework for Linux (and other UNIX systems as well). The main piece of Linux-HA is called Heartbeat, which is a server that monitors the MySQL servers and removes them from possible connections if they suffer critical failures.
- Ultra Monkey Ultra Monkey (www.ultramonkey.org) is a software solution that is designed to integrate Linux-HA and LVS into a single standalone solution, which allows for easier integration and building of fully load balanced and highly available solutions. The advantage of using software such as Ultra Monkey instead of the separate pieces is that it is easier to install, maintain, and configure a single-package solution.
There are a few big advantages to using a software solution such as these. First, all the software mentioned here is completely open source and freely available. Obviously, if you are designing a low-cost cluster, this is very attractive. Second, the software is very flexible. You can configure many different options, depending on the needs of your MySQL Cluster setup. This includes options such as load balancing algorithms, failover algorithms, rejoining algorithms, and so on.
The drawback of these software solutions is that they can be difficult to set up. While packages such as Ultra Monkey make setup easier, it is still generally more work to set up and maintain a system such as this than to use a hardware solution. Another possible drawback is that it can be more difficult to get support contracts to support these technologies. Many Linux distributions provide some support, and some commercial companies also provide support, but there is not a single source of support, as is possible with hardware solutions.