MySQL Cluster Errors
You may encounter errors at any stage. The following sections describe common errors and their solutions.
NDBD Won't Connect
You may experience an error such as the following when you attempt to start ndbd:
[root@storage] ndbd
Unable to connect with connect string: nodeid=0,10.0.0.1:1186
This means one of two things:
- The management daemon (ndb_mgmd) has not been started.
- A firewall is preventing egress connections on port 1186 from the storage node to the management node, or ingress connections on port 1186 on the management node from the storage node. It may also mean that a firewall is blocking communication between cluster nodes on the various ports around 2200 that the nodes use to talk to each other. Firewall rules should allow communication between cluster nodes on all ports.
We strongly recommend that you not use a firewall with MySQL Cluster if at all possible. Cluster traffic is not secure, and you should not have it running on a public network. MySQL Cluster should always have a dedicated network for cluster traffic, and if clusters need to connect to the outside world, that connection should be made on a separate network with separate Ethernet cards and IP addresses.
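If you really cannot avoid running a firewall on the cluster hosts, the simplest policy that matches this advice is to allow all traffic between the cluster machines and drop everything else. The following iptables commands are a minimal sketch of that idea, assuming a hypothetical dedicated cluster network of 10.0.0.0/24 (as in the earlier examples) and that you still want SSH access to the machine; adapt the addresses to your own setup, and remember that rules added this way are not persistent across reboots:

# Run as root on each cluster host: trust the dedicated cluster network,
# keep SSH and established connections working, and drop everything else
iptables -A INPUT -i lo -j ACCEPT
iptables -A INPUT -m state --state ESTABLISHED,RELATED -j ACCEPT
iptables -A INPUT -s 10.0.0.0/24 -j ACCEPT
iptables -A INPUT -p tcp --dport 22 -j ACCEPT
iptables -A INPUT -j DROP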
Finally, you should be aware that many servers with recent motherboards have on-board dual Gigabit Ethernet. If you have only two storage nodes, you should be able to use a crossover cable with no switch between your storage nodes. This gives good performance, although you still have the problem of keeping your management server secure and keeping traffic between the storage node and management server secure.
NDBD Won't Start
ndbd may throw an error when starting. This may be trivial, or it may indicate a serious problem. There are several things to look out for if this happens, and common errors and their solutions are listed here.
The following error means that the folder /var/lib/mysql-cluster does not exist:
Cannot become daemon: /var/lib/mysql-cluster/ndb_X.pid: open for write failed: no such file or directory
You need to create the folder /var/lib/mysql-cluster (or whatever you set DataDir to) and try again.
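For example, on a storage node that uses the default DataDir, the following is all that should be needed (substitute your own DataDir if you changed it in config.ini):

[root@storage] mkdir -p /var/lib/mysql-cluster
[root@storage] ndbd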
You might instead see an error like the following:

Error handler restarting system (unknown type: xxxxxxxxxxx)
Error handler shutdown completed - exiting
This indicates a more serious error that requires further investigation. There are two places to look for errors. The first place to look, the DataDir on the storage node, now contains a file named in the format ndb_pidXXXX_error.log, where XXXX is the process ID of the storage daemon. If you open this file, you should get some strong hints, such as the following:
Type of error: error
Message: Invalid Configuration fetched from Management Server
Fault ID: 2350
Problem data: Unable to alloc node id
Object of reference: Could not alloc node id at 10.0.0.1 port 1186: Connection done from wrong host ip 10.0.0.99.
ProgramName: ndbd
ProcessID: 2644
TraceFile:
Version 5.0.11 (beta)
***EOM***
This shows that you have attempted to connect to the management server from the server 10.0.0.99, but that server is not in the config.ini on the management server, so it has told the node to go away (Connection done from wrong host ip 10.0.0.99).
The second place to look for errors is on the management host itself, where you find the file ndb_X_cluster.log (where X is the node ID of the management daemon, normally 1). In here, you get a lot of information as well as the errors, but the following line should appear somewhere fairly near the bottom:
2005-08-31 12:21:52 [MgmSrvr] WARNING -- Allocate nodeid (0) failed. Connection from ip 10.0.0.99. Returned error string "Connection done from wrong host ip 10.0.0.99."
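In this particular case, the fix is to tell the management server about the host that is trying to connect: add an [NDBD] section for it to config.ini on the management node and then restart ndb_mgmd. A minimal sketch, using the address from the error above and the conventional DataDir used elsewhere in this chapter, would be:

[NDBD]
HostName=10.0.0.99
DataDir=/var/lib/mysql-cluster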
This example is fairly easy to track down because the logs tell you exactly what is wrong. Many errors do not give you such clear messages. Firewalls have a nasty habit of causing many people hours of grief because depending on where the firewall is, the various nodes can chuck out different messages or no message at all. Generally speaking, if you have a firewall, you should shut it down while installing the cluster and getting it to work, and then you should try starting your firewall and restarting your cluster.
If you see odd errors, the first thing to check is whether communication between the nodes and the management node is working. The best way to do this is to look in the management node's log file to see if a similar error appears there as well as in the storage node's error log. If the error is in both places, you know immediately that communication between the storage node and the management node is working fine. If it is not in both places, you know that you are looking for communication problems between the two. If communication between the two is not working, you can check for firewalls and use telnet ip.of.mgm.server 1186 from the storage node to try to connect. If the connection is refused, then either the management node did not start correctly, you have a networking problem between the nodes, or there is a firewall in the way on one of the nodes.
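For example, using the management server address from the earlier error message (10.0.0.1), a successful check from the storage node looks something like the following; "Connection refused" or a long hang instead of the "Connected" line points to the problems just described:

[root@storage] telnet 10.0.0.1 1186
Trying 10.0.0.1...
Connected to 10.0.0.1.
Escape character is '^]'.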
Arbitrator Decides to Shut Down a Node
When cluster nodes fail or networks connecting them fail, the nodes that make up the cluster undergo a complex set of decisions to decide what to do. Essentially, they attempt to avoid having two parts of the cluster split up and both of them continue to work, which results in a mess. Imagine taking two copies of your database, putting a copy on each of two machines for a few hours with lots of writes/updates to both machines, and then trying to merge the changes together at the end. It's a completely impossible task, so in order to keep your data consistent, the cluster must avoid such a partition. The technical name for such a mess is a "split brain" situation, where both halves of the brain get separated and although both sides are technically able to continue operating, one of them must shut down.
MySQL solves the split-brain problem by ensuring that even if the cluster is completely separated so a few nodes on one side of the partition cannot communicate with a few nodes on the other, they will not both continue to respond to queries; one half will be shut down. This means that MySQL might shut down nodes that are actually working fine as part of this process, and that is where the message "Arbitrator decided to shut down this node" comes from.
If you remove network cables between nodes for short periods of time, you will almost certainly discover that you can't just replace the network cable and continue on as usual. What happens is that if you unplug the network for long enough for the storage node to give up on connecting to the management node (or, more accurately, to fail to receive the critical number of heartbeats), the ndbd process will die.
This, however, is not split-brain because it is just one node dying. The following example helps explain the concept of split-brain and arbitration in more detail:
Five physical servers:
Server 1: Storage node 1
Server 2: Storage node 2, SQL node 1
Server 3: Storage node 3
Server 4: Storage node 4, SQL node 2
Server 5: Management node
Note
For the sake of simplicity, we assume that Server5 is the arbitrator in the cluster, which it is unless you have specifically configured another node to take over this role or unless the management node is dead, in which case the storage nodes elect one of their own to act as arbitrator.
If we set NoOfReplicas to 2, you should know that the four storage nodes will divide themselves up into groups of two (so you will have two node groups, each with two nodes in it):
Node Group 1: Storage nodes 1 and 2
Node Group 2: Storage nodes 3 and 4
Now, imagine that Physical Server 1 and Physical Server 2 are connected to one hub and that Physical Servers 3, 4, and 5 are connected to another hub. The two hubs are connected to a switch. Now consider the various failure options:
- Physical Server 5 dies: In this case, nothing happens. The cluster storage nodes elect a new arbitrator, and everything continues to work; when the management daemon is restarted, it simply rejoins the cluster.
- Physical Server 4 dies: Storage Node 3, on Physical Server 3 and in the same node group as Storage Node 4, takes over the complete load, and the cluster continues to work. As soon as Physical Server 4 recovers, ndbd is restarted and rejoins the cluster.
- The switch fails: This is interesting. At the moment, there are two separate systems:
- System 1: Storage Node 1, Storage Node 2, and SQL Node 1
- System 2: Storage Node 3, Storage Node 4, the Management Node, and SQL Node 2
Based on what you know already, you might think that both SQL nodes would continue to work; after all, both systems have a complete copy of the data and a SQL node. The first system does not have a management node, but as you know, the loss of a management node should not cause a cluster to crash.
As explained earlier in this section, however, you can imagine the problems if both SQL nodes continued accepting queries and then the switch was fixed. What would happen? You'd have a complete mess.
MySQL Cluster avoids this mess with the following logic, which each storage node goes through if it can no longer communicate with another node:
- Can I see more than 50% of the storage nodes in the cluster? If so, fine: Continue working.
- Can I see fewer than 50% of the storage nodes in the cluster? If so, not fine: Die.
- Can I see exactly 50% of the storage nodes in the cluster (as in the preceding example, where each storage node can see exactly two storage nodes, including itself, out of four)? If so, can I see the arbitrator (typically, but not necessarily, the management node)? If so, fine: Continue working. Otherwise, die.
If you apply this logic to the preceding example, you can see that the first system will die because each storage node can see 50% of the storage nodes in the cluster but it cannot see the arbitrator (the management node).
This gets even more complicated if you have multiple node groups. Remember that each node group is made up of nodes with identical data on them. Therefore, if all nodes in any one node group can see each other, network partitioning is not an issue, because the other nodes clearly cannot form a viable cluster without any node from that node group.
Therefore, in clusters with multiple node groups (that is, not where the number of replicas is equal to the number of nodes, such as in the basic two-node/two-replica cluster), the logic that each node follows in the event of it not being able to communicate with other nodes is as follows:
- Can I see at least one node in each node group? If not, shut down (because I don't have access to a complete set of data).
- Can I see all the nodes in any one node group? If so, stay alive (no other set of nodes can form a complete cluster without that node group).
- If I can't see all nodes in any one node group but I can see at least one in each node group, then can I see the arbitrator? If so, stay alive. Otherwise, die to prevent split-brain problems.
If the cluster gets split in such a way that all nodes can still contact the arbitrator (but some cannot contact each other), the arbitrator keeps the first valid set of nodes to connect and tells the others to die. However, such a cluster will not normally be able to survive because storage nodes need to communicate with each other to complete queries.
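If you suspect that arbitration is behind a node shutdown, the quickest confirmation is usually the cluster log on the management node, which records the arbitrator's decisions alongside the node failure events. A simple search such as the following (assuming node ID 1 for the management daemon and the usual /var/lib/mysql-cluster DataDir) pulls out lines like the "Arbitrator decided to shut down this node" message mentioned earlier:

# Show arbitration-related lines from the management node's cluster log
[root@mgm] grep -i arbitrat /var/lib/mysql-cluster/ndb_1_cluster.log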
Table Is Full
The error "Table is full" occurs if your DataMemory is set too low. Keep in mind that the default value for DataMemory is very low, so if it is not set in your config.ini, chances are that you will run out as soon as you try to import your tables.
Changing the DataMemory setting requires a restart of all storage nodes with the --initial parameter, so it is wise to set this parameter high so you do not encounter this problem.
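The relevant settings live in the [NDBD DEFAULT] section of config.ini. The sketch below simply reuses the figures from the multiple-storage-node example later in this chapter; size the values from your own data rather than copying them:

[NDBD DEFAULT]
# Holds table data and ordered indexes; too small a value causes "Table is full"
DataMemory=1000MB
# Holds hash indexes for primary and unique keys
IndexMemory=200MB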
Error 708
Error 708 means that you have run out of attribute metadata records, so you need to increase the value of MaxNoOfAttributes.
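The parameter lives in the [NDBD DEFAULT] section of config.ini and takes effect after the storage nodes are restarted. The value below is purely illustrative; pick a figure comfortably above the total number of columns across all your NDB tables:

[NDBD DEFAULT]
# Illustrative value only; must exceed the total number of column definitions
# (attributes) across all NDB tables, with room to grow
MaxNoOfAttributes=5000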
DBTUP Could Not Allocate Memory
The error "DBTUP could not allocate memory" can be caused by several things. First, it could result from your system being unable to address the amount of RAM that you have asked it to address. (Some Linux systems have a hard limit; FreeBSD has a maximum process size that can be changed by setting kern.maxdsiz=1000000000 in /boot/loader.conf.) If you have hit a hard limit, you can try running multiple storage nodes per server (see the section "Multiple Storage Nodes per Server," later in this chapter) and make sure you are running a recent version of your kernel.
The other cause of this problem is setting some configuration options far too high (for example, MaxNoOfLocalOperations=100000). The problem occurs because NDB uses a standard malloc() call and needs it to return a contiguous piece of memory for each memory area; in some versions of Linux, this can fail even if there is enough virtual memory and succeed even if there is not. If you are getting odd errors, it is best to reduce these parameters and see if that fixes the problem.
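As a quick test, you might back such a parameter off to a more modest figure in config.ini and restart the storage nodes to see whether the allocation then succeeds; the value here is only an example, not a recommendation:

[NDBD DEFAULT]
# Example of a more modest setting than the oversized 100000 mentioned above
MaxNoOfLocalOperations=10000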
Multiple Storage Nodes per Server
You might face a problem allocating memory, getting an error such as "Memory allocation failure" or "DBTUP could not allocate memory for page," which may be due to your kernel not being able to allocate large quantities of RAM to one process. With FreeBSD, the simple solution is to add or change the following line in /boot/loader.conf:
kern.maxdsiz="1572864000"
Note
This is a value in bytes, so the preceding value allows 1.5GB per process; you can change this as appropriate. You should not set it to a value above the amount of your system RAM, and it is a good idea to set it a bit lower to avoid having your system swap excessively.
Note that changing this parameter does not allow you to avoid the maximum amount of memory that any 32-bit architecture can allocate (that is, just under 4GB). This parameter just allows you to allocate more of the system RAM to one individual process.
The other solution is to run multiple ndbd processes on each server, each with a smaller amount of RAM usage. To do this, you need to specify different data directories for each process, and we recommend that you specify the ID as well. You should also make sure that each node group has nodes on different machines; otherwise, you will get a warning explaining that a host failure (that is, a failure of one server) will bring down your cluster. The easiest way of setting up multiple storage nodes per physical server in config.ini is to specify your nodes in the following order (assuming that NoOfReplicas is set to 2):
Server 1, node 1 (will become Node Group 1)
Server 2, node 1 (will become Node Group 1)
Server 1, node 2 (will become Node Group 2)
Server 2, node 2 (will become Node Group 2)
Note
The reason for this order is that the cluster automatically forms the first node group out of the first NoOfReplicas nodes in the config.ini, so in this case, the first two nodes (because NoOfReplicas is set to 2) become Node Group 1, and the final two become Node Group 2. You can experiment with modifying this config.ini to suit different values for NoOfReplicas.
For example, for a setup with two physical servers, each running two storage node processes with 1GB of DataMemory and 200MB of IndexMemory per process, you would use the following config.ini file:
[NDBD DEFAULT]
NoOfReplicas=2
DataMemory=1000MB
IndexMemory=200MB
#Other parameters such as MaxNoOfConcurrentTransactions would probably
#need to be defined here for such a large cluster

#*** MGM NODE ***
[NDB_MGMD]
id=1
HostName=192.168.254.20

#*** Storage Nodes ***

#SERVER 1 - Node 1
[NDBD]
id=10
HostName=192.168.254.21
DataDir=/var/lib/mysql-cluster

#SERVER 2 - Node 1
[NDBD]
id=11
HostName=192.168.254.22
DataDir=/var/lib/mysql-cluster

#SERVER 1 - Node 2
[NDBD]
id=20
HostName=192.168.254.21
DataDir=/var/lib/mysql-cluster2

#SERVER 2 - Node 2
[NDBD]
id=21
HostName=192.168.254.22
DataDir=/var/lib/mysql-cluster2

#*** SQL NODES ***
[MYSQLD]
id=30
[MYSQLD]
id=31
You would then run a number of commands to start the cluster.
On the management server (192.168.254.20 in the preceding example), you would run the following:
[root@mgm] cd /var/lib/mysql-cluster
[root@mgm] ndb_mgmd
On the first storage node (192.168.254.21 in the preceding example), you would run the following:
[root@storage1] ndbd --ndb-nodeid=10 --initial
[root@storage1] ndbd --ndb-nodeid=20 --initial
On the second storage node (192.168.254.22 in the preceding example), you would run the following:
[root@storage2] ndbd --ndb-nodeid=11 --initial
[root@storage2] ndbd --ndb-nodeid=21 --initial
You would then get the following output from running a SHOW command in the management client:
ndb_mgm> SHOW
Connected to Management Server at: 192.168.254.20:1186
Cluster Configuration
---------------------
[ndbd(NDB)]     4 node(s)
id=10   @192.168.254.21 (Version: 5.0.13, Nodegroup: 0, Master)
id=11   @192.168.254.22 (Version: 5.0.13, Nodegroup: 0)
id=20   @192.168.254.21 (Version: 5.0.13, Nodegroup: 1)
id=21   @192.168.254.22 (Version: 5.0.13, Nodegroup: 1)

[ndb_mgmd(MGM)] 1 node(s)
id=1    @192.168.254.20 (Version: 5.0.13)

[mysqld(API)]   2 node(s)
id=30   @192.168.254.22 (Version: 5.0.13)
id=31   @192.168.254.21 (Version: 5.0.13)
You can extend this approach to run larger numbers of storage nodes per server; if you have 64-bit storage nodes and very large amounts of RAM, you can run a great many storage node processes per machine. Even though 64-bit servers can potentially run just one storage node each with a very large amount of RAM, you might prefer to run a larger number of processes, each addressing less RAM, to make them easier to manage and control as well as to help you track down freak crashes.
Cluster Nodes Apparently Randomly Shut Down or Crash
There are undoubtedly some bugs left in the current versions of MySQL Cluster (through 5.0), and you might be unlucky enough to hit one of them. In some situations, individual nodes or the entire cluster may shut down for some reason.
We cannot possibly describe the solution to every possible crash (especially because many crashes result from bugs that are fixed in future releases), but we can give some general advice on what to do if such an event occurs:
- Back up your DataDir on all your nodes. You might then want to delete all the files in the DataDir on some nodes as part of the recovery process. Having a backup allows you to report a bug or figure out what went wrong after the smoke clears and you have recovered.
- If you have a full cluster shutdown, try to get one node in each node group working first. This can allow you to at least do a partial system restart and resume serving queries.
- If you find that after running ndbd, nothing happens for a while and then it suddenly exits, make sure to look in the log files (see the earlier advice on the location of log files). You can also see what phase it gets stuck in by using <id> STATUS in the management client while it is starting. (You can find a list of what each phase does in the section "Startup Phases" in Chapter 1, "Installation.") Keep in mind that if you have a large database, it can take a very long time to actually start ndbd. A good thing to check is whether your RAM usage is increasing during startup and whether CPU and network usage are high. During different phases, you should either have a lot of CPU usage or have RAM usage steadily increasing. If neither of these things is occurring for a sustained period of time, you probably have a problem.
- Make sure you always have a very recent backup. MySQL Cluster supports online backups (see Chapter 3, "Backup and Recovery"), so there is no excuse for not regularly backing up your cluster; a minimal sketch of taking one appears after this list. This means that if for some reason you lose all your data, you simply recover config.ini from a backup (and it is always a good idea to keep config.ini on a floppy disk somewhere) and start all your storage nodes with --initial before importing the backup.
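The following is a minimal sketch of the two kinds of backup mentioned in this list, using the paths and prompts from the earlier examples in this chapter; adjust the DataDir and the archive location to your own layout:

# On each storage node, preserve the DataDir before deleting anything from it
[root@storage1] tar czf /root/ndb-datadir-backup.tar.gz /var/lib/mysql-cluster

# From the management client, take a normal online backup of the whole cluster
ndb_mgm> START BACKUP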
Note that recent versions of MySQL Cluster are very stable, and it is incredibly unusual to suffer the complete cluster shutdowns that were common in earlier releases of the software.
If you get odd errors, you can find help in several places:
- The cluster mailing lists: Visit http://lists.mysql.com/cluster/ and search for the error code or description you are getting. You might find that you are not the first person to suffer from this problem.
- The MySQL Cluster forums: At http://forums.mysql.com/list.php?25, run a search to see if others have experienced your error, and you may find a solution.
- Google: A search string such as "NDB error <error #>" at Google can be quite effective. Make sure you also search Google Groups. If you have an error description, enclose it in quotation marks, and you might get fewer, but better, results.
If you can't track down any other mentions of your problem or if none of the remedies you find fix your problem, you should report the problem to the cluster mailing list, at cluster@lists.mysql.com, and someone will attempt to work out what went wrong. If you have hit a bug, you will probably be surprised at how fast a developer will produce a patch.
You need to make sure to explain exactly what you have done and include your config.ini file and a copy of the exact error you are getting, as well as a list of what you have already tried. You should try to give your message a sensible subject, such as "NDB error xxx" rather than "help please"; this makes it easier for others to find your thread if they suffer the same problem or have a solution.