Microsoft Exchange 2000 Server Administrator's Companion
The Cluster service consists of several components that must cooperate in order for the cluster to function properly. This section briefly describes these components and how they relate to the overall Cluster service.
Node Manager
The Node Manager runs on each node in the cluster and maintains a local list of the nodes that belong to the cluster. It is also the Node Manager that sends the heartbeat between the nodes, in the form of a User Datagram Protocol (UDP) packet transmitted every 1.2 seconds (Figure 20-11).
Figure 20-11. Packet trace showing UDP heartbeat packets.
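To make the mechanism concrete, here is a minimal sketch, in Python, of a node sending such a heartbeat. It is purely illustrative; the port number, peer address, node name, and message format are assumptions made for the example, not the Cluster service's actual wire format.

import socket
import time

HEARTBEAT_INTERVAL = 1.2     # seconds between heartbeats
HEARTBEAT_PORT = 3343        # illustrative value; not necessarily the port the Cluster service uses
PEER_ADDRESS = "10.0.0.2"    # address of the other cluster node (example value)
NODE_NAME = "NODE1"          # this node's name (example value)

def send_heartbeats():
    """Send a small UDP datagram to the peer node every 1.2 seconds."""
    sock = socket.socket(socket.AF_INET, socket.SOCK_DGRAM)
    sequence = 0
    while True:
        message = f"{NODE_NAME}:{sequence}".encode()
        sock.sendto(message, (PEER_ADDRESS, HEARTBEAT_PORT))
        sequence += 1
        time.sleep(HEARTBEAT_INTERVAL)

if __name__ == "__main__":
    send_heartbeats()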
Should one of the nodes fail, another node in the cluster broadcasts a message to the entire cluster, causing all members to verify their current membership lists. This is called a regroup event. It is essential for failover purposes that all nodes in the cluster have exactly the same view of cluster membership. During the regroup event, the Cluster service prevents write operations to any common disk devices until the membership has stabilized.
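A rough sketch of that failure-detection and regroup logic follows. The class, the thresholds, and the broadcast stub are all invented for illustration and greatly simplify what the Cluster service actually does.

import time

HEARTBEAT_INTERVAL = 1.2   # seconds between heartbeats
MISSED_LIMIT = 2           # heartbeat periods a node may miss before it is suspected

def broadcast(message):
    """Stand-in for sending a message to every node in the cluster."""
    print("broadcast:", message)

class MembershipView:
    """One node's local view of cluster membership (illustrative model only)."""

    def __init__(self, peer_nodes):
        self.members = set(peer_nodes)
        self.last_heartbeat = {node: time.monotonic() for node in peer_nodes}
        self.writes_frozen = False

    def record_heartbeat(self, node):
        self.last_heartbeat[node] = time.monotonic()

    def check_peers(self):
        """Detect silent peers and run a regroup event if any are found."""
        now = time.monotonic()
        silent = [node for node in self.members
                  if now - self.last_heartbeat[node] > MISSED_LIMIT * HEARTBEAT_INTERVAL]
        if silent:
            self.regroup(silent)

    def regroup(self, silent_nodes):
        # Block writes to the shared disks until membership has stabilized.
        self.writes_frozen = True
        broadcast("verify cluster membership")   # every member confirms its view
        self.members -= set(silent_nodes)        # drop the failed node(s)
        self.writes_frozen = False                # membership is stable again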
REAL WORLD Heartbeat and Failover Architecture
The heartbeat between the servers in the cluster is very important. Understanding how the heartbeat architecture works will aid your administration of a cluster environment. Here we give a very basic introduction to how the heartbeat works in two different scenarios. To learn more about the heartbeat and failover architecture, consult the Microsoft Windows 2000 Server Resource Kit.
Scenario #1: Public Network Passes the Heartbeat
If a heartbeat is not received within two heartbeat periods and the LAN connection is configured for client-to-cluster communication, the Cluster service initiates a test for each node to determine its ability to communicate with external hosts. By definition, this test includes computers that are on the same subnet as the cluster nodes and that have a connection to a cluster node. The Cluster service uses the PING command to perform this test.
For instance, if the nodes in the cluster are unable to communicate with one another, but one of the nodes is able to communicate with an external host, that node's network connectivity is still good, and it will take ownership of the cluster resources that depend on LAN connectivity.
Scenario #2: Private Network Passes the Heartbeat
If the heartbeat is passing over a private LAN, there are no external computers to which a PING command can be sent. In this situation, the Cluster service must use the quorum resource to arbitrate which node should remain up and running.
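Reduced to pseudologic, the two scenarios above look something like the following sketch. The function names and parameters are invented for illustration, and the real quorum arbitration protocol is considerably more involved than a single flag.

import subprocess

def can_reach(host):
    """Return True if a single ping to the host succeeds (Windows ping syntax: -n sets the echo count)."""
    result = subprocess.run(["ping", "-n", "1", host], capture_output=True)
    return result.returncode == 0

def this_node_should_stay_up(heartbeat_network_is_public, external_hosts,
                             this_node_owns_quorum):
    """Decide whether this node should keep the cluster resources after
    heartbeats are missed (illustrative model of the two scenarios)."""
    if heartbeat_network_is_public:
        # Scenario #1: the heartbeat shares the client LAN, so test whether
        # this node can still reach hosts on the same subnet.
        return any(can_reach(host) for host in external_hosts)
    else:
        # Scenario #2: the heartbeat uses a private network, so there is
        # nothing external to ping; the node that can claim the quorum
        # resource is the one that remains up.
        return this_node_owns_quorum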
Configuration Database Manager
The Configuration Database Manager (CDM) maintains the cluster configuration database. This database contains vital information about the cluster itself, such as the cluster node membership and resource types, as well as information regarding the specific resources, such as IP addresses and physical disks. The CDM runs on each node in the cluster, and each CDM communicates with its peers on the other nodes in the cluster to maintain persistent and consistent information across the cluster.
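In very simplified form, the contents of the configuration database can be pictured as follows. The structure and every value shown here are examples chosen for illustration, not the database's actual schema.

# A simplified, illustrative model of the kind of information the cluster
# configuration database holds; key names and values are example data only.
cluster_database = {
    "nodes": ["EXCH-NODE1", "EXCH-NODE2"],            # cluster membership
    "resource_types": ["IP Address", "Network Name",
                       "Physical Disk", "Exchange System Attendant"],
    "resources": {
        "Cluster IP Address": {"type": "IP Address", "address": "10.0.0.10"},
        "Disk E:":            {"type": "Physical Disk", "signature": "0x1A2B3C4D"},
    },
}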
Log Manager
The Log Manager regularly verifies each node's copy of the cluster database to ensure consistency. Working with the Checkpoint Manager, it ensures that the quorum resource's recovery log contains the most recent cluster database information.
Checkpoint Manager
Cluster-aware applications, such as Exchange 2000 Server, use the configuration database to store information. Applications that are not cluster aware can store information in the node's registry. The Checkpoint Manager maintains this registry information, called checkpoint information, in the quorum resource recovery log. Working with the Log Manager, it also helps ensure that the quorum resource recovery log contains the most recent cluster database information.
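The following fragment is a loose model of that checkpointing idea, with invented names; the real Checkpoint Manager works against the Windows registry and the quorum disk rather than Python objects.

def take_registry_checkpoint(resource_name, registry_keys, quorum_recovery_log):
    """Copy a resource's registry settings (its checkpoint information) into
    the quorum recovery log so that whichever node owns the resource next
    can restore the same settings. Illustrative model only."""
    checkpoint = {"resource": resource_name, "keys": dict(registry_keys)}
    quorum_recovery_log.append(checkpoint)
    return checkpoint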
Resource Manager
The Resource Manager is responsible for stopping and starting resources, managing their dependencies, and initiating failover of resource groups. It uses information received from the Resource Monitor to make its decisions.
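Dependency handling is easiest to picture as bringing the resources in a group online in dependency order. The sketch below shows the idea; the resource names, the dependencies, and the bring_group_online function are all invented for the example.

def bring_group_online(resources, depends_on, start_resource):
    """Bring every resource in a group online in dependency order
    (illustrative only; a resource's dependencies must come online first).

    resources:      list of resource names in the group
    depends_on:     dict mapping a resource to the resources it requires
    start_resource: callable that actually starts one resource
    """
    started = set()

    def start(resource):
        if resource in started:
            return
        # A resource cannot come online until everything it depends on is online.
        for dependency in depends_on.get(resource, []):
            start(dependency)
        start_resource(resource)
        started.add(resource)

    for resource in resources:
        start(resource)

# Example: a made-up Exchange virtual server group. The network name depends
# on the IP address; the Exchange services depend on the network name and
# the shared disk. Using print as the starter simply lists the start order.
group = ["IP Address", "Network Name", "Disk E:", "Exchange System Attendant"]
dependencies = {
    "Network Name": ["IP Address"],
    "Exchange System Attendant": ["Network Name", "Disk E:"],
}
bring_group_online(group, dependencies, start_resource=print)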
Failover Manager
The Failover Manager decides which node should own each resource group. Once a node is chosen to own a resource group, the individual resources in that group are turned over to the Resource Manager for administration. When resources fail in any given resource group, the Failover Managers on each node work together to transfer ownership of those resources smoothly to another node in the cluster.
The algorithm for triggering a failover is the same whether the failover is planned or unexpected. The only difference is that in a planned failover, the services on the first server are shut down gracefully, whereas they are forcefully shut down in an unexpected failover.
When a node returns to operation, the Failover Manager can decide to move resource groups back to the recovered node. This is called a failback. A resource group must have a preferred owner defined in order to fail back to a recovered server. To protect against failback during peak processing times, you can also restrict failback to certain hours of the day.
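The failback decision can be pictured roughly as follows. The group fields, the failback window, and the node names are invented for the example and do not correspond to actual Cluster service property names.

from datetime import datetime, time as clock_time

def should_fail_back(group, recovered_node, now):
    """Decide whether a resource group should fail back to a recovered node.
    Illustrative model only."""
    if not group.get("failback_allowed", False):
        return False
    # A group fails back only to its defined preferred owner.
    if group.get("preferred_owner") != recovered_node:
        return False
    # Optionally restrict failback to off-peak hours of the day.
    window = group.get("failback_window")
    if window:
        start, end = window
        if not (start <= now.time() <= end):
            return False
    return True

# Example: allow the group to fail back to EXCH-NODE1 only between
# 1:00 A.M. and 5:00 A.M.
exchange_group = {
    "preferred_owner": "EXCH-NODE1",
    "failback_allowed": True,
    "failback_window": (clock_time(1, 0), clock_time(5, 0)),
}
print(should_fail_back(exchange_group, "EXCH-NODE1", datetime(2001, 3, 10, 2, 30)))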
Event Processor
The Event Processor manages communications between applications and the Cluster service. It is the initial entry point for joining a server to a cluster. When a server is coming back on line, it is the Event Processor that calls the Node Manager to begin the process of joining or forming a cluster.
Resource Monitor
The Resource Monitor is a passive communication layer that acts as an intermediary between the Cluster service and the resource DLLs. When the Cluster service makes a call for a resource, the Resource Monitor transfers that request to the appropriate resource DLL. The Resource Monitor runs as a separate process so that, if the Cluster service fails, it can take all of the resources off line and make them available for failover to another node in the cluster.
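Part of the Resource Monitor's job is to poll each resource's health on behalf of the Cluster service; in the Resource API these checks are the LooksAlive and IsAlive entry points exposed by each resource DLL. The sketch below is only a conceptual model of that polling loop, with invented method names, intervals, and callback, not the actual Resource Monitor implementation.

import time

class ResourceMonitor:
    """A conceptual model of the Resource Monitor's polling loop
    (illustrative only; intervals and method names are assumptions)."""

    def __init__(self, resources, report_failure):
        self.resources = resources            # objects standing in for resource DLLs
        self.report_failure = report_failure  # callback into the Cluster service

    def poll_once(self):
        for resource in self.resources:
            # Quick, cheap check first; fall back to a thorough check
            # only if the quick check raises doubt about the resource.
            if not resource.looks_alive():
                if not resource.is_alive():
                    self.report_failure(resource)

    def run(self, interval_seconds=5.0):
        while True:
            self.poll_once()
            time.sleep(interval_seconds)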