The Two Sides of the Cisco Unified CallManager Cluster
As a network administrator, you are accustomed to quite a bit of fiscal responsibility riding on your shoulders. Under the Cisco Unified Communications architecture, that responsibility grows considerably. Not only are you responsible for the operation of the data environment, but you are now also responsible for the company's voice network, which rides on top of this infrastructure. This voice network is critical to day-to-day business operations. Because of this, you should approach it just like any key network service: the more redundancy, the better.
A Cisco CallManager cluster is two or more servers grouped together to support a Cisco IP telephony network. The cluster relationship between Cisco CallManager servers provides redundancy and load balancing for the voice network. Cisco defines this cluster relationship in two ways: the SQL database structure and the intracluster run-time data.
The SQL Database Cluster
As discussed in Chapter 1, Cisco CallManager relies on Microsoft SQL Server 2000 as an information store for the data of the voice network. This data includes the phone extensions on the network, calling restrictions, route plan information, and so on. The database replication capability provided by Microsoft SQL Server makes clustering possible by allowing the same database to reside on multiple machines. Database replication makes it appear as if a single machine is handling call processing along with the other functions of the voice network. It also ensures that standby processors (Cisco CallManager servers) can seamlessly step in and take over those functions if the primary processor fails, and that all clustered Cisco CallManager servers have access to the same information.
You must have at least two Cisco CallManager servers to obtain this redundancy, and one of these servers must be a publisher database server. The publisher database server manages the only writable copy of the Microsoft SQL Server 2000 database. The subscriber database servers maintain read-only copies of the database. You can have only one publisher server and up to eight subscriber servers per cluster. It is these database servers that are able to actively participate in the call-processing functions of the Cisco voice network. This SQL limitation is also a key factor in determining the maximum size of a cluster, which is covered later in this chapter.
When you make changes to the Cisco CallManager configuration, these changes are made directly to the publisher server database. The publisher then replicates these changes to the subscriber servers. When the publisher server is offline, the Microsoft SQL Server 2000 database automatically locks, which prevents any database changes. The IP telephony network continues to operate, but you will not be able to add or configure any devices that are managed by Cisco CallManager. The only exception to this rule is Call Detail Records (CDRs), which record information about the calls occurring within the cluster. When the publisher is down, the subscribers store CDRs locally until the publisher comes back online, and then the subscribers update the publisher with the stored CDRs.
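The publisher and subscriber behavior described above can be illustrated with a small model. This is only a conceptual sketch in Python, not how Cisco CallManager or Microsoft SQL Server replication is implemented; the class names, method names, and the sample CDR string are hypothetical and exist purely for illustration.

```python
class Publisher:
    """Hypothetical model of the server holding the only writable database copy."""

    def __init__(self):
        self.online = True
        self.config = {}        # configuration data (extensions, route plans, ...)
        self.cdrs = []          # Call Detail Records collected from subscribers
        self.subscribers = []

    def write(self, key, value):
        # Configuration changes are refused while the publisher is offline.
        if not self.online:
            raise RuntimeError("publisher offline: database is locked")
        self.config[key] = value
        # Replicate the change to every subscriber's read-only copy.
        for sub in self.subscribers:
            sub.config[key] = value


class Subscriber:
    """Hypothetical model of a server holding a read-only database copy."""

    def __init__(self, publisher):
        self.publisher = publisher
        self.config = dict(publisher.config)
        self.pending_cdrs = []  # CDRs held while the publisher is down
        publisher.subscribers.append(self)

    def record_call(self, cdr):
        # CDRs are the exception: they are always accepted locally.
        self.pending_cdrs.append(cdr)
        self.flush_cdrs()

    def flush_cdrs(self):
        # Hand stored CDRs to the publisher once it is reachable.
        if self.publisher.online:
            self.publisher.cdrs.extend(self.pending_cdrs)
            self.pending_cdrs.clear()
```

In this model, taking the publisher offline makes `write` fail while `record_call` keeps working, and calling `flush_cdrs` after the publisher returns moves the stored records across, which mirrors the behavior the text describes.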
In Cisco CallManager Release 3.3 and later, a single cluster is capable of handling approximately 30,000 Cisco IP Phones. This cluster limitation does not restrict the size of the voice over IP (VoIP) network. By creating additional clusters, you can increase the network size. However, the more clusters you create in a network, the more management the network requires to operate.
Note
The 30,000 IP Phone maximum cluster size is only possible if you are using the MCS-7845 servers throughout your cluster deployment.
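As a rough planning aid, the number of clusters a deployment needs can be estimated from the approximate per-cluster figure quoted above. The sketch below is simple arithmetic, assuming the 30,000-phone figure applies (Release 3.3 or later on MCS-7845 servers); the function name is hypothetical, and a real design would also have to account for factors this chapter has not yet covered.

```python
import math

# Approximate per-cluster limit quoted in the text (Release 3.3+, MCS-7845 servers).
MAX_PHONES_PER_CLUSTER = 30_000


def clusters_needed(phone_count: int) -> int:
    """Estimate how many clusters are required for a given number of IP Phones."""
    if phone_count < 0:
        raise ValueError("phone_count must not be negative")
    # Round up: a partially filled cluster is still a whole cluster to manage.
    return math.ceil(phone_count / MAX_PHONES_PER_CLUSTER)
```

For example, a 45,000-phone network does not fit in one cluster, so the estimate rounds up to two.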
Intracluster Run-Time Data
The second communication method that defines the cluster relationship between Cisco CallManager servers is the intracluster run-time data, which is also called Intra-Cluster Communication Signaling (ICCS). This type of communication encompasses the "happenings" of the cluster. For example, when a new Cisco IP Phone connects to the network, it registers with its primary Cisco CallManager (IP address 10.5.5.1 in this example). That primary Cisco CallManager tells all the other servers in the cluster, "Hey everyone, a new phone just registered with me! I've given it the extension 4003." All the other servers now know to send calls directed at extension 4003 to the Cisco CallManager at the IP address 10.5.5.1, which, in turn, makes the Cisco IP Phone ring.
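The registration announcement in this example can be pictured as every server updating a table that maps an extension to the server holding the phone. The following Python sketch is a conceptual model only, not the ICCS protocol itself; the `CallManager` class, its methods, and the second server address 10.5.5.2 are assumptions made for illustration.

```python
class CallManager:
    """Hypothetical model of one server in the cluster."""

    def __init__(self, ip, cluster):
        self.ip = ip
        self.cluster = cluster  # shared list of all servers in the cluster
        self.routes = {}        # extension -> IP of the server the phone registered with
        cluster.append(self)

    def register_phone(self, extension):
        # Announce the registration to every server, including this one.
        for server in self.cluster:
            server.routes[extension] = self.ip

    def route_call(self, extension):
        # Return the server that should receive the call, or None if unknown.
        return self.routes.get(extension)


cluster = []
primary = CallManager("10.5.5.1", cluster)
other = CallManager("10.5.5.2", cluster)  # illustrative second server
primary.register_phone("4003")
print(other.route_call("4003"))
```

After the announcement, the second server resolves extension 4003 to 10.5.5.1, the server that accepted the registration.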
After the initial phone registration, the Cisco IP Phone sends keepalive messages to the primary Cisco CallManager server every 30 seconds and sends a TCP connect message (which is technically a TCP three-way handshake) to its secondary Cisco CallManager server to ensure it is online and ready to accept a device failover, if necessary. When the Cisco IP Phone detects the failure of its TCP keepalive messages with the primary Cisco CallManager, the device attempts to register with a secondary Cisco CallManager server. The secondary Cisco CallManager server accepts the registration from the device and announces the new registration (using intracluster run-time data) to all of the Cisco CallManager servers in the cluster.
Note
If the Cisco CallManager service is ever stopped manually (through the Services control panel in Windows 2000 or a system shutdown), the Cisco CallManager server clears all of its active TCP connections with the IP Phones. This causes the phones to fail over immediately to their backup server rather than wait for the keepalive failure.
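The failover sequence can be summarized as a small state model: the phone stays with its primary server while keepalives succeed and registers with the secondary when they fail. This Python sketch is a simplification under stated assumptions. It does not run a real 30-second timer or open TCP connections, and the `Server` and `Phone` classes and their method names are hypothetical.

```python
class Server:
    """Hypothetical stand-in for a Cisco CallManager server."""

    def __init__(self, name):
        self.name = name
        self.online = True
        self.registered = set()  # extensions currently registered here


class Phone:
    KEEPALIVE_INTERVAL_S = 30  # interval quoted in the text; not enforced in this sketch

    def __init__(self, extension, primary, secondary):
        self.extension = extension
        self.primary = primary
        self.secondary = secondary
        self.active = None
        self._register(primary)

    def _register(self, server):
        if not server.online:
            raise ConnectionError(f"{server.name} is not reachable")
        server.registered.add(self.extension)
        self.active = server

    def keepalive_tick(self):
        """Simulate one keepalive cycle and return the server now in use."""
        # A failed keepalive to the primary triggers registration with the secondary.
        if not self.active.online and self.active is self.primary:
            self._register(self.secondary)
        return self.active
```

Setting `online = False` on the primary and calling `keepalive_tick()` moves the phone to the secondary, which is the point at which the secondary would announce the new registration to the rest of the cluster.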
Cisco IP Phone registration is just one example of intracluster run-time data. You will discover many other types of intracluster communication as this book introduces new concepts in upcoming chapters.
Cluster Redundancy Designs