CCNP BCMSN Exam Certification Guide (3rd Edition)
7-4. Managing Firewall Failover
By nature, firewall failover is a feature that can take action automatically, based on whether two firewalls are operational and connected. You might want to monitor or troubleshoot the failover mechanism on a failover pair so that you can verify its operation. As well, there might be occasions when you need to manually force the failover action between the peers. The following sections cover these topics.
Displaying Information About Failover
When you connect to a firewall remotely, it isn't always apparent which unit is the active one. Because the active unit configuration is replicated to the standby unit, the command-line prompt (and the underlying host name) is identical on both units. This can make interacting with the correct firewall very difficult. After you connect to a firewall, use the show failover command to determine the state of that unit, as shown in the following example:
Firewall# show failover
Failover On
Cable status: Normal
Reconnect timeout 0:00:00
Poll frequency 15 seconds
    This host: Primary Active
        Active time: 2421015 (sec)
        Interface stateful (192.168.199.1): Normal
        Interface dmz2 (127.0.0.1): Link Down (Shutdown)
        Interface outside (192.168.1.1): Normal
        Interface inside (192.168.254.1): Normal
    Other host: Secondary Standby
        Active time: 0 (sec)
        Interface stateful (192.168.199.2): Normal
        Interface dmz2 (0.0.0.0): Link Down (Shutdown)
        Interface outside (192.168.1.2): Normal
        Interface inside (192.168.254.2): Normal
Remember that you should make configuration changes to only the active unit, because those changes are replicated in only one direction: active to standby. Active-active failover takes this one step further: configuration changes to the system execution space or the admin context must be made on the firewall unit that is active for failover group 1. If you attempt to configure the standby unit, the standby firewall displays a warning that the configurations are no longer synchronized.
In the case of active-active failover, this gets a little more complicated. Now, a firewall can be either the primary or secondary unit, but it can be active in some contexts while being standby in others.
You can find out which failover group the firewall is active in by using the show failover command in the system execution space, as shown in the following example:
Firewall# show failover
Failover On
Cable status: N/A - LAN-based failover enabled
Failover unit Primary
Failover LAN Interface: Failover Ethernet2 (up)
Unit Poll frequency 3 seconds, holdtime 9 seconds
Interface Poll frequency 15 seconds
Interface Policy 1
Monitored Interfaces 3 of 250 maximum
Group 1 last failover at: 13:10:46 EST Dec 9 2004
Group 2 last failover at: 13:10:04 EST Dec 9 2004
  This host:    Primary
  Group 1       State:          Active
                Active time:    149706 (sec)
  Group 2       State:          Standby Ready
                Active time:    121650 (sec)
[output omitted]
  Other host:   Secondary
  Group 1       State:          Standby Ready
                Active time:    120936 (sec)
  Group 2       State:          Active
                Active time:    148995 (sec)
If you can't enable failover, check the status of your firewall license with the show activation-key or show version command. The following example shows the results for a PIX Firewall running 7.0:
Firewall# show activation-key
Serial Number: 801021134
Running Activation Key: 0x7411c36d 0x639a94fa 0xa3f0b034 0x913c0374 0x3f3632ba
License Features for this Platform:
Maximum Physical Interfaces : 6
Maximum VLANs               : 25
Inside Hosts                : Unlimited
Failover                    : Active/Active
VPN-DES                     : Enabled
VPN-3DES-AES                : Enabled
Cut-through Proxy           : Enabled
Guards                      : Enabled
URL-filtering               : Enabled
Security Contexts           : 5
GTP/GPRS                    : Enabled
VPN Peers                   : Unlimited
This machine has an Unrestricted (UR) license.
The flash activation key is the SAME as the running key.
Firewall#
In the example, the firewall has an "Unrestricted" license, which allows any type of standalone or failover operation, including "Active/Active" mode.
Displaying the Current Failover Status
You can use the following command to display a summary of the current failover status:
Firewall# show failover
The output from this command displays the configured failover state (on or off), along with failover cable status, the last failover date and time, the failover roles (primary or secondary) for both units, the firewall role (active or standby) for both units, the status of each configured interface, and the statistics for the stateful failover link (if configured).
PIX 7.x also presents this information for each failover group (1 and 2). Within each group, the status of each of the security contexts and its allocated interfaces are shown. For example, the system execution space on the primary firewall has the following output. Notice that at a glance, the shaded text gives a snapshot of every state and role involved in failover:
Firewall# show failover
Failover On
Cable status: N/A - LAN-based failover enabled
Failover unit Primary
Failover LAN Interface: Failover Ethernet2 (up)
Unit Poll frequency 3 seconds, holdtime 9 seconds
Interface Poll frequency 15 seconds
Interface Policy 1
Monitored Interfaces 3 of 250 maximum
Group 1 last failover at: 13:11:02 EST Dec 7 2004
Group 2 last failover at: 15:01:04 EST Dec 7 2004
  This host:    Primary
  Group 1       State:          Active
                Active time:    7536 (sec)
  Group 2       State:          Standby Ready
                Active time:    663 (sec)
                admin Interface outside (192.168.93.138): Normal
                CustomerA Interface outside (192.168.93.139): Normal
                CustomerA Interface inside (192.168.200.10): Normal (Not-Monitored)
                CustomerB Interface outside (192.168.93.143): Normal
                CustomerB Interface inside (192.168.220.11): Normal (Not-Monitored)
  Other host:   Secondary
  Group 1       State:          Standby Ready
                Active time:    0 (sec)
  Group 2       State:          Active
                Active time:    6879 (sec)
                admin Interface outside (128.163.93.141): Normal
                CustomerA Interface outside (128.163.93.142): Normal
                CustomerA Interface inside (192.168.200.11): Normal (Not-Monitored)
                CustomerB Interface outside (128.163.93.140): Normal
                CustomerB Interface inside (192.168.220.10): Normal (Not-Monitored)
Stateful Failover Logical Update Statistics
        Link : Failover Ethernet2 (up)
        Stateful Obj    xmit        xerr   rcv        rerr
        General         135508407   7      53412868   0
        sys cmd         266210      0      266207     0
        up time         14          0      0          0
        RPC services    0           0      0          0
        TCP conn        123228648   0      47758798   0
        UDP conn        663934      0      448445     0
        ARP tbl         6           0      0          0
        Xlate_Timeout   617643      0      556745     0
        Logical Update Queue Information
                        Cur     Max     Total
        Recv Q:         0       35      7519538
        Xmit Q:         0       1       18562497
Firewall#
The Stateful Failover Logical Update Statistics represent the number of connection or table synchronization update messages that the firewall has transmitted and received. The Logical Update Queue Information shows the number of stateful update messages that have been queued as they have been transmitted to or received from the failover peer. Nonzero values mean that more updates have been queued than could be processed. A large value might indicate that the stateful failover bandwidth needs to be increased, usually by choosing a faster interface.
To see the failover status information for just one failover group, you can use the following command:
Firewall# show failover group {1 | 2}
On a PIX 7.x Firewall, you can also get a quick summary of the failover status with the following command:
Firewall# show failover state
In the following example, the firewall is shown to be the primary unit with the active role, and the other peer is the secondary in standby. The configurations are synchronized, and the interface MAC addresses have been set according to the primary and secondary burned-in addresses. If one of the units had failed, a reason would be shown:
Firewall# show failover state
====My State===
Primary | Active |
====Other State===
Secondary | Standby |
====Configuration State===
        Sync Done
====Communication State===
        Mac set
=========Failed Reason==============
My Fail Reason:
Other Fail Reason:
Firewall#
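When many firewalls must be checked, the show failover state summary is simple enough to read programmatically. The following Python fragment is only a minimal sketch: it assumes the command output has already been captured as text (for example, from a saved terminal session), and the function names parse_failover_state and unit_role_and_state are hypothetical helpers, not part of any Cisco tool. The sample text is copied from the example above.

```python
import re

# Sample text copied from the "show failover state" example above.
SAMPLE = """\
====My State===
Primary | Active |
====Other State===
Secondary | Standby |
====Configuration State===
        Sync Done
====Communication State===
        Mac set
=========Failed Reason==============
My Fail Reason:
Other Fail Reason:
"""


def parse_failover_state(text):
    """Group the non-empty output lines under their ====Section=== headers."""
    sections = {}
    current = None
    for raw in text.splitlines():
        line = raw.strip()
        if not line:
            continue
        header = re.fullmatch(r"=+\s*(.+?)\s*=+", line)
        if header:
            current = header.group(1)
            sections[current] = []
        elif current is not None:
            sections[current].append(line)
    return sections


def unit_role_and_state(section_lines):
    """Split a line such as 'Primary | Active |' into (role, state)."""
    fields = [f.strip() for f in section_lines[0].split("|") if f.strip()]
    return fields[0], fields[1]


state = parse_failover_state(SAMPLE)
my_role, my_state = unit_role_and_state(state["My State"])
peer_role, peer_state = unit_role_and_state(state["Other State"])
print(f"This unit: {my_role}/{my_state}; peer: {peer_role}/{peer_state}")
```

A check like this could warn an operator before configuration changes are made on a unit whose state is not Active. The parsing rules are inferred from the single sample shown here, so other software releases might need adjustments.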
Displaying the LAN-Based Failover Interface Status
An FWSM or a firewall running PIX 7.x can't display LAN-based failover interface statistics. However, a firewall running PIX 6.x will display this information if you use the following command:
Firewall# show failover lan [detail]
For example, in the following output, the LAN-based failover interface is called lan-fo. It uses 192.168.1.1 and 192.168.1.2 on the two peers:
Firewall# show failover lan
LAN-based Failover is Active
        interface lan-fo (192.168.1.1): Normal, peer (192.168.1.2): Normal
Firewall#
You could see much more detail about the interface activity by adding the detail keyword, as shown in the following example. Notice that statistics are kept for the number of failover messages sent, received, dropped, and so on, as well as the response times for message exchanges with the failover peer (the shaded text):
Firewall# show failover lan detail
LAN-based Failover is Active
This PIX is Primary
Command Interface is lan-fo
My Command Interface IP is 192.168.198.1
Peer Command Interface IP is 192.168.198.2
My interface status is Normal
Peer interface status is Normal
Peer interface down time is 0x0
Total cmd msgs sent: 107856, rcvd: 107845, dropped: 1, retrans: 8, send_err: 0
Total secure msgs sent: 147375, rcvd: 147301
bad_signature: 0, bad_authen: 0, bad_hdr: 0, bad_osversion: 0, bad_length: 0
Total failed retx lck cnt: 0
Total/Cur/Max of 52719:0:3 msgs on retransQ, 52718 ack msgs
Cur/Max of 0:7 msgs on txq
Cur/Max of 0:34 msgs on rxq
Number of blk allocation failure: 0, cmd failure: 0, Flapping: 0
Current cmd window: 3, Slow cmd Ifc cnt: 0
Cmd Link down: 17, down and up: 0, Window Limit: 17266
Number of fmsg allocation failure: 0, duplicate msgs: 0
Cmd Response Time History stat:
        < 100ms:        52681
        100 - 250ms:    12
        250 - 500ms:    13
        500 - 750ms:    12
        750 - 1000ms:   0
        1000 - 2000ms:  4
        2000 - 4000ms:  1
        > 4000ms:       3
Cmd Response Retry History stat:
        Retry 0 = 52719, 1 = 4, 2 = 1, 3 = 1, 4 = 1
[output truncated]
Displaying a History of Failover State Changes
A firewall running PIX 7.x or FWSM 2.x keeps a running history of each time its failover state changes. Although the history events aren't recorded with a timestamp, the sequence of events can still be useful information. For example, if failover didn't come up correctly, you could trace through the history to see the sequence of state changes and the cause for each. You can see the history with the following command:
Firewall# show failover history
For example, the following output shows the failover state change history for a firewall running in multiple-context mode. Failover groups 0 (for system execution space failover), 1, and 2 are listed, because failover operates independently in each group. This sequence of state changes occurred as failover was configured for the first time. During the No Active unit found changes, the secondary peer had not yet been configured for failover.
Firewall# show failover history
==========================================================================
Group  From State               To State                 Reason
==========================================================================
0      Active Applying Config   Active Config Applied    No Active unit found
0      Active Config Applied    Active                   No Active unit found
1      Disabled                 Negotiation              Failover state check
2      Disabled                 Negotiation              Failover state check
2      Negotiation              Cold Standby             Detected an Active mate
1      Negotiation              Just Active              No Active unit found
1      Just Active              Active Drain             No Active unit found
1      Active Drain             Active Applying Config   No Active unit found
1      Active Applying Config   Active Config Applied    No Active unit found
1      Active Config Applied    Active                   No Active unit found
2      Cold Standby             Sync Config              Detected an Active mate
2      Sync Config              Sync File System         Detected an Active mate
2      Sync File System         Bulk Sync                Detected an Active mate
2      Bulk Sync                Standby Ready            Detected an Active mate
2      Standby Ready            Just Active              Set by the CI config cmd
2      Just Active              Active Drain             Set by the CI config cmd
2      Active Drain             Active Applying Config   Set by the CI config cmd
2      Active Applying Config   Active Config Applied    Set by the CI config cmd
2      Active Config Applied    Active                   Set by the CI config cmd
2      Active                   Standby Ready            Set by the CI config cmd
==========================================================================
Firewall#
Debugging Failover Activity
Table 7-1 summarizes some of the commands you can use to generate debugging information about firewall failover operation.
TIP Commands using the debug keyword produce real-time output for troubleshooting purposes. To see these messages, you must first enable logging output to the firewall console (logging console), to a Telnet or SSH session (logging monitor), to a logging buffer (logging buffered), or to a Syslog server (logging host). The debug output also must be sent to the Syslog destination with the logging debug-trace configuration command. See Chapter 9, "Firewall Logging," for more information.
Monitoring Stateful Failover
As soon as stateful failover is enabled, you should make sure your stateful failover interface isn't being overrun with stateful information packets. In other words, verify that the stateful interface bandwidth is sufficient for the load. Otherwise, information about some active connections will not be passed from the active to the standby firewall. If a failover occurs, these unknown connections are terminated.
To check the load in PIX 6.x and 7.x (single-context mode), you can make a quick manual estimate by using the show traffic command. Unfortunately, this command shows only cumulative values collected since the traffic counters were last cleared. The packets-per-second and bytes-per-second values are running averages computed over the time since the counters were last cleared. However, you can issue the clear traffic command on the active firewall to clear the counters, wait 10 seconds, and issue the show traffic command. You should do this during a peak load time so that you see a snapshot of the busiest stateful information exchange. The following example shows how this is done:
Firewall# clear traffic
Firewall# show traffic
stateful:
        received (in 9.050 secs):
                3 packets       395 bytes
                0 pkts/sec      43 bytes/sec
        transmitted (in 9.050 secs):
                84 packets      98682 bytes
                9 pkts/sec      10904 bytes/sec
[output deleted]
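The rates in the show traffic output are simply the byte totals divided by the elapsed time. The short Python sketch below reproduces that arithmetic with the figures from the example; the function name average_rate is hypothetical, and the conversion to bits per second is an added convenience for comparing the load with the interface speed, not something the firewall displays.

```python
def average_rate(total_bytes, seconds):
    """Average bytes per second over the measurement interval."""
    return total_bytes / seconds


# Figures taken from the show traffic example above.
tx_rate = average_rate(98682, 9.050)   # transmitted on the stateful interface
rx_rate = average_rate(395, 9.050)     # received on the stateful interface

# Whole bytes per second, matching the values in the example output.
print(int(tx_rate), int(rx_rate))      # 10904 43

# Multiply by 8 to compare against an interface speed quoted in bits per second.
tx_bits_per_second = tx_rate * 8
print(round(tx_bits_per_second))
```

In this example the stateful link carries roughly 87 kbps in the transmit direction, far below the capacity of even a 10-Mbps interface.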
In PIX 7.x multiple-context mode (active-active failover), things get a little more difficult. The interface used for stateful failover is defined and configured only in the system execution space, where there is no show traffic command. (That command is available in each security context; however, the stateful failover interface is not!) To gauge the stateful failover interface usage, you can use the show interface command instead. Issue that command and note the number of bytes shown. (This is a cumulative total, not a bytes-per-second rate.) Then, wait 10 seconds and issue the command again. Note the new byte count, subtract the two, and divide by 10. This gives you an estimate of the bytes per second being sent and received over the stateful interface.
You can also use PIX Device Manager (PDM) to generate statistics or a utilization graph of a stateful LAN interface. Running the graph over a period of time shows you the maximum bit rate that has been used to transfer stateful information. Figure 7-9 shows a sample PDM graph.
Figure 7-9. Using PDM or ASDM to Gauge Stateful Failover Traffic
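The manual show interface estimate described above (two readings of the cumulative byte counter taken 10 seconds apart) can be sketched in Python as follows. This is an illustration only: the function name bytes_per_second and the two counter values are hypothetical, and the sketch assumes you have already read the byte counts from the command output yourself.

```python
def bytes_per_second(first_count, second_count, interval_seconds=10):
    """Estimate throughput from two readings of a cumulative byte counter."""
    if interval_seconds <= 0:
        raise ValueError("interval must be positive")
    if second_count < first_count:
        # A smaller second reading suggests the counters were cleared
        # (or wrapped) between the two readings, so the estimate is invalid.
        raise ValueError("second reading is smaller than the first")
    return (second_count - first_count) / interval_seconds


# Hypothetical byte counts noted from two show interface readings, 10 seconds apart.
first_reading = 1_250_000
second_reading = 1_359_040

estimate = bytes_per_second(first_reading, second_reading)
print(estimate)  # 10904.0
```

As with the show traffic method, the readings are most useful when taken during a peak load period.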
Finally, the firewall performance itself affects the stateful failover operation. As stateful messages are generated, they are put into 256-byte memory blocks and placed in a queue before being sent to the failover peer. If the firewall cannot generate and send the stateful messages as fast as they are needed, more memory blocks are used. Although the firewall can allocate more 256-byte blocks as needed, the supply of these blocks can be exhausted in an extreme case. You can use the show blocks command as a gauge of the stateful failover performance. Over time, the 256-byte block "CNT" value should remain above 0. If it continues to hover around 0, the active firewall cannot keep the connection state information synchronized with the standby firewall. Most likely, a higher-performance firewall is needed.
Manually Intervening in Failover
When the firewalls in a failover pair detect a failure and take action, they do not automatically revert to their original failover roles. For example, if the primary firewall is active and then fails, it is marked as failed, and the secondary firewall takes over the active role. After the primary unit is repaired and returned to service, it doesn't automatically reclaim the active role (unless it has been configured to pre-empt active control). You might occasionally find that you need to manually intervene in the failover process to force a role change or to reset a failover condition. The commands discussed in the following sections should be used from configuration mode in PIX 6.x and in the system execution space in multiple-context mode in PIX 7.x.
Forcing a Role Change
Ordinarily, the firewalls fail over to each other automatically, without any intervention. However, they do not automatically fail back to their original roles. If for some reason you need to force one unit to become active again, you can use the following privileged EXEC command:
Firewall# [no] failover active [group {1 | 2}]
You can also force a unit into standby mode with the no failover active command. For PIX 7.x with active-active failover, you can specify the failover group (1 or 2) that will become active. For example, suppose the secondary firewall should be standby for failover group 1 and active for failover group 2. After a failure, it ends up in standby mode for both failover groups, as shown in the following output:
Firewall# show failover
Failover On
Cable status: N/A - LAN-based failover enabled
Failover unit Primary
Failover LAN Interface: Failover Ethernet2 (up)
Unit Poll frequency 3 seconds, holdtime 9 seconds
Interface Poll frequency 15 seconds
Interface Policy 2
Monitored Interfaces 3 of 250 maximum
Group 1 last failover at: 10:29:18 EST Jan 30 2005
Group 2 last failover at: 16:18:28 EST Mar 9 2005
  This host:    Secondary
  Group 1       State:          Standby Ready
                Active time:    3311601 (sec)
  Group 2       State:          Standby Ready
                Active time:    3304092 (sec)
To restore the secondary unit to the active role for failover group 2, you could take two different approaches:
Resetting a Failed Firewall Unit
If a firewall has been marked as failed but has been repaired or its connectivity restored, you might have to manually "unfail" it or reset its failover role. You can use the following privileged EXEC command:
Firewall# failover reset [group {1 | 2}]
You can use this command on either the active or failed unit. If it is issued on the active unit, the command is replicated to the failed unit, and only that unit's state is reset. In PIX 7.x, you can add the group keyword and failover group number for the firewall role to be reset.
Reloading a Hung Standby Unit
Sometimes, an active and standby firewall can communicate over a failover connection but cannot synchronize their failover operation. In this case, you can manually force the standby unit to reload and reinitialize its failover role with the following PIX 7.x command:
Firewall# failover reload-standby
After the reload, it should resynchronize with the active unit.