CCNP BCMSN Exam Certification Guide (3rd Edition)

2017-07-07 02:10:07

7-1. Firewall Failover Overview

When a single firewall is used in a network, the security it provides generally has the following attributes:

- Lower cost: Only one hardware platform and a software license are needed.
- Single point of failure: If the firewall hardware or software fails, no traffic can be forwarded from one side to the other.
- Performance is limited: The total throughput of the stateful inspection process is limited to the firewall's maximum performance.

If one firewall is potentially a single point of failure, it is logical to think that two firewalls would be better. Cisco firewalls can be made more available when they are configured to work as a failover pair. Firewall failover can operate in two different fashions: active-standby and active-active. The characteristics of each can be described as follows:

Two firewalls can act as an active-standby failover pair, having the following characteristics:
- Total cost is increased, because two firewalls must be used.
- The firewall pair can be physically separated, which removes the single point of failure.
- Performance is the same as that of a single firewall, because only one of the pair can actively inspect traffic at any time.
- If the active firewall fails, the standby firewall can take over traffic inspection.
- Active-standby failover is available on all Adaptive Security Appliance (ASA) and PIX platforms and software releases, as long as a single security context is configured. Firewall Services Module (FWSM) operates in active-standby failover whether in single- or multiple-context mode.

Two firewalls can act as an active-active failover pair, which requires two firewalls configured for multiple-security context mode. The characteristics of this functionality are as follows:
- Cost is doubled over that of a single firewall, because two fully functional firewalls must be used.
- The firewall pair can be physically separated.
- For each security context, one firewall takes on an active role, and the other is in standby mode.
- Performance can be doubled over that of a single firewall. The failover roles can be alternated across multiple contexts, allowing both firewalls to actively inspect traffic for different contexts simultaneously.
- If the active firewall for a context fails, the standby firewall for that context can take over traffic inspection.
- Active-active failover is available on PIX (515E, 525, and 535) and ASA platforms running release 7.x or greater.

How Failover Works

Firewall failover is currently available on the PIX 515E, 525, and 535 models; on the Catalyst 6500 FWSM; and on the ASA platforms.

Failover can be configured only if the firewall licensing enables it.

For active-standby failover, one firewall must have an "unrestricted" license, and the other has an "unrestricted" or "failover-only" license. The FWSM has active-standby failover enabled by default.

For active-active failover, both firewalls must have an "unrestricted" license. This is because both can actively inspect traffic at the same time.

Two identical firewall units can coexist as a failover or redundant pair by having their roles coordinated. In active-standby failover, one unit functions as the active unit and the other as the standby unit for all traffic inspection at any given time. One of the two firewalls always sits idle, waiting to take over the active role. Figure 7-1 illustrates this arrangement. The firewall on the left is active, and the one on the right is in standby mode.

Figure 7-1. Active-Standby Firewall Failover Concept

The two firewalls are in regular communication with each other over either a serial failover cable or a LAN-based connection. The firewalls can be configured for stateful failover so that the active unit keeps the standby unit synchronized with information about connections that are built or torn down. Each interface of the active unit must connect to the respective interface of the standby unit so that each firewall can monitor the health of the interfaces.

If a failure is detected on the active unit, the two firewalls effectively swap roles. Figure 7-2 shows this concept. The firewall on the right has moved from the standby role into the active role.

Figure 7-2. Active-Standby Failover After a Failure

In active-active failover, the firewalls still alternate their roles so that one unit is active and one is in standby. The difference is that the active-standby combination is carried out on a per-context basis, with each firewall running multiple security contexts. If the active-standby roles are alternated across different security contexts, both units can actively inspect traffic at the same time. Hence the term active-active failover, where neither unit is required to sit idle.

Figure 7-3 illustrates the active-active concept, in which each firewall is configured to run two separate security contexts, Context A and Context B. Now each context in one firewall can take on either the active or standby role, and the corresponding context in the other firewall takes on the alternate role. In the figure, the top firewall has the active role for Context A, and the bottom firewall is active for Context B.

Figure 7-3. Active-Active Firewall Failover Concept

If the active roles are divided appropriately across the firewalls, it becomes possible for both firewalls to be active on at least one context at any time. In other words, one whole firewall isn't required to sit idle.

During a failure in active-active failover, the two firewalls effectively swap roles, but only for contexts in which a failure is detected. In Figure 7-4, the entire top firewall has failed, rendering both of its contexts useless. The bottom firewall then takes on the active role for Context A and Context B, although it was already active for Context B.

Figure 7-4. Active-Active Failover After a Failure
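The per-context role swap described above can be sketched in a few lines of Python. This is only an illustrative model, not how the firewall software is implemented; the class name FailoverPair and the unit labels "top" and "bottom" (taken from the figure description) are assumptions made for the example.

```python
from dataclasses import dataclass, field


@dataclass
class FailoverPair:
    # Maps each context name to the unit currently holding its active role.
    active_unit: dict = field(default_factory=dict)
    units: tuple = ("top", "bottom")  # hypothetical labels for the two units

    def peer(self, unit):
        """Return the other unit of the pair."""
        first, second = self.units
        return second if unit == first else first

    def fail_unit(self, failed):
        """Move every context that was active on the failed unit to its peer."""
        for context, unit in self.active_unit.items():
            if unit == failed:
                self.active_unit[context] = self.peer(failed)


# Before the failure: each unit is active for one context.
pair = FailoverPair(active_unit={"Context A": "top", "Context B": "bottom"})

# The entire top firewall fails; only Context A actually changes hands.
pair.fail_unit("top")
print(pair.active_unit)
```

After the call, the bottom unit is active for both contexts, which matches the situation shown in Figure 7-4: Context B was already active there, so only Context A moves.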

Firewall Failover Roles

A failover pair of firewalls can be located together if needed. A pair of Catalyst 6500 FWSMs can even be located in a single-switch chassis. However, if the firewalls are geographically separated, they are less vulnerable to power or network outages or other disasters. Cisco firewalls can be separated and still function as a failover pair. Two FWSMs can also be split across a pair of switches.

The active unit performs all the firewall functions, whereas the standby only waits for the active unit to fail. At that time, the two units exchange roles until the next failure. In an active-active pair, the two exchange roles within each security context during a failure.

Configuration changes should always be made on the active unit. The firewall configurations are always coordinated between the two failover units using any of the following methods:

- The active unit automatically updates the running configuration of the standby unit as commands are entered, so the two are always synchronized.
- The copy running-config startup-config and write mem commands save the running configuration to Flash memory on the active unit and then to the Flash on the standby unit.
- The write standby command can be used to force the running configuration to be replicated from the active unit to the standby unit.

NOTE

Only the running configuration is kept automatically synchronized between failover peers. The startup configuration is not affected until you manually synchronize the two units by using the write memory or copy running-config startup-config command on the active unit.

Also, each firewall maintains its own Flash file system. Files are not replicated across Flash file systems as a part of failover. This means that each firewall must maintain its own operating system and management application images. To upgrade a software image, you must upgrade each of the failover peers independently.

From a physical standpoint, one firewall is configured to be the primary unit, and the other becomes the secondary unit. These roles are used only to determine the IP and MAC addresses of the active and standby units, not the active and standby roles. The following actions are taken based on the primary and secondary designations:

- The active unit takes on the MAC and IP addresses of the primary physical firewall on each interface.
- The standby unit takes on the MAC and IP addresses of the secondary physical firewall on each interface.
- The units toggle or swap these addresses after a failover occurs so that the addresses of the active unit interfaces are always consistent.
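The relationship between the physical designation (primary or secondary) and the addresses in use can be illustrated with a small Python sketch. The standby IP address and both MAC addresses below are hypothetical values chosen for the example, and the function name is an assumption.

```python
# Addresses configured for each physical designation on one interface.
# The MAC values and the .2 address are made up for illustration.
PRIMARY_ADDR = {"ip": "192.168.177.1", "mac": "0011.2233.4401"}
SECONDARY_ADDR = {"ip": "192.168.177.2", "mac": "0011.2233.4402"}


def addresses_in_use(active_unit):
    """Return the addresses each physical unit presents, given which one is active.

    The active unit always presents the primary addresses and the standby
    unit always presents the secondary addresses, regardless of which
    physical firewall currently holds each role.
    """
    standby_unit = "secondary" if active_unit == "primary" else "primary"
    return {active_unit: PRIMARY_ADDR, standby_unit: SECONDARY_ADDR}


# Normal operation: the primary unit is active.
print(addresses_in_use("primary"))

# After a failover: the secondary unit now answers on the primary addresses.
print(addresses_in_use("secondary"))
```

Because the active addresses never change from the point of view of neighboring hosts and routers, they do not need to learn a new next-hop address after a failover.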

The primary and secondary roles can be determined by one of the following configurations:

- Failover cable: A 6-foot serial cable connects the two firewalls. The "primary" end of the cable connects to the primary firewall (the firewall with an "unrestricted" license) and the "secondary" end to the secondary firewall. Configuration changes are replicated over the cable at 115.2 kbps. (The failover cable is unavailable on the ASA or FWSM platforms.)
- LAN-based failover: The two units communicate across a LAN connection. The firewalls can be separated up to the distance limitation of the LAN media. Primary and secondary roles are manually configured. Configuration changes are replicated across the LAN at a high speed.

NOTE

The failover pair of firewalls must be exactly the same model and have at least the minimum amount of RAM, the same amount of Flash memory, identical operating system releases, and compatible failover licensing.

Beginning with FWSM release 2.2 and PIX release 7.x, each of the firewalls can run different operating system maintenance releases during an image upgrade. The "hitless upgrade" or "zero downtime upgrade" feature allows failover operation to continue as long as the pair of firewalls is running the same software major and minor release. For example, failover can continue if one unit runs PIX 7.0(5) while the other runs 7.0(7).
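The release rule in the preceding note (same major and minor release, maintenance release free to differ) can be expressed as a short check. This is a minimal sketch that assumes release strings always follow the major.minor(maintenance) pattern used in the example above; the function names are hypothetical.

```python
import re


def major_minor(release):
    """Parse a release string such as '7.0(5)' into a (major, minor) tuple."""
    match = re.fullmatch(r"(\d+)\.(\d+)\((\d+)\)", release)
    if match is None:
        raise ValueError(f"unrecognized release format: {release!r}")
    return int(match.group(1)), int(match.group(2))


def can_stay_in_failover(release_a, release_b):
    """True if two units differ at most in their maintenance release."""
    return major_minor(release_a) == major_minor(release_b)


print(can_stay_in_failover("7.0(5)", "7.0(7)"))  # same major and minor release
print(can_stay_in_failover("7.0(5)", "7.1(2)"))  # minor release differs
```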

Detecting a Firewall Failure

Each interface of one firewall must connect to the same network as the corresponding interface of the other firewall. Each firewall can then monitor every active interface of its failover peer.

The active and standby firewalls determine a failure by sending hello messages to each other at regular intervals (every 15 seconds by default). These messages are sent over the failover cable (if present) or the LAN-based failover interface to detect failures of an entire firewall. The hellos are also sent on all interfaces configured for failover so that the firewall peer can determine the health of each interface. These messages are sent as short packets using IP protocol 105.

If a hello message is not received on the failover cable (or the failover LAN in a LAN-based failover) for three polling intervals (the default), the firewall declares the other unit "failed" and attempts to become the active unit. PIX 7.x has a configurable hold timer that must expire before declaring the other unit failed. You can shorten the hello and hold timers so that a failure is detected sooner if desired.

With a failover cable, a power failure or a reload on one firewall unit can be sensed on the other unit. Firewalls linked by a LAN-based failover connection can sense a peer's health only via the regular hello messages. If one firewall is powered off, its peer can detect the failure only by noticing the absence of several consecutive hellos.
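As a rough worked example of the timing described above, the following Python sketch models the decision to declare a peer failed. It assumes the simple rule stated in the text (three missed polling intervals) and ignores any separately configured hold timer; the function name and the shortened 3-second interval are hypothetical.

```python
def peer_declared_failed(seconds_since_last_hello, hello_interval=15, missed_limit=3):
    """Return True once hellos have been absent for missed_limit polling intervals."""
    return seconds_since_last_hello >= hello_interval * missed_limit


# With the defaults, the peer is declared failed after 3 x 15 = 45 seconds.
print(peer_declared_failed(30))  # still within the allowed window
print(peer_declared_failed(45))  # three polling intervals have elapsed

# With a shortened (hypothetical) 3-second hello interval, detection takes 9 seconds.
print(peer_declared_failed(9, hello_interval=3))
```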

Sometimes, a firewall interface (or the network providing its connectivity) might fail while the firewall stays operational. Failover peers can detect interface failures according to the following conditions:

If the two firewall units have changed failover roles or one of them has just powered up, the switch ports connected to the interfaces might move through the Spanning Tree Protocol states before forwarding traffic again. While a switch port is in the Listening and Learning states, regular data packets are not forwarded. This can cause failover hello messages to be dropped, causing the firewalls to begin testing their interfaces.
To prevent this from happening, a firewall interface enters the Waiting state for two hello message periods. If more hello messages are missed after that, the interface is tested. Otherwise, failover just monitors the interface normally. With the default hello interval (15 seconds), interface testing doesn't begin until 30 seconds after the interface changes state. This coordinates well with the default Spanning Tree Protocol timers, which can block traffic for two periods of the Forward Delay timer (15 seconds).

If a failover message is not seen on an interface within three polling intervals (the default), that interface is moved into a "testing" mode to determine if a failover is necessary. The other firewall is notified of the test via the failover cable or the LAN-based interface.
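The arithmetic behind the Waiting state can be checked directly. The sketch below compares the two-hello waiting period with the time a switch port might spend in the Listening and Learning states. It is a simplified model based only on the figures quoted above, and the function name is an assumption.

```python
def waiting_covers_stp(hello_interval_s, forward_delay_s):
    """Compare the Waiting state with the Spanning Tree blocking time.

    The interface waits for two hello periods; a switch port can block
    traffic for two periods of the Forward Delay timer.
    """
    waiting_period_s = 2 * hello_interval_s
    stp_blocking_s = 2 * forward_delay_s
    return waiting_period_s >= stp_blocking_s


# Defaults: 2 x 15 s of waiting against 2 x 15 s of Spanning Tree delay.
print(waiting_covers_stp(15, 15))

# If the hello interval were shortened to 3 seconds while the switch kept its
# default Forward Delay, the waiting period would no longer cover the delay.
print(waiting_covers_stp(3, 15))
```

Under this model, shortening the hello timer without also adjusting the switch port behavior could cause interface testing to begin while the port is still not forwarding.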

Interfaces in the "testing" mode are moved through the following sequence of tests:
- Interface status: The interface is failed if the link status is down.
- Network activity: If no packets are received over a 5-second interval, testing continues; otherwise, the interface can still be used.
- ARP: The interface stimulates received traffic by sending Address Resolution Protocol (ARP) requests for the ten newest entries in the firewall's ARP table. If no traffic is received in 5 seconds, testing continues.
- Ping: Traffic is stimulated by sending an Internet Control Message Protocol (ICMP) echo request to the broadcast address on the interface. If no replies are received over a 5-second interval, both the interface and the testing firewall unit are marked in a "failed" state.
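The test sequence above behaves like a chain of checks that stops at the first conclusive result. The following Python sketch models that ordering. Each argument is a boolean standing in for the outcome of one 5-second observation, and the function name and return values are hypothetical.

```python
def run_interface_tests(link_up, traffic_seen, traffic_after_arp, ping_replies):
    """Walk the four tests in order and return 'ok' or 'failed'."""
    # Interface status: a down link fails the interface immediately.
    if not link_up:
        return "failed"
    # Network activity: any received packets mean the interface is usable.
    if traffic_seen:
        return "ok"
    # ARP: traffic received after the ARP requests ends the testing.
    if traffic_after_arp:
        return "ok"
    # Ping: replies to the broadcast echo request are the last chance.
    if ping_replies:
        return "ok"
    return "failed"


print(run_interface_tests(False, True, True, True))     # link down
print(run_interface_tests(True, False, True, False))    # ARP stimulated traffic
print(run_interface_tests(True, False, False, False))   # every test exhausted
```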

At the conclusion of the tests, the two firewalls attempt to compare their status. If the standby firewall has more operational interfaces than the active unit, a failover occurs. However, if both units have similar failures, no failover occurs.
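The comparison made at the end of testing can be sketched as follows. This is a simplified reading of the rule above (fail over only when the standby unit has strictly more operational interfaces); the interface names and the function name are hypothetical.

```python
def should_fail_over(active_status, standby_status):
    """Decide whether a failover occurs, given per-interface health on each unit.

    Each argument maps an interface name to True (operational) or False (failed).
    """
    active_ok = sum(active_status.values())
    standby_ok = sum(standby_status.values())
    return standby_ok > active_ok


# The active unit has lost its outside interface; the standby unit has not.
print(should_fail_over({"inside": True, "outside": False},
                       {"inside": True, "outside": True}))

# Both units have the same failure, so no failover occurs.
print(should_fail_over({"inside": True, "outside": False},
                       {"inside": True, "outside": False}))
```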

Failover Communication

Firewall pairs can support several different types of failover, depending on how they are configured. Each type allows the firewalls to communicate with each other in a slightly different manner:

- Stateless failover: The state of UDP and TCP connections is not kept when the standby firewall becomes active. All active connections are dropped and must be reestablished.
- Stateful failover: The state of UDP and TCP connections, as well as address translations (xlates), H.323, Session Initiation Protocol (SIP), and Media Gateway Control Protocol (MGCP) connections, is sent to the standby firewall over the stateful LAN interface. This stateful data is updated in real time as a stream of packets using IP protocol 8.
- LAN-based failover: Failover communication between the firewalls is carried over a LAN rather than the serial failover cable. Only failover hello messages and configuration replication updates are carried over the LAN-based connection.
  LAN-based failover requires one physical interface to be set aside for failover traffic. If stateful failover is also being used, it should have its own interface, although it can be configured to share the same interface with LAN-based failover. The LAN-based failover interface cannot be a subinterface or a virtual LAN (VLAN) on a trunk interface.

Figure 7-5 illustrates the basic connections for a failover pair. One of the firewalls is always active and takes on the active IP addresses for all its interfaces. The other firewall is standby and takes on the standby addresses.

Figure 7-5. Basic Firewall Failover Pair Connections

Each firewall interface must connect to the same network IP subnet as the corresponding interface on the failover peer. For example, if the active unit's outside interface uses 192.168.177.1, the standby unit's outside interface must use an address in that same subnet. Obviously, each pair of peer interfaces must be connected on a common VLAN or Layer 2 network. In other words, the two firewalls must be able to send hello messages on an interface and reach the peer's interface without using a router or default gateway.
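The same-subnet requirement is easy to verify with Python's standard ipaddress module. In the sketch below, the /24 prefix length and the standby addresses are assumptions, because the text gives only the active unit's address.

```python
import ipaddress


def same_subnet(active_ip, standby_ip, prefix_len):
    """Return True if the standby address lies in the active address's subnet."""
    # strict=False lets a host address plus prefix length define the network.
    network = ipaddress.ip_network(f"{active_ip}/{prefix_len}", strict=False)
    return ipaddress.ip_address(standby_ip) in network


# Hypothetical standby addresses checked against the active outside address.
print(same_subnet("192.168.177.1", "192.168.177.2", 24))  # same subnet
print(same_subnet("192.168.177.1", "192.168.178.2", 24))  # different subnet
```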

The failover pair can have a serial failover cable connecting them if they are in close proximity and do not need a high-speed link for configuration updates. The failover pair can also connect by a LAN-based failover link that uses a physical LAN interface on each firewall.

Stateful failover also requires a LAN-based link between the firewalls. This link can share the LAN-based failover interface, or it can be an interface set aside for stateful updates. A higher-speed interface is preferable, to support a high rate of connection state updates. In fact, you should choose the fastest interface that is present on the firewall platform so that state information can be replicated as fast as connections are formed.

Active-Active Failover Requirements

In active-active failover, the two firewalls are assigned the customary primary and secondary roles. You can give the primary or secondary unit priority for becoming the active unit on a per-context basis. This applies to the admin and any user contexts.

Because only two firewalls are permitted in a failover pair, there can be only two combinations of primary and secondary:

A: primary, B: secondary

A: secondary, B: primary

Each of these combinations is called a failover group. Therefore, the contexts are assigned membership in one of the two failover groups.

Figure 7-6 shows the basic arrangement of the failover pair of firewalls, along with security contexts, failover groups, and firewall states (active or standby).

Figure 7-6. Active-Active Failover with Multiple Security Contexts

Within a failover group, either the primary or secondary firewall is given preference for becoming the active unit. One firewall can even preempt or usurp the active role if it doesn't already have it. This means that on a given firewall unit, all contexts in a failover group take on the same active or standby state. The same contexts on the other firewall take on the alternate state.

Because you configure each context's membership in a failover group, you can control the distribution of active contexts between the two physical firewalls. Some contexts might be heavily loaded, while others are not. Therefore, simply dividing the contexts evenly between the firewalls doesn't always result in an even distribution of firewall CPU, memory, and performance. You might have to experiment with context membership to maximize the use of each firewall.
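One possible starting point for that experimentation is a simple greedy split: place the heaviest context first, always into whichever failover group currently carries less load. The Python sketch below illustrates the idea. The context names and load figures are invented, the heuristic is only an assumption about how one might plan the assignment, and nothing here is performed by the firewall itself.

```python
def assign_failover_groups(context_loads):
    """Greedy split of contexts into failover groups 1 and 2 by estimated load."""
    groups = {1: [], 2: []}
    totals = {1: 0, 2: 0}
    # Consider the heaviest contexts first so that large loads are spread early.
    ordered = sorted(context_loads.items(), key=lambda item: item[1], reverse=True)
    for name, load in ordered:
        target = 1 if totals[1] <= totals[2] else 2
        groups[target].append(name)
        totals[target] += load
    return groups, totals


# Hypothetical relative load per context (for example, a percentage of one unit).
loads = {"admin": 5, "ctx-a": 60, "ctx-b": 20, "ctx-c": 15}
groups, totals = assign_failover_groups(loads)
print(groups)
print(totals)
```

With these figures, an even two-and-two split by count could leave one unit far busier than the other, whereas the load-based split places the single heavy context alone in one group and the three lighter contexts together in the other.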