Upgrading and Repairing Servers

The sad truth is that no matter where you locate your equipment, each location has issues with temperature, humidity, access, and other factors that are unique to that location and must be planned for. The important task, therefore, is to recognize what issues you face and build solutions for those problems into your design.

These are the essential issues you must plan for with any installation:

  • Size or capacity The location must have enough floor space, and the floor must have enough carrying capacity to hold your equipment. If you plan on expansion someday, you should plan to be able to access additional adjacent space when required.

  • Energy Servers and the equipment necessary to support them require a lot of power, so it's important that you be able to obtain the power you need when you need it. You also need to plan to condition the power so that it's suitable for use, as well as plan for outages.

  • Network connections It's essential to have reliable network connections of sufficient bandwidth. Often, data centers purchase two or three Internet connections from different ISPs, each on separate network trunks, for availability.

  • HVAC Cooling, heating, airflow, and humidification are requirements for any server room. Cooling in particular is an issue because servers run hot, and they shut down when they reach a certain temperature.

  • Security Systems must be put into place that restrict access to both hardware and software systems.

  • Physical access Doors, elevators, conduits, the size of aisles, and other physical attributes of the room(s) must be accounted for.

When you have a design in mind and have a plan drawn up, if you are building a new building or an extension to an existing business, you should check with the local government to ensure that all your plans are within the requirements of the local building codes. The approval process requires that you get two levels of approval: one for the design plans and the second for the completed project. You should also check with your insurance company to determine whether the room or building you are planning alters your coverage in any way.

Determining Size and Capacity

The size of your server room or data center is a gating factor in the type and distribution of equipment you can employ there. Bigger rooms offer the advantages of more space and potentially lower server density, but they also require correspondingly more facilities, such as HVAC. When they are properly designed, you can also get better airflow with bigger spaces, albeit at a higher cost.

A good place to begin laying out the design of a server room is with a design tool such as Visio. When you have the room size and shape laid out on your design surface, you can start to add room features, such as doors and windows, structural columns, and any other features that might affect where you place your system components. Visio contains a number of stencils that provide standard shapes of equipment types, and you can purchase third-party stencils of server room equipment. Figure 17.1 shows some of the stencils dedicated to network and server room design in Visio.

Figure 17.1. Visio, which has many dedicated equipment stencils, is a great tool for developing a space plan for a server room. You can also purchase third-party shape libraries of common server room equipment.

The next step in designing the room is to subdivide the room into squares the size of the server racks that you intend to deploy. Your grid squares can be the size of the panels used in your raised flooring as shown in Figure 17.2, or they can be a standard size, such as 3 feet by 3 feet, which is close to the size of a small server rack.

Figure 17.2. Shown here is raised flooring with bundled networking cable.

You need to create rows of servers on your design surface, and between the rows should be open aisles that are as wide as each equipment row. It's possible to use as little as 40% open aisle space and 60% equipment row space, but it's much better to keep them equal so that you have some leeway for accommodating oversized equipment if you decide to deploy some at any point.
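If you want to rough out that arithmetic before committing to a floor plan, a short script can help. The following minimal Python sketch uses assumed room and rack dimensions (illustrative values, not figures from this chapter) to estimate how many equipment rows fit for a given aisle-to-row ratio:

# A minimal layout sketch. Room depth, rack depth, and the aisle ratio
# below are assumed illustrative values, not figures from this chapter.

def count_rows(room_depth_ft, row_depth_ft, aisle_ratio=1.0):
    """How many equipment rows fit when each aisle is aisle_ratio
    times as deep as an equipment row."""
    pitch = row_depth_ft * (1 + aisle_ratio)   # one row plus its aisle
    return int(room_depth_ft // pitch)

room_depth = 40.0   # feet (assumed)
rack_depth = 3.0    # feet, roughly the 3-foot grid square mentioned above

print("Equal aisles (50/50 split):", count_rows(room_depth, rack_depth, 1.0), "rows")
print("40/60 split:", count_rows(room_depth, rack_depth, 0.4 / 0.6), "rows")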

For more information on using Visio for implementing server rack design, see "Visio Designs," p. 720.

Bob Snevely, in his book Enterprise Data Center Design and Methodology, ISBN 0130473936, which is part of the Sun Microsystems Blueprints series, developed the concept of rack location units (RLUs). An RLU assigns a weighting factor to the total requirements of the components of a rack system, with respect to the following demands:

  • Physical space

  • Connectivity

  • Power

  • Weight

  • Cooling

  • Functional capacity

The idea behind this approach is that you want to distribute dense server racks across the server room so that you don't overload the capacity of any one section with respect to any of the aforementioned factors. Let's say that you have a section of five server racks aligned side-by-side in a row. In this scenario, you would need to create a table that combined all five of those servers into one set of requirements, such as the following:

  • Physical space, including not only the size of the five server racks, but the size of the aisles that surround them, the size of any service columns or poles, and the size of row breaks and any additional space required to cool the assembly.

  • The number and type of Ethernet CAT5 cable, fiber-optic cable, or coaxial cables that must be run to the different servers in the group.

  • The amount of total power: the kind of power in volts, amps, and single- or three-phase, and the outlet type required.

  • The point load and total tile load (for raised flooring) and the total floor load for the space being used. The total floor load should not exceed the carrying capacity of the floor.

  • The cooling requirement in British thermal units (BTU) per hour.

  • Functional capacity, which is a metric that defines how much computing power or storage capacity the assemblage has. The idea is that you can use the functional capacity to measure whether you have enough assets deployed to support a particular project or business goal.

Measuring Load Capacity

Load capacity is measured in a number of different ways. The first measure is the point load. If your server rack rolls on four metal casters or rollers, the server's load is distributed at four different points. A fully loaded large server can weigh as much as 3,200 pounds, so with such a server on four casters, each caster has a load of 800 pounds; this is called the point load, and it bears on an area of perhaps 1 square inch.

The second load capacity metric is the static load. Static load is the weight that an area of floor must be able to support. Say you have a 4-square-foot floor segment that supports 60% of the weight of a server rack. Because the rack is larger than that segment, the static load for a 3,200-pound server would be (60% x 3,200 pounds) / 4 square feet, or 480 pounds per square foot.

Finally, the last metric you need to concern yourself with is the rolling load. As your rack system moves across the floor, its rolling load is similar to its static load, but the distribution of the weight varies, depending on how the rack is located. If you are using a raised floor, different locations on the raised flooring load tiles differently, depending on the location.

In some situations, a raised floor can support a static load, but the rolling load can lead to the collapse of a panel when too many points on the rack are located on the same tile. It's important to be aware of this problem and replace perforated tiles with stronger, solid tiles along the route of travel if that is required.
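The load calculations themselves are simple arithmetic. Here is a minimal Python sketch of the point-load and static-load figures worked out above; the 3,200-pound rack, four casters, 60% share, and 4-square-foot segment are the chapter's example numbers:

# A minimal sketch of the load-capacity arithmetic above. The 3,200 lb rack,
# four casters, 60% share, and 4 sq ft segment are the chapter's example numbers.

def point_load(rack_weight_lb, casters=4):
    """Weight carried by each caster (the point load)."""
    return rack_weight_lb / casters

def static_load_psf(rack_weight_lb, share_on_segment, segment_area_sqft):
    """Static load in pounds per square foot on one floor segment."""
    return rack_weight_lb * share_on_segment / segment_area_sqft

print("Point load per caster:", point_load(3200), "lb")                     # 800 lb
print("Static load:", static_load_psf(3200, 0.60, 4.0), "lb per sq ft")     # 480 lb/sq ft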

With all these measurements in hand, you can then sum them all to create a superset RLU that defines the assemblage. Your five racks might have the following specifications, among others:

  • Weigh 4,800 pounds

  • Require five 30A 220V outlets and one 40A 480V three-phase power connection, for a total of 12,000 watts

  • Need 82,000BTU per hour

If any one particular area can't support these requirements, you should move a dense server out of the group and place it somewhere else. To simplify the process of mixing and matching racks, you might assign a label to the different types of racks and use that label to balance your design. You could assign weighting factors to each requirement, such that your label might be RLU-142331, where each number is the overall assignment of each factor.

When you sum five different RLUs, you get a composite number that lets you determine whether you have distributed servers appropriately. Snevely's assignment uses a simpler scheme of RLU-A, RLU-B, and so forth, but it's possible and probably preferable to extend this idea to make it a little more quantitative. The total RLUs determine what is called the "in-feed capacities" of the system. This total is not the complete story, of course. Your calculations need to account for not only the amounts of each resource but their types as well.
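If you want to experiment with this kind of bookkeeping, the following minimal Python sketch sums per-rack requirements into a composite and checks the result against the limits of one floor section. The per-rack numbers and section limits are assumed for illustration (they are chosen so the totals match the example figures above), not measured values:

# A minimal sketch of summing per-rack requirements into a composite RLU.
# The per-rack numbers and section limits are assumed for illustration;
# they are chosen so the totals match the example figures above.

from collections import Counter

racks = 5 * [{
    "power_w": 2400,          # watts per rack (assumed)
    "weight_lb": 960,         # pounds per rack (assumed)
    "cooling_btu_hr": 16400,  # BTU/hr per rack (assumed)
    "cables": 40,             # network connections per rack (assumed)
}]

composite = Counter()
for rack in racks:
    composite.update(rack)

section_limits = {"power_w": 15000, "weight_lb": 6000, "cooling_btu_hr": 90000}
for key, limit in section_limits.items():
    verdict = "OK" if composite[key] <= limit else "OVER - move a rack elsewhere"
    print(f"{key}: need {composite[key]}, limit {limit}, {verdict}")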

Planning for Cable Management

Chapter 16, "Server Racks and Blades," briefly touches on the number of cables that run in and out of dense server racks. As you may recall, a server rack with 24 servers can have from 200 to 250 cables going into the connections at the back of the server. Without some form of organization and identification, it can become impossible to find a broken connection or modify your current connections.

The rules for cable management are few:

  • Label everything and document what you have labeled. Label your cables on both ends, near the connectors. (Some experts recommend labeling cable every 6 to 10 feet, in case you need to troubleshoot the connection under raised flooring.) Use a numbering scheme that allows for sufficient cable count so that you can accommodate growth.

  • Keep it simple and keep it organized. Wires going to the same locations should be bundled below a raised floor, above your equipment, on cable runs, or with some other organization tool.

  • Use colored cables and document what the colors mean.

  • Avoid using excess cable lengths and never leave open or disconnected cabling. Excess and unused cable is an invitation for the creation of spaghetti cabling, which is both dangerous and a waste of time.

  • Use patch panels effectively. Patch panels support CAT5 Ethernet, Fibre Channel, coaxial, fiber-optic cable, and often a custom mix. Label connections at the patch panel, and, if possible, use matching color coding for the wires that connect to them.

  • Bundle all cables leading to and away from patch panels. There are different methods for bundling cable, including using cable ties (which come in different colors), routing pipes, wire hooks, and so forth. You can purchase entire cable management systems from several vendors.

  • Whenever possible, minimize the number of cables used because each cable and connection represents a potential point of failure. However, maintain redundancy in your cabling scheme so that if a path fails, another path is still open. Obviously, these two rules must be balanced with good common sense.

  • Use good-quality insulated cables and keep them away from heat and electromagnetic sources. Don't mix data transmission cables with power cables. Insulate power cables whenever possible.

In a disorganized system, changes that should take seconds can take many minutes, if it's possible to effect the change at all. Without cable management, you can have a rat's nest of wires that impedes airflow and can be a hazard. If you want to know how professional an IT organization is, one of the simplest ways to tell is to look at its cable management scheme on the back of its servers.

The simplest cable management systems are those that have a hook or basket arrangement. One site that offers a number of cable management solutions is Cableorganizer.com (www.cableorganizer.com). You can install cable management systems above head height and out of the way. Many server racks come with cable management systems that run at the top of the server rack, as you can see in the APC InfraStruXure rack series shown later in this chapter, in Figure 17.5.

If you install a raised floor (see "Considering Raised Flooring," later in this chapter), you are likely to install cable trays as part of your floor support. Cable trays are U-shaped wire baskets onto which the wire is placed. Cable trays are placed so that the wires run parallel to the aisles, thus allowing you to have access to the trays at any point along the run.

Determining Power Needs

Electrical consumption is one of the metrics that you need to plan for. As a general rule, you should figure on about 50 watts of power per square foot for a small server room to as much as 100 watts per square foot for denser deployments.

If you have a server rack with 36 1U servers in it, and you figure that each server draws approximately 2 amps, then one rack would consume 72 amps. Current power consumption for servers is averaging over 90% of the server's stated load. Given that you want extra capacity for a rack of this type, you might want to allow for twice the amount of amperage during power bursts, say, 150 amps, because most racks include other devices, such as arrays, tape backups, rack ventilation, and so forth.

In addition to specifying the individual rack power requirements, you also need to specify the total room's power needs. For 10 racks, that figure would be 720 amps. Of course, other equipment in a server room can significantly draw on power. Among the larger consumers of power in a server room are HVAC, switches, printers, lighting, and UPS devices. Therefore, you should consider all those factors when allowing for your power requirements.
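As a quick sanity check on those numbers, here is a minimal Python sketch of the rack and room amperage estimate. The 2-amp-per-server figure and the doubling for bursts follow the discussion above; the 10-rack room is simply the example used in the text:

# A minimal sketch of the rack and room amperage estimate above. The 2 A
# per 1U server and the doubling for bursts follow the discussion; the
# 10-rack room is simply the example used in the text.

servers_per_rack = 36
amps_per_server = 2        # approximate steady-state draw per 1U server
burst_factor = 2           # allowance for bursts plus arrays, tape, ventilation

rack_amps = servers_per_rack * amps_per_server        # 72 A steady state
rack_budget = rack_amps * burst_factor                # roughly 150 A budgeted
room_amps = rack_amps * 10                            # 720 A for 10 racks

print(f"One rack: {rack_amps} A steady state, budget about {rack_budget} A")
print(f"Ten racks: {room_amps} A before HVAC, lighting, and UPS losses")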

Note

Large power lines generate magnetic fields that can be a problem for network communications. You need to shield any large power lines so that they don't affect other systems. Many server rooms choose to shield their power wiring inside flexible steel pipe or conduits, often encased in braided copper wire sheaths in order to minimize electromagnetic interference (EMI).

Also keep in mind that elevators often have large motors with magnetic mechanisms in them and that you can have problems from that source as well. When possible, you shouldn't have an elevator near a server room. If you have EMI, you might want to invest in electromagnetic shielding material and use it to line your server rooms' walls.

As much as possible, you should have redundant power inputs to your equipment. In such a system, if one circuit supplying power fails, the second circuit picks up the load. While redundancy is an overriding theme in this chapter, many data centers do not provide duplicate power inputs to their equipment, relying on their UPS equipment to switch on when there is a power failure and to provide for enough time for admins to switch manually over to the second power circuit or to fix the problem.

According to the American Power Conversion (APC) whitepaper "Guidelines for Specification of Data Center Power Density," a typical 5,000-square-foot data center has a power draw of around 50kW at 480 volts, with the following electrical requirements:

  • 50% cooling system (where it is assumed that 1kW use requires 1kW cooling)

  • 36% critical loads (servers and other systems)

  • 11% UPS inefficiencies and battery charging

  • 3% electrical lighting

Analysis of a server room or data center's power needs starts with determining the amount of power drawn as part of the critical load. You could start by enumerating the average power requirements of each piece of equipment in the room and add an extra cushion for peak loads. The manufacturer of the equipment should list power consumption either on its specification sheet or on a nameplate that is placed on the equipment. Enumeration can be tedious, particularly when you are dealing with server racks containing different manufacturers' equipment.

Note

You might wonder how kilowatts (kW) relate to kilovolt-amps (kVA), as both are used in power measurements. Kilowatts are the real power that can be drawn from a system. Kilovolt-amps include the power you can draw plus the residual power in the system that you can't draw out. Therefore, a UPS device is often rated as kW/kVA, a ratio called the power factor. A computer's power factor approaches 1.0, or unity, but modern UPS devices have power factor ratings in the range of 0.8 to 0.9, depending on the design type. Some designs have a rating as low as 0.65, so this is one factor to take into account. A device with a rating of 0.65 at 200kVA supplies only 130kW of power.
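To see how the power factor plays out, here is a one-function Python sketch of the relationship described in this note:

# A minimal sketch of the kW/kVA relationship: real power equals apparent
# power multiplied by the power factor.

def real_power_kw(kva, power_factor):
    return kva * power_factor

print(real_power_kw(200, 0.65), "kW")   # a 0.65 power factor at 200kVA yields 130kW
print(real_power_kw(200, 0.90), "kW")   # the same size device at 0.9 yields 180kW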

When you have determined what your current and future electrical requirements are, you need to increase that figure by 25% to 50% to leave sufficient overhead to deal with peaks. The amount of overage you need may be designated as part of your building code, so you need to check. Most power comes into a facility as 480V three-phase AC in the United States and 230V AC elsewhere. Where a critical load factor of 1.25X is used, the current required for the critical load is as follows:

Amps needed = (kW x 1,000) / (Volts x 1.73)
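In code form, the same formula looks like the following minimal Python sketch. The 1.25X overhead factor and the 50kW example draw come from the surrounding discussion; sqrt(3) stands in for the rounded 1.73 constant:

# A minimal sketch of the three-phase current formula above. The 1.25X
# overhead factor and the 50kW example draw come from the surrounding
# discussion; sqrt(3) stands in for the rounded 1.73 constant.

import math

def amps_needed(kw, volts=480.0, overhead=1.25):
    """Three-phase current for a given critical load in kW."""
    return (kw * overhead * 1000) / (volts * math.sqrt(3))

print(round(amps_needed(50), 1), "A")   # roughly 75 A for a 50kW critical load at 480V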

The cooling factor needs to be equal to at least the critical load at peak, with a certain amount of reserve. That's why a 36% critical load requires 50% cooling.

The rated load of the power equipment used may be as much as four times the critical load, while the steady state load is rated at 1.25X. At this point, it should be possible to estimate the size and type of any backup power generators as well as the size and nature of your UPS equipment.

In summary, you need to take the following steps to determine your power requirements:

1. Determine the power requirement of each individual component in your server room for its peak load.

2. Find the total and multiply it by the overage factor that your building codes or your site requires.

3. Determine the type of power that is required and where access to the power must be located.

4. Determine the cost and location of the resources needed to deliver these power requirements.

5. Determine the amount of backup power you need, as required by both your electrical draw and the amount of time you want to be able to be on backup power.

Determining your power needs and how to deploy your electrical connections is definitely something you don't want to do by yourself. This is an area where it makes sense to consult an electrical engineer in order to make sure that each piece of equipment gets the power it needs and is properly protected. You need help figuring out just where to place your outlets, whether you need to deploy flexible power cable outlets (sometimes called power whips), and the type of power supplied to your systems. Some larger systems need heavier-duty power feeds, such as three-phase 480V power. It's really important to try to balance the power needs of your equipment across your electrical circuits, and that definitely influences where you put your densest server racks and more powerful systems.

Calculating UPS Needs

As discussed in more detail in Chapter 14, "Power Protection," UPS (uninterruptible power supply) is a backup battery system that provides power when your main power fails. Every server should be backed up by some kind of UPS system, so it is helpful to know a little bit about the kinds of UPS systems on the market as well as how to calculate how much UPS capacity you need.

In an age when power brownouts are common during peak demand, UPS systems are also often called on to condition the power supply. By conditioning we mean that the power is monitored and maintained within a certain tolerance so that it is always at a constant voltage and frequency. You will certainly want to check the quality of the power in your building and determine whether it conforms to American National Standards Institute (ANSI) standards for power quality. If it doesn't, you should look for this feature in your UPS devices or invest in special power conditioning equipment.

In calculating UPS capacity, you want to balance the amount of backup runtime available against the cost of the system. For absolutely mission-critical servers, the solution isn't a UPS system but a backup power system, with a UPS perhaps serving to allow for a successful transition to the backup power. APC has a UPS product selector at www.apc.com/tools/ups_selector/index.cfm.

UPS devices come in several different types:

  • Standby This is the type of UPS used on PCs and workstations, as well as on small servers. A standby UPS device contains a surge suppressor and a voltage filter, in line with a switch that power can flow through. As long as power is available, most of it flows down this circuit, with a little of the current being used to keep the battery topped up. When the power fails, the transfer switch moves the connection from the power on circuit to the battery circuit, where a DC-to-AC converter puts the power into a form that computers and monitors can use.

  • Line-interactive This type of UPS is used to back up small groups of computers. In this design, the power line goes first through a transfer switch and then flows through the inverter, which supplies power to the computer(s) and to the battery for charging. When the power fails, the transfer switch opens and the power in the battery discharges. The advantage of a line-interactive UPS is that it offers better voltage regulation and larger battery sizes. You'll find this to be the dominant design for the 0.5kVA to 5kVA range.

  • Standby-ferro This type of UPS isn't commonly used anymore, but it is a reliable system that offers superior line filtering. The main electric line runs through a transfer switch to a transformer and then on to the computers. The secondary power line runs through a battery charger, charges a standby battery, runs through a DC-to-AC converter, and then feeds the transformer and the computers. Standby-ferro UPS systems have problems supplying the kinds of loads that current computer systems generate. A server's power draw requires smooth AC power, but a ferro-resonant UPS's transformers create a current that lags the voltage, resulting in a ringing power signal that can cause power surges. While this used to be the dominant UPS in the 3kVA to 15kVA range, it has been replaced by the double conversion online UPS described next.

  • Double conversion online This type of UPS is the most commonly used UPS for power ratings above 10kVA. In a double conversion system, the main power runs through an AC-to-DC rectifier, charges a battery, and proceeds through a DC-to-AC inverter on its way to the computers. A second line bypasses all these components, going directly from input to output through a static bypass switch. When power fails, this design doesn't throw a transfer switch, and because the battery is on the main power line, there is no transient transfer time for the backup power to kick in. The advantage of this system is that it has very good electrical characteristics, but this design also has the disadvantages of large component stress as well as sometimes drawing large amounts of power from the building's power system.

  • Delta conversion online This UPS technology fixes some of the deficiencies of the double conversion online design. This design has an inverter that supplies the load voltage, as well as an added delta converter that also contributes power to the inverter's output. When power fails, the design operates similarly to a double conversion system, but the delta converter conditions the input power so that it is sinusoidal, and it also controls the input current so that the battery charge doesn't overwhelm the electrical system. Today the delta conversion system is the dominant large UPS design, and it is used for power ranges from 5kVA to 1.6MW.

Figure 17.3 illustrates the power circuits for the different kinds of UPS devices, and Table 17.1 shows the different types of UPS systems.

Figure 17.3. Power circuits for the different kinds of UPS systems.

Table 17.1. The Different Types of UPS Systems

  • Standby: 0 to 0.5kVA; voltage conditioning: low; cost per VA: low; efficiency: very high; inverter: no; used with PCs and workstations.

  • Line-interactive: 0.5 to 5kVA; voltage conditioning: design dependent; cost per VA: medium; efficiency: very high; inverter: design dependent; used with racks, standalone servers, and poor power environments (the most popular UPS).

  • Standby-ferro: 3 to 15kVA; voltage conditioning: high; cost per VA: high; efficiency: low to medium; inverter: no; used with banks of redundant UPS systems (not widely used).

  • Double conversion online: 5 to 5,000kVA; voltage conditioning: high; cost per VA: medium; efficiency: low to medium; inverter: yes; used with banks of redundant UPS systems.

  • Delta conversion online: 5 to 5,000kVA; voltage conditioning: high; cost per VA: medium; efficiency: high; inverter: yes; widely used in all types of server environments.

Now that you know what kinds of UPSs are available, you need to specify which kinds of UPSs and how much capacity you need. For standard servers, you can get by with a single backup system, referred to as an N topology, where there are no redundancies. An N topology should be able to supply a full load (100%) to critical systems for the amount of time you deem necessary. To add redundancy you might want to move up to an N+1 topology, where an additional UPS device is added to any number (N) of UPSs.

While a server can run at 90% load, a UPS cannot. If a double conversion or delta conversion UPS surges over its stated load, the device shuts down and goes into a utility bypass mode. Power to backed-up systems is lost. UPS devices are very unforgiving in that way.

Finally, for mission-critical systems, you might have a 2N topology, where each and every UPS system is backed up by another UPS system. However, for mission-critical systems, what you really want is a backup generator to keep them going. A UPS for a mission-critical system is your very last line of defense and should give you enough time to repair or replace a generator.
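If you want to turn the topology discussion into a rough purchasing estimate, the following minimal Python sketch counts UPS units for N, N+1, and 2N designs. The 36kVA critical load, 10kVA unit rating, and 80% maximum-load margin are assumed illustrative values, not recommendations from this chapter:

# A minimal sketch of counting UPS units for the N, N+1, and 2N topologies
# above. The 36kVA load, 10kVA unit rating, and 80% load ceiling are assumed
# illustrative values, reflecting the warning that a UPS cannot run near full load.

import math

def ups_units_needed(critical_load_kva, unit_rating_kva, topology="N",
                     max_load_fraction=0.8):
    usable_per_unit = unit_rating_kva * max_load_fraction
    n = math.ceil(critical_load_kva / usable_per_unit)
    if topology == "N+1":
        return n + 1
    if topology == "2N":
        return 2 * n
    return n

for topology in ("N", "N+1", "2N"):
    print(topology, ups_units_needed(36, 10, topology), "units")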

Considering Raised Flooring

There was a time when old mainframes were put into chilled rooms with raised flooring because these mainframes contained a large number of mechanical devices, vacuum tubes, and other equipment that ran very hot. Raised flooring was just one of the ways of keeping those behemoths cool. While times have changed in many ways, using a raised floor and the correct ventilation are still effective means of greatly increasing airflow in a room, by perhaps 50% or more. Raised floors are usually from 12 inches to 24 inches off the ground, which allows ductwork to be run to the air intakes that many cabinets and racks have, which in turn makes for greatly enhanced airflow.

A raised floor is generally constructed by placing a metal load-bearing framework or support grid onto the server room floor. The assembly looks a little bit like an erector set. Most raised floors use tiles or panels that are 2 feet by 2 feet, or 4 square feet. The space below the panels is called the plenum, and tiles can be solid, perforated, or grated. One of the nicest features of raised flooring is the ability to shape your cooling system flow by using the floor panels. Raised flooring requires additional ceiling height in the server room of as much as 2 feet, which can add to building costs.

Raised flooring also offers additional benefits such as providing a place to run cabling and to remove power lines from sight. However, many people find that raised flooring results in poor cable management due to difficulty of access, and that can lead to a situation where cables that are not in use are simply left in place. Therefore, some thought should be given to how the raised flooring is used and what kinds of access features it has. While you still find raised flooring in the smaller data centers and in control rooms, there's been a trend in data centers to avoid putting in raised flooring. Larger data centers tend to avoid that additional complexity and instead invest their time and effort in more robust heat dissipation and electrical facilities.

When using a raised floor, it is important to remove any under-the-floor obstructions as well as to seal the subfloor beneath the raised floor; doing this improves the flow of cool air to the hot-running systems above the floor. Because you are counting on the floor to provide airflow, all open or missing floor tiles should always be replaced, and any cable cutouts should either be sealed or replaced. Cable cutouts are a major source of air leaks. You want the cold air to flow up and through your systems, so the flooring in aisles should be closed tiles, and the flooring under the equipment should be open.

With server racks that are large and heavy, you need to be concerned with the weight-bearing capacity of the raised flooring. The load-bearing capacity of a raised floor today, using cast aluminum tiles, is rated at more than 1,500 pounds, even when you use tiles that are 55% perforated. If you use metal flooring, you should ensure that the floor is nonconductive. Also, you should avoid using as flooring any material that traps dust and dirt or that creates particles. You wouldn't want to use carpet-covered panels, for example, even though carpet would provide the electrical isolation you might want.

If you plan on deploying a set of large server racks, you might find that you require a room height of nearly 10 feet in order to accommodate the floor, server, and ceiling fixtures.

Another issue associated with raised flooring is that it doesn't permit the use of locked cages in large data centers. Many ISP and collocation facilities rent out capacity in a data center and use cages or wire fencing to separate the different areas of servers in the data center. In such a case, raised flooring doesn't work.

Although raised floors continue to be deployed in server rooms, it seems that fewer and fewer data centers are using them as time goes by because they do have a number of drawbacks.

Planning for HVAC and Airflow

As server farms stuff more and more blades into bigger and bigger racks, the problem of heat dissipation has reared its ugly head once more, with a vengeance. High-density blade servers can generate 10kW to 20kW per large rack, but the kinds of cooling solutions that used to be delivered through a raised-floor plenum were typically designed to provide at most 3kW per rack.

As warm air rises and cold air falls, putting adequate cooling at the bottom of a room and exhaust at the top of the room should handle the heat load. Cooling equipment, like power equipment, should be redundant. You don't want your servers to burn up because an HVAC unit has failed. Most data centers invest in larger cooling capacity and UPS equipment rather than having redundant cooling.

To calculate the amount of air-conditioning required for a server rack, you need to figure that every kW of power used will generate slightly more than 3,400BTU/hr, or 0.28 ton of refrigeration, referred to as a tonref. Most air-conditioners are specified in the United States by the BTU/hour rating. Thus a server rack of 36 1U servers, each consuming about 0.2kW of power, requires the dissipation of 7.2kW, or 24,480BTU, per hour.
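Here is the same cooling arithmetic as a minimal Python sketch, using the roughly 3,400 BTU/hr per kW figure above and the standard 12,000 BTU/hr per ton conversion behind the 0.28 tonref value:

# A minimal sketch of the cooling arithmetic above: roughly 3,400 BTU/hr per kW
# of power consumed, and 12,000 BTU/hr per ton of refrigeration (the standard
# conversion behind the 0.28 tonref figure).

BTU_PER_KW_HR = 3400
BTU_PER_TON_HR = 12000

def cooling_needed(kw):
    btu_hr = kw * BTU_PER_KW_HR
    tons = btu_hr / BTU_PER_TON_HR
    return btu_hr, tons

btu_hr, tons = cooling_needed(36 * 0.2)   # 36 1U servers at about 0.2kW each
print(f"{btu_hr:,.0f} BTU per hour, {tons:.1f} tons of refrigeration")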

It probably makes sense to maintain your HVAC equipment on its own power circuit and not on the same circuit as your servers. HVAC equipment tends to be more tolerant of voltage fluctuations than server equipment. If you are trying to determine the amount of refrigeration and the power needed to run that equipment, you might want to go to the American Society of Heating, Refrigerating, and Air-Conditioning Engineers (ASHRAE) website, at www.ashrae.org, where you can find information on how to estimate data center and server room environmental requirements. Most data centers run at temperatures ranging from 67°F to 73°F (19°C to 23°C), and the recommended humidity is between 40% and 55%.

A rack that accommodates 36 1U servers is considered to be a medium to larger rack size in a data center today, but server racks that are coming to market will consume as much as 20kW. Typically the average rack-based server system consumes a little less than 3kW. There's no getting around the numbers: To keep the temperature stable, each 1kW of power consumed requires 1kW of cooling to remove excess heat.

How you distribute your servers can also affect how well your HVAC system works. If you put all your servers together side-by-side, you can overwhelm your cooling system's ability to cool that area. In general, you should distribute servers throughout your server room. In a fully loaded server room, where you may not have the luxury of ideal spacing, you should distribute all your hottest-running racks as best you can.

Even more important from the standpoint of airflow is to design your server rack distribution scheme so that hot and cool aisles alternate with one another. What this means is that because racks tend to exhaust air from the back of the server, a hot aisle is one where the servers on both sides of the aisle have the rear of their racks facing the aisle. A cool aisle is one where the fronts of the servers on both sides face the aisle. Aside from the benefit of better airflow, this arrangement means that people do not need to be in hot aisles most of the time. Because servers' controls are typically on the fronts of the units, most of the time personnel spend working on servers in the server room will be spent in cool aisles.

Figure 17.4 illustrates a hot aisle/cool aisle arrangement where cooling is distributed to improve airflow and heat dissipation. Even with a scheme such as the one shown here, when you have adequate HVAC in place, you still may not have enough airflow to make the system work properly. To improve airflow, you can place air distribution units to move additional air from location to location. These units should be protected from power loss in the same way as your other HVAC components.

Figure 17.4. This airflow design uses the cool aisle/hot aisle approach.

Note

Many server personnel think that having one aisle hot and the next one cool is a design defect. Therefore, they use more cooling vents than they should in hot aisles and more hot returns in cool aisles, thus defeating what is intended to be a design feature. It is important to make clear to personnel in a server room that the hot aisles were designed to be hot and should be left that way.

The ultimate solution to these hot-running systems is to implement direct cooling to them. Cool air should flow into the server racks at the bottom and be directed out at the top of the rack. Because unused vertical space in a rack can provide a way for hot air to recycle within a cabinet, it is important to install blanking panels and make sure that your cable management doesn't interfere with cooling.

When you design an HVAC system, keep in mind that the further you move away from the air intake into the room, the lower the air pressure. The forced air flowing into a server rack close to your HVAC system is therefore stronger than the air flowing into a server rack a few feet away, which in turn is stronger than the air pressure in a server rack further down the line. You can improve the situation by using one of these solutions:

  • In flooring near the HVAC system, use panels with a smaller number of holes than in the panels further away.

  • Use smaller-diameter pipes closer to the HVAC system than further away.

It's best to measure the airflow directly and, if possible, have some form of active flow control. Keep in mind that airflow changes over the course of a day, as the temperature changes and as a function of other factors as well.

Figure 17.5 illustrates APC's (www.apc.com) InfraStruXure design for a 20-server rack system. Notice that cables run along the top of the servers, backs of servers face backs of servers, and a built-in system of air handling vents out the top pipe. APC sells racks and has a rack configuration that lets you specify systems such as this. In addition, you can buy fans to place at the bottom of racks, fans that slide into a rack just as any other component would, fan units that run the whole vertical height of a rack (in the back) with intake at the bottom and exhaust at the top, and completely autonomous rack systems that integrate full cooling, electrics, and other components completely into the rack design.

Figure 17.5. APC's InfraStruXure solution with integrated cooling is shown here in a 20-server rack configuration.

Image courtesy of APC.

You should measure your HVAC system's performance from time to time to determine whether it is keeping up. Among the places you should measure the temperature are at your HVAC system's cooled input, at the exhaust of the return air, at several places in hot and cool aisles, and at various heights. If possible, it is best to set up an automated monitoring system. It's also important to institute a regular maintenance scheme for changing filters, checking coolant, recalibrating thermometers, and doing other servicing.

Building in Safety

You can't always prevent disasters, but you can plan for them. When you plan for a range of disasters that could conceivably occur, there are two basic tasks you need to do. First, you need to train your staff on what to do when a specific situation occurs. Second, you need to have appropriate safety equipment on hand to mitigate the problem.

These are some of the problems you might encounter:

  • Fires

  • Flooding

  • Earthquakes

  • Hazardous fumes (usually as a result of fire)

  • Winds due to hurricanes or tornados

  • Excessive noise

  • Building failures

It's possible to imagine all sorts of disasters, but the problem encountered most frequently is fire. Fires start due to electrical equipment troubles, mechanical failures, and all sorts of other problems, so fire is the one hazard you should take special care to protect against. You should consider the following safety features as part of your server room deployment:

  • Active and automated smoke alarms that conform to the National Fire Protection Association (NFPA) 72E standard for automated fire detection systems. The NFPA 72E standard is an ANSI standard for the application of protective fire alarms and signaling equipment. To get a copy of the standard, go to www.techstreet.com/cgi-bin/detail?product_id=1034981&sid=goog.

  • A fire suppression system in the form of a halon release system that is heat activated. (Water sprinkler systems are not the best solutions in rooms full of expensive and highly charged electrical systems.) You should also place portable fire extinguishers in appropriate locations throughout the room.

    The recommended system for fire suppression is an FM200 heptafluoropropane gas dispersion system. This system uses the coolant gas to cool all hot materials and douse the fire. FM200 is used because it doesn't damage hardware, doesn't require cleanup, and has been shown to be safe for personnel in the area when it is discharged. FM200 replaces the older Halon 1301 systems that have been shown to use ozone depleting gases.

Caution

In order to use some fire extinguishers safely, you need to think about special considerations and equipment. For example, although halon is not toxic, when it is used, it displaces the air in the area and makes it difficult to breathe. Therefore, you need a breathing system in order to use halon fire extinguishers. Other fire suppressants have the same issue (CO2, for example), but not to the same extent. An FM200 system, as described previously, is the type recommended for a data center.

  • Breathing systems for use when hazardous fumes are present.

  • Fire blanket protectors.

  • A well-maintained emergency response system and a fire response plan that is part of a general disaster response plan. Your server room or data center should have an easily understood evacuation plan.

A server room or data center is not a good place to keep combustible materials. Large piles of paper, chemicals, storage packaging materials, and other combustibles should be removed from the area. Smoking should be forbidden both from a fire prevention standpoint and from a particulate contamination standpoint.

Safety also means that all your systems should be inspected and periodically maintained according to manufacturers' specifications. One area where fires often begin is in HVAC systems, when dust has collected in areas where the system is heated, such as reheating coils.

Planning for Physical Security and Access

The best security software available can't defeat an insider who can gain physical access to a system. Even if a person can't log in to the system, physical access makes it possible to damage systems or to remove data in its physical form for later access. You can't eliminate improper use of equipment by authorized personnel, and you also can't eliminate many accidents, but you can lower the risk of unauthorized or inexperienced users getting access to facilities they are not supposed to use. Given that more than 60% of downtime is attributed to operator error, anything you can do to lower the risk is worth considering.

The first line of defense in a physical security scheme is to require users to reliably identify themselves. There are a variety of ways to do this, including using fingerprints, hand scans, eye scans, facial scans, and physical keys such as smart cards. These systems identify the user requesting access, but they do not identify why the user is trying to access the room.

The second level of physical security is to define the perimeter of the area that needs to be protected. The following areas can be security boundaries:

  • The entire location

  • The building

  • The floor or portion of the building that contains sensitive equipment

  • Any waiting areas

  • The computer room

  • The utility room

  • Individual computer systems

Because physical security of a server room also includes the facilities necessary to make your systems work, access to UPS and power systems, as well as HVAC systems, may also need to be protected. When designing security systems, many consultants try to establish a set of concentric security perimeters, with increasingly secure methods of validating a person's access as he or she moves to inner perimeters. Thus, while a maintenance person might have access to offices and common areas, only personnel with greater clearance, such as vetted maintenance personnel, IT staff, and management, would have access to the data center area. Access to the actual server room would be limited to far fewer people still, eliminating most if not all maintenance staff, most of the IT staff, and probably most if not all of the management staff, with the exception of high-level IT managers.

In selecting identification and access systems, you should think in terms of functionality. You could broadly categorize devices into the following types:

  • Physical token In locking down a facility, the least reliable ID devices are the ones that identify a person by some physical thing that the person carries, be it a card, a key, or some other device. Those items, often called tokens, can be stolen or lost, and systems that rely on those devices don't know the difference between the correct individual and someone else using that person's token.

    Among the token technologies in use are magnetic-stripe cards, barium ferrite cards (which have small magnetic fields), Wiegand cards (with coded magnetic stripes), bar code cards, infrared shadow bar code cards, proximity (or "prox") cards, and (the ultimate) smart cards along with smart card readers.

  • Information access A piece of information that you know and that no one else has access to, such as a computer password, a code, or a method for opening a lock or physically verifying the ID of a card is more reliable than a token in that it cannot usually be stolen. However, simple passwords can be cracked, using password tables and brute force. This level of access can also be breached if an ID is shared with someone else or discovered on a written reminder. Still, these methods are better than physical tokens.

    Information access devices include keypads and coded locks on which a personal access code or personal ID number is entered.

    Note

    It is possible to purchase computer software and DVDs that contain several million passwords, systematically constructed to include all letter and symbol combinations. These programs, called crackers, start by testing single-character passwords, two-character passwords, three-character passwords, and so on. The longer the password, the harder it is to crack. Some of these programs test for lowercase letters only before adding in uppercase letters and then symbols.

    Using this type of software is a brute-force approach, but it can be amazingly quick. An eight-character password that consists of lowercase letters only can usually be cracked in a few minutes, using a standard desktop PC. When you mix in uppercase letters, the amount of time rises, but not substantially. It isn't until you have eight characters with a combination of upper- and lowercase letters as well as ASCII symbols that the time to crack an eight-character code rises to the point where most people would give up, as the sketch after this list illustrates.

  • Personal identification The most secure method of access is by something that is unique to an individual and only that individual. The ultimate method is a DNA sequence, but that technology is off in the future somewhere. In today's market, it is possible to purchase fingerprint ID systems and systems that identify iris and retina patterns (eye color and eye blood vessel distribution, respectively), hand shape, facial geometry, and voice matching. All these are unique or nearly unique to one particular individual. Some people put handwriting analysis on the list; this type of security is not concerned with the actual letters being written but with the pattern of the motion of the pen. However, with careful study, a person can learn to imitate someone else's handwriting patterns.

    Most personal ID equipment falls into the category of biometric devices. Although this technology has a high rate of accuracy, it is not perfect technology. Biometric devices can take an unacceptably long time to return a match and can have failures due to false acceptance or false rejection.
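The brute-force arithmetic from the earlier note is easy to quantify. The following minimal Python sketch compares the worst-case search space for an eight-character password as the character set grows; the guesses-per-second rate is an assumed figure for a fast desktop PC, not a benchmark from this chapter:

# A minimal sketch of the brute-force arithmetic from the earlier note.
# The guesses-per-second rate is an assumed figure, not a benchmark from
# this chapter; real cracking speed depends heavily on the hardware used.

GUESSES_PER_SECOND = 1e9    # assumed rate for a fast desktop PC

def worst_case_hours(alphabet_size, length=8):
    return alphabet_size ** length / GUESSES_PER_SECOND / 3600

for label, size in [("lowercase only", 26),
                    ("upper- and lowercase", 52),
                    ("upper, lower, digits, and symbols", 94)]:
    print(f"{label}: about {worst_case_hours(size):,.1f} hours worst case")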

All the aforementioned devices can authenticate a user but do nothing to protect against a second person getting access by following closely behind an authenticated user, called piggybacking or tailgating. To prevent this type of access, you may need to use entry doors that physically allow only one person to pass through at a time. Another way to monitor this type of entry is through camera surveillance or the use of a guard.
