1.7 Other cluster hardware components
Aside from the physical nodes (management, compute, and storage nodes) that make up a cluster, there are several other key components that must also be considered. The following subsections discuss some of these components.
1.7.1 Networks
Nodes within a cluster have many reasons to communicate; after all, they are cooperating to provide an overall result. Some of the key communications that must occur include:
- Interprocess communication, which is almost always needed to coordinate processing and to handle concurrent access to shared resources
- Management operations
- Software installation
- Storage access
Depending on the application and its performance requirements, these communications are often carried out over separate networks. Thus, you typically have more than one network, and more than one network type, linking the nodes in your cluster.
Common network types used in clusters are:
- Fast Ethernet and/or Gigabit Ethernet

  Included to provide the necessary node-to-node communication. Two types of LANs (or VLANs) are generally needed: one for management and one for applications, called the management VLAN and the cluster VLAN, respectively. Another LAN (or VLAN) may also be used to connect the cluster to the outside world (the enterprise corporate network, for example); this LAN is often called the public VLAN. A minimal interface configuration is sketched after this list.

- Myrinet

  Some clusters need high-speed network connections so that cluster nodes can talk to each other as quickly as possible. The Myrinet network switch and adapters are designed specifically for this kind of high-speed, low-latency requirement.

  More information about Myrinet can be found at the following Web site:

  http://www.myri.com/

- Management Processor Network

  Each xSeries compute node (x330 or x335) can be equipped with a management processor (also known as a service processor) that allows remote node power on/off/reset, monitors node environmental conditions (such as fan speed, temperature, and power), and provides remote POST/BIOS console, power management, and SNMP alerts. The xSeries models typically used as management nodes (x342 and x345) must also include an optional Management Processor Adapter in order to provide the same functionality; remote power control over this network is sketched after this list. For additional information and specifications on the Management Processor Adapter, refer to 2.2.3, "Remote Supervisor Adapters" on page 28.
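To make the VLAN separation concrete, a compute node typically carries one network interface per VLAN. The following fragment is a minimal sketch for a Red Hat-style system; the interface names, addresses, and paths are hypothetical examples, not values mandated by CSM:

   # /etc/sysconfig/network-scripts/ifcfg-eth0 (cluster VLAN, example addresses)
   DEVICE=eth0
   BOOTPROTO=static
   IPADDR=10.0.1.21
   NETMASK=255.255.255.0
   ONBOOT=yes

   # /etc/sysconfig/network-scripts/ifcfg-eth1 (management VLAN, example addresses)
   DEVICE=eth1
   BOOTPROTO=static
   IPADDR=10.1.1.21
   NETMASK=255.255.255.0
   ONBOOT=yes

The management processor network is what CSM uses for remote power control. Assuming nodes named node001 through node004 have been defined to CSM, commands along the following lines can be run from the management node:

   rpower -n node001 query          # show the power state of one node
   rpower -n node002,node003 on     # power on two nodes
   rpower -a reboot                 # reboot every node in the cluster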
1.7.2 Storage
Most clusters require that multiple nodes have concurrent access to storage devices, and often to the same files and databases. In high performance computing clusters especially, there is a need for very fast and reliable data access. Depending on your environment and the applications running on the cluster, you may choose from a variety of storage options that provide both the performance and the reliability you require.
There are two general storage configuration options: direct attached and network shared. Direct attached allows for each node to be directly attached to a set of storage devices, while network shared assumes the existence of one or more storage nodes that provide and control the access to the storage media.
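As a minimal sketch of the two options, the /etc/fstab fragment below contrasts a direct-attached disk with a file system served over the network by a storage node. The device name, server name, and mount points are hypothetical; GPFS, covered later in this book, is another example of the network shared approach:

   # Direct attached: a disk physically connected to this node
   /dev/sdb1                  /data/local    ext3   defaults   0 2

   # Network shared: a file system exported by a storage node (NFS shown here)
   storage01:/export/shared   /data/shared   nfs    defaults   0 0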
We discuss some of the storage options in more detail in 2.2, "Hardware" on page 24.
1.7.3 Terminal servers
Terminal servers provide the capability to access each node in the cluster as if using a locally attached serial display. The BIOS of compute and storage nodes in xSeries clusters is capable of redirecting the machine's POST output to the serial port. After POST, the boot loader and operating system also use the serial port for display and keyboard input. Terminal servers give cluster operators the ability to use common tools such as telnet, rsh, or ssh to access and communicate with each node in the cluster and, if necessary, with multiple nodes simultaneously. Cluster management packages can then log whatever the nodes send out of the serial port to a file, even when no operator is watching. This gives the operator an out-of-band method of viewing and interacting with the entire boot process of a node, from POST to operating system load, which is useful for debugging the boot process, performing other low-level diagnostics, and normal out-of-band console access. Terminal servers provide this capability by allowing a single display to be virtually attached to all nodes in the cluster. Terminal servers from Equinox and from iTouch Communications, a division of MRV Communications, are examples of such devices commonly used in clusters.
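As a sketch of how these pieces fit together, the node is configured to use its serial port as the console, and the terminal server then exposes that serial line over the network. The GRUB stanza and TCP port below are hypothetical examples; the exact port mapping depends on the terminal server model:

   # /boot/grub/grub.conf - redirect boot loader and kernel output to ttyS0
   serial --unit=0 --speed=9600
   terminal --timeout=5 serial console
   title Linux
       root (hd0,0)
       kernel /vmlinuz ro root=/dev/sda2 console=ttyS0,9600

From the operator's side, each serial port on the terminal server is typically mapped to a TCP port, so the console of a node such as node001 might be reached with:

   telnet termserver01 7001

CSM builds on this mechanism with its rconsole command, which opens the console of a named node without the operator needing to remember the port mapping.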
More information on terminal servers from Equinox can be found at the following Web site:
http://www.equinox.com/
More information on terminal servers from MRV can be found at the following Web site:
http://www.mrv.com/product/MRV-IR-003/features/
1.7.4 Keyboard, video, and mouse switches
Just as terminal servers remove the need for a serially attached display on every node, it is likewise impractical to provide a keyboard and a mouse for every node in a cluster. Therefore, a common Keyboard Video Mouse (KVM) switch is indispensable when building a cluster. It allows an operator to attach to any individual node and perform operations directly on that node when required.
Keep in mind that this is for use by the cluster administrator or operator, not a general user of the cluster.