Quintero - Deploying Linux on IBM E-Server Pseries Clusters

A GPFS cluster requires an underlying RSCT peer domain. In this section, we describe the commands we used to create an RSCT peer domain on our experimental setup.

There are currently two types of domains in RSCT:

  • A management domain, with one management server and managed nodes. This is typically a CSM domain. The RSCT communication in this model flows between the management server and the nodes; it never flows between the nodes themselves.

  • A peer domain, where all nodes have the same role as far as RSCT is concerned. The information about the domain is available to all participating nodes through the use of Topology Services and Group Services, which are started upon creation of the peer domain.

Note that we do not cover every aspect of peer domains here. For a complete description of RSCT, refer to the RSCT documentation:

http://www.ibm.com/servers/eserver/pseries/library/clusters/rsct.html

Chapter 2 of the Administration Guide contains information about creating and administering peer domains. Section 4.3 in the IBM Redbook A Practical Guide for Resource Monitoring and Control (RMC), SG24-6615, contains an example of a peer domain setup.

Our experimental setup for GPFS, and the underlying RSCT peer domain, consists of two nodes connected by a 10/100 Mbit Ethernet and a Gigabit Ethernet. One network is used primarily for control messages, and the second network is used for GPFS communication.

Table 6-1 summarizes the names and interfaces of our nodes.

Table 6-1. Network interfaces and corresponding names

Node    eth0: 10/100 Mbit    eth1: 1000 Mbit
node1   r01n33               gr01n33
node2   r01n34               gr01n34

6.2.1 RSCT requisites

In our testing, we used RSCT version 2.3.2-1. These filesets must be installed on the nodes prior to defining the peer domains:

  • src-1.2.0

  • rsct.core.utils-2.3.2

  • rsct.basic-2.3.2

  • rsct.64bit-2.3.2

  • rsct.core-2.3.2

6.2.2 Overview of peer domain operations

RSCT provides commands for adding, removing, starting, and stopping nodes inside a peer domain, and commands to create, start, stop, and remove peer domains. Commands acting on nodes end in rpnode. Commands for domains end in rpdomain.
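The naming convention can be illustrated with a small shell sketch. The helper name rp_command_scope is hypothetical and is not part of RSCT; it only inspects the suffix of a command name and runs nothing.

```shell
#!/bin/sh
# Classify an RSCT peer domain command by its suffix.
# Prints "node", "domain", or "other". Hypothetical helper, not an RSCT tool.
rp_command_scope() {
    case "$1" in
        *rpnode)   echo node ;;
        *rpdomain) echo domain ;;
        *)         echo other ;;
    esac
}

# Commands that appear in this section
for cmd in preprpnode lsrpnode stoprpnode \
           mkrpdomain lsrpdomain startrpdomain stoprpdomain rmrpdomain; do
    printf '%-14s %s\n' "$cmd" "$(rp_command_scope "$cmd")"
done
```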

6.2.3 Creating a peer domain

If the GPFS nodes are part of a CSM cluster, you must tell RSCT that the commands need to operate on a peer domain. To do this, issue the command shown in Example 6-1.

Example 6-1. Setting the RSCT management scope

lpar1:~ # export CT_MANAGEMENT_SCOPE=2
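A script that wraps the peer domain commands might guard against a missing scope setting before doing anything else. The following is a minimal sketch; the function name require_peer_domain_scope is hypothetical, and the only value it accepts is the 2 used in Example 6-1.

```shell
#!/bin/sh
# Succeed only when CT_MANAGEMENT_SCOPE is set to 2, the value used in
# Example 6-1 to select the peer domain. Hypothetical helper, not an RSCT tool.
require_peer_domain_scope() {
    if [ "${CT_MANAGEMENT_SCOPE:-}" != "2" ]; then
        echo "CT_MANAGEMENT_SCOPE is '${CT_MANAGEMENT_SCOPE:-unset}', expected 2" >&2
        return 1
    fi
}

CT_MANAGEMENT_SCOPE=2
export CT_MANAGEMENT_SCOPE
require_peer_domain_scope && echo "scope OK"
```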

Prepare the nodes

Before nodes can be incorporated into a peer domain, their security environment has to be prepared. During this step, the nodes exchange public keys. This is done with the preprpnode command, as shown in Example 6-2.

The preprpnode command takes as its argument a list of all the nodes from which this node can accept RSCT messages. In a GPFS cluster, this command needs to be run on all the participating nodes.

Example 6-2. preprpnode command

r01n33:~ # preprpnode -V r01n33 r01n34
Beginning to prepare nodes to add to the peer domain.
Completed preparing nodes to add to the peer domain.
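Because preprpnode has to be run on every participating node, the step lends itself to a loop. The sketch below is one possible way to do it and assumes passwordless root ssh between the nodes, which the text does not state. The function name prep_all_nodes and the RUN variable are hypothetical; setting RUN=echo prints the commands instead of executing them.

```shell
#!/bin/sh
# Run preprpnode on every node in the list, passing the full node list
# as the argument each time.
# Assumption: passwordless root ssh to each node (not stated in the text).
# Set RUN=echo for a dry run that only prints the commands.
prep_all_nodes() {
    for target in "$@"; do
        ${RUN:-} ssh "root@$target" preprpnode "$@"
    done
}

# Dry run for the two nodes of Table 6-1
RUN=echo prep_all_nodes r01n33 r01n34
```

The dry run prints one ssh command per node, which can be reviewed before running the function without RUN set.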

Create the peer domain

Once the nodes are prepared, we can proceed to create the domain. This is done with the mkrpdomain command. A peer domain is characterized by a name, the names of the constitutive nodes, and the UDP ports that Topology Services and Group Services will use. The command shown in Example 6-3 creates the peer domain named itso. We run this command once on one of the nodes.

Example 6-3. Create a peer domain

r01n33:~ # mkrpdomain -V itso r01n33 r01n34
Making the peer domain "itso".
Completed making the peer domain "itso".

The lsrpdomain command shows the status of the peer domain we have just created. In Example 6-4 the domain is shown as "Offline" because we have not yet started it.

Example 6-4. Check the status of the peer domain

r01n33:~ # lsrpdomain
Name OpState RSCTActiveVersion MixedVersions TSPort GSPort
itso Offline 2.3.2.0           No            12347  12348
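The UDP ports assigned to Topology Services and Group Services can be picked out of the lsrpdomain listing, for example to check firewall rules. This is a minimal sketch; the function name domain_ports is hypothetical, and it assumes the ports are the last two columns, as they are in Example 6-4.

```shell
#!/bin/sh
# Print "TSPort GSPort" for the named domain, reading lsrpdomain output on stdin.
# Assumption: the two ports are the last two columns, as in Example 6-4.
domain_ports() {
    awk -v name="$1" '$1 == name { print $(NF - 1), $NF }'
}

# Sample data copied from Example 6-4; on a node, pipe lsrpdomain instead.
printf '%s\n' \
    'Name OpState RSCTActiveVersion MixedVersions TSPort GSPort' \
    'itso Offline 2.3.2.0 No 12347 12348' | domain_ports itso
```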

6.2.4 Starting and stopping a peer domain

To bring the peer domain online, we use the startrpdomain command as shown in Example 6-5. This command can be run on any node in the domain, but it must be issued on only one node.

Immediately afterward, we check the domain status with lsrpdomain. It can take up to one minute for the domain to come up. The status should change from "Pending online" to "Online".

Example 6-5. Start the peer domain and watch it come online

r01n33:~ # startrpdomain itso
r01n33:~ # lsrpdomain
Name OpState        RSCTActiveVersion MixedVersions TSPort GSPort
itso Pending online 2.3.2.0           No            12347  12348

After a while ...

r01n33:~ # lsrpdomain
Name OpState RSCTActiveVersion MixedVersions TSPort GSPort
itso Online  2.3.2.0           No            12347  12348
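Instead of rerunning lsrpdomain by hand, a script can poll until the domain reports "Online". The sketch below is one way to do this, not an RSCT facility. The function names and the LSRPDOMAIN override variable are hypothetical; the override exists only so that the loop can be exercised without RSCT installed. The parser assumes the column layout shown in Example 6-5, where OpState may contain a space.

```shell
#!/bin/sh
# Print the OpState of the named domain, reading lsrpdomain output on stdin.
# OpState can be two words ("Pending online"), so join every field between
# the name and the last four columns.
domain_opstate() {
    awk -v name="$1" '
        $1 == name {
            state = $2
            for (i = 3; i <= NF - 4; i++) state = state " " $i
            print state
        }'
}

# Poll every 5 seconds until the domain is Online or the timeout expires.
# Usage: wait_for_online DOMAIN [TIMEOUT_SECONDS]   (default 60)
wait_for_online() {
    domain=$1
    timeout=${2:-60}
    elapsed=0
    state=unknown
    while [ "$elapsed" -le "$timeout" ]; do
        state=$(${LSRPDOMAIN:-lsrpdomain} | domain_opstate "$domain")
        [ "$state" = "Online" ] && return 0
        sleep 5
        elapsed=$((elapsed + 5))
    done
    echo "domain $domain still '$state' after ${timeout}s" >&2
    return 1
}
```

On a node, the sequence would then be startrpdomain itso followed by wait_for_online itso 60.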

Upon startup of the peer domain, the Topology Services and Group Services subsystems will be started on the nodes. To check the proper operation of these services, use the lssrc command as shown in Example 6-6.

The lssrc command is part of the System Resource Controller (SRC), which has been ported from AIX to Linux. SRC provides a way to control subsystems; a subsystem must register with SRC in order to be controlled.

Example 6-6. Use lssrc to query Topology/Group Services

r01n33:~ # lssrc -a
Subsystem      Group    PID    Status
IBM.ConfigRM   rsct_rm  781    active
ctcas          rsct     4376   active
cthats         cthats   10839  active
cthags         cthags   10843  active
ctrmc          rsct     10933  active
IBM.ERRM       rsct_rm  10947  active
IBM.AuditRM    rsct_rm  10968  active
IBM.HostRM     rsct_rm         inoperative
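The check for Topology Services (cthats) and Group Services (cthags) can be scripted by filtering the lssrc listing. This is a minimal sketch with hypothetical function names; it reads the listing on stdin and assumes that the status is the last column, as in Example 6-6.

```shell
#!/bin/sh
# Succeed if the named subsystem is listed as "active" in lssrc -a output (stdin).
# The status is taken from the last column because inoperative subsystems
# have no PID column.
subsystem_active() {
    awk -v sub_name="$1" '
        $1 == sub_name && $NF == "active" { found = 1 }
        END { exit !found }'
}

# Report on the two subsystems a peer domain starts; fail on the first
# one that is not active.
check_peer_services() {
    listing=$(cat)
    for s in cthats cthags; do
        if printf '%s\n' "$listing" | subsystem_active "$s"; then
            echo "$s: active"
        else
            echo "$s: NOT active"
            return 1
        fi
    done
}

# On a node:  lssrc -a | check_peer_services
```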

The status of individual nodes can be displayed with the lsrpnode command, as shown in Example 6-7.

Example 6-7. lsrpnode to check each node

r01n33:~ # lsrpnode
Name    OpState RSCTVersion
gr01n34 Online  2.3.2.0
gr01n33 Online  2.3.2.0

To stop a peer domain, the nodes can be stopped individually with the stoprpnode command, as shown in Example 6-8. We use the lsrpnode command to verify that the node was stopped correctly.

Example 6-8. Stopping a node in the peer domain

r01n33:~ # stoprpnode gr01n34
r01n33:~ # lsrpnode
Name    OpState RSCTVersion
gr01n34 Offline 2.3.2.0
gr01n33 Online  2.3.2.0
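To verify node state from a script, the lsrpnode listing can be filtered for nodes that are not online. The function name offline_nodes is hypothetical; the sketch assumes the first line is the header and that the second column is OpState, as in Example 6-8.

```shell
#!/bin/sh
# Print the names of nodes whose OpState is not "Online",
# reading lsrpnode output on stdin. The header line is skipped.
offline_nodes() {
    awk 'NR > 1 && $2 != "Online" { print $1 }'
}

# Sample data copied from Example 6-8; on a node, pipe lsrpnode instead.
printf '%s\n' \
    'Name OpState RSCTVersion' \
    'gr01n34 Offline 2.3.2.0' \
    'gr01n33 Online 2.3.2.0' | offline_nodes
```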

We can use the stoprpdomain command to stop all nodes at once. This is shown in Example 6-9.

Example 6-9. Stopping the whole peer domain

r01n33:~ # lsrpnode
Name    OpState RSCTVersion
gr01n34 Online  2.3.2.0
gr01n33 Online  2.3.2.0
r01n33:~ # stoprpdomain itso
r01n33:~ # lsrpnode
lsrpnode: There are no nodes in the peer domain or an online peer domain does not exist.

6.2.5 Removing a peer domain

To remove the peer domain, you must bring it online first and then use the rmrpdomain command, as shown in Example 6-10. In that example, the first attempt fails because the domain is not online.

For some unknown reason, the peer domain may survive. To remove it permanently, use the -f (force) option.

Example 6-10. Removing a peer domain

r01n33:~ # rmrpdomain itso
rmrpdomain: There are no nodes in the peer domain or an online peer domain does not exist.
r01n33:~ # rmrpdomain -f itso
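The two-step sequence of Example 6-10 can be wrapped in a small function that falls back to the force option only when the plain removal fails. This is a sketch, not a recommended procedure: forcing automatically may hide a problem that deserves a closer look. The function name remove_peer_domain and the RMRPDOMAIN override variable are hypothetical; the override exists only for testing without RSCT.

```shell
#!/bin/sh
# Try a plain rmrpdomain first; if it fails, retry with -f as in Example 6-10.
remove_peer_domain() {
    if ${RMRPDOMAIN:-rmrpdomain} "$1"; then
        return 0
    fi
    echo "plain removal of '$1' failed, retrying with -f" >&2
    ${RMRPDOMAIN:-rmrpdomain} -f "$1"
}

# On a node:  remove_peer_domain itso
```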
