
8.1 Adding and removing disks from GPFS

Unlike many traditional file systems, GPFS allows disks to be added to and removed from a file system even while it is mounted. In this section, we outline the procedures for working with disks.

8.1.1 Adding a new disk to an existing GPFS file system

For this task, we will use a command that has not been discussed in this redbook before: mmadddisk. This command adds a new disk to a file system and optionally re-balances data onto the new disk.

Before adding the physical disk to the file system, it must first be defined as an NSD. We will use mmcrnsd for this, as in 7.8.1, "GPFS nodeset with NSD network attached servers" on page 213.
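The mmcrnsd command takes a disk descriptor file as input, with one disk per line in the form DiskName:PrimaryServer:BackupServer:DiskUsage:FailureGroup. As a sketch, the descriptor we use in Example 8-1 breaks down as follows (the field comments are ours, not part of the file):

# /dev/sdb1:node001-myri0.cluster.com::dataAndMetadata:-1
#
#   /dev/sdb1                   device on the primary NSD server
#   node001-myri0.cluster.com   primary NSD server
#   (empty)                     no backup NSD server is defined
#   dataAndMetadata             the disk may hold both data and metadata
#   -1                          failure group; -1 means the disk shares no
#                               common point of failure with any other disk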

In Example 8-1, we create an NSD using the second disk from node001.

Example 8-1: Creating an additional NSD with mmcrnsd

[root@storage001 root]# cat > newdisk.dsc
/dev/sdb1:node001-myri0.cluster.com::dataAndMetadata:-1
^D
[root@storage001 root]# mmcrnsd -F newdisk.dsc
mmcrnsd: Propagating the changes to all affected nodes.
This is an asynchronous process.
[root@storage001 root]#

If the disk (/dev/sdb1 in this case) already contains an NSD descriptor, for example from a previous use, the mmcrnsd command will fail. In that case, first verify that the disk is not currently being used as an NSD by issuing the mmlsnsd -m command; once you are sure it is free, re-run mmcrnsd with the -v no option to disable disk verification and define the new disk.
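For instance, a minimal sketch of that check and forced re-creation might look like the following (output omitted; run the second command only after confirming the disk really is unused):

[root@storage001 root]# mmlsnsd -m | grep sdb1
[root@storage001 root]# mmcrnsd -F newdisk.dsc -v no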

Example 8-2 shows the use of the mmlsnsd command to verify that the NSD was created correctly. The newly created NSD should show up as (free disk), since it has not yet been assigned to a file system.

Example 8-2: Verifying correct NSD creation with mmlsnsd

[root@storage001 root]# mmlsnsd

 File system   NSD name     Primary node                  Backup node
 ---------------------------------------------------------------------------
 gpfs0         gpfs2nsd     storage001-myri0.cluster.com
 (free disk)   gpfs3nsd     node001-myri0.cluster.com

[root@storage001 root]#

Once the NSD has been successfully defined, we can use it to enlarge our GPFS file system with mmadddisk. Because GPFS parallelizes read and write operations across all of the disks in a file system, simply appending the disk to the file system is inefficient: data would not be balanced across all the disks, so GPFS would be unable to make optimal use of the new disk. The mmadddisk command can automatically re-balance the data across all the disks through the use of the -r switch. On large file systems, the re-balance can take a long time, so we also supply the asynchronous switch (-a). This causes mmadddisk to return while the re-balance continues in the background.

Note 

Although you can still access the file system while it is being re-balanced, certain GPFS metadata commands, including mmdf, cannot be run until the re-balance has completed.

Example 8-3 shows the output of the mmadddisk command.

Example 8-3: Adding a disk to a GPFS file system with mmadddisk

[root@storage001 root]# mmadddisk gpfs0 -F newdisk.dsc -r -a
GPFS: 6027-531 The following disks of gpfs0 will be formatted on node
    storage001.cluster.com:
    gpfs3nsd: size 17767858 KB
Extending Allocation Map
GPFS: 6027-1503 Completed adding disks to file system gpfs0.
mmadddisk: Propagating the changes to all affected nodes.
This is an asynchronous process.
[root@storage001 root]#

Tip 

Although disks can be re-balanced while the file system is in use, the re-balance will affect performance. You might prefer to add new disks during a period of low activity, or to re-balance the disks at a later time with the mmrestripefs -b command, as in the sketch that follows.
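As a sketch of the deferred approach, assuming the same gpfs0 file system and newdisk.dsc descriptor file used above, the disk could be added without the -r switch and the file system re-striped later, during off-peak hours:

[root@storage001 root]# mmadddisk gpfs0 -F newdisk.dsc
[root@storage001 root]# mmrestripefs gpfs0 -b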

Using the mmlsdisk and mmlsnsd commands, you can verify that the NSD is now a member of the gpfs0 file system, as in Example 8-4.

Example 8-4: Verifying the new NSD was added with mmlsdisk and mmlsnsd

[root@storage001 root]# mmlsdisk gpfs0
disk         driver   sector failure holds    holds
name         type     size   group   metadata data  status        availability
------------ -------- ------ ------- -------- ----- ------------- ------------
gpfs2nsd     nsd         512      -1 yes      yes   ready         up
gpfs3nsd     nsd         512      -1 yes      yes   ready         up
[root@storage001 root]# mmlsnsd

 File system   NSD name     Primary node                  Backup node
 ---------------------------------------------------------------------------
 gpfs0         gpfs2nsd     storage001-myri0.cluster.com
 gpfs0         gpfs3nsd     node001-myri0.cluster.com

[root@storage001 root]#

You can also verify the capacity of your file system using mmdf, as shown in Example 8-5. Remember that mmdf does not take replication into account; if you are using full replication, you will need to divide the reported file system size and free space by two.

Example 8-5: Inspecting file system capacity with mmdf

[root@storage001 root]# mmdf gpfs0
disk                disk size  failure holds    holds         free KB          free KB
name                    in KB  group   metadata data   in full blocks     in fragments
---------------  ------------  ------- -------- ----- ----------------  ---------------
gpfs2nsd            106518944       -1 yes      yes   106454016 (100%)       1112 ( 0%)
gpfs3nsd             17767856       -1 yes      yes    17733632 (100%)        656 ( 0%)
                 ------------                         ----------------  ---------------
(total)             124286800                         124187648 (100%)       1768 ( 0%)

Inode Information
------------------
Total number of inodes:      104448
Total number of free inodes: 104431
[root@storage001 root]#

8.1.2 Deleting a disk in an active GPFS file system

Although this sounds like a risky operation, it is actually perfectly safe; GPFS handles it easily when system utilization is low. Under heavy load, however, it may take a significant amount of time.

Removal is accomplished with the GPFS mmdeldisk command, passing the file system name and the disk (NSD) you want to delete. As with mmadddisk, you can specify that the file system should be re-striped with the -r option, and add -a to perform the re-stripe in the background.

Important: 

The disk to be deleted must be up and running for the mmdeldisk command to succeed; you can verify this by using the mmlsdisk command. If you need to delete a damaged disk, you must use the -p option, which allows a stopped disk to be deleted; a sketch follows.
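A minimal sketch of that sequence is shown below; baddisk is a hypothetical name for a damaged NSD in gpfs0:

[root@storage001 root]# mmlsdisk gpfs0
[root@storage001 root]# mmdeldisk gpfs0 baddisk -p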

Example 8-6 shows the removal of gpfs3nsd that we just added to our file system.

Example 8-6: Removing a disk from GPFS with mmdeldisk

[root@storage001 root]# mmdeldisk gpfs0 gpfs3nsd -r -a
Deleting disks ...
GPFS: 6027-589 Scanning file system metadata, phase 1 ...
 31 % complete on Thu Nov 24 17:12:55 2002
 62 % complete on Thu Nov 24 17:12:58 2002
 93 % complete on Thu Nov 24 17:13:01 2002
100 % complete on Thu Nov 24 17:13:02 2002
GPFS: 6027-552 Scan completed successfully.
GPFS: 6027-589 Scanning file system metadata, phase 2 ...
GPFS: 6027-552 Scan completed successfully.
GPFS: 6027-589 Scanning file system metadata, phase 3 ...
GPFS: 6027-552 Scan completed successfully.
GPFS: 6027-565 Scanning user file metadata ...
 84 % complete on Thu Nov 24 17:13:08 2002
100 % complete on Thu Nov 24 17:13:08 2002
GPFS: 6027-552 Scan completed successfully.
GPFS: 6027-370 tsdeldisk completed.
mmdeldisk: Propagating the changes to all affected nodes.
This is an asynchronous process.
[root@storage001 root]#

Again, we can use the mmlsdisk and mmlsnsd commands to verify the successful removal of the disk, as in Example 8-7.

Example 8-7: Verifying successful disk removal with mmlsdisk and mmlsnsd

[root@storage001 root]# mmlsdisk gpfs0
disk         driver   sector failure holds    holds
name         type     size   group   metadata data  status        availability
------------ -------- ------ ------- -------- ----- ------------- ------------
gpfs2nsd     nsd         512      -1 yes      yes   ready         up
# mmlsnsd

 File system   NSD name     Primary node                  Backup node
 ---------------------------------------------------------------------------
 gpfs0         gpfs2nsd     storage001-myri0.cluster.com
 (free disk)   gpfs3nsd     node001-myri0.cluster.com

[root@storage001 root]#

Example 8-8 on page 230 shows how we could also have used mmlsnsd -F to show only free NSDs in our nodeset.

Example 8-8: Listing free NSDs with mmlsnsd -F

[root@storage001 root]# mmlsnsd -F

 File system   NSD name     Primary node                  Backup node
 ---------------------------------------------------------------------------
 (free disk)   gpfs3nsd     node001-myri0.cluster.com

[root@storage001 root]#

8.1.3 Replacing a failing disk in an existing GPFS file system

GPFS allows a failing disk to be replaced while the file system is up and running by using the mmrpldisk command. Although replacing a disk with one of a different size is supported, it can introduce complications and should be avoided whenever possible. It is further recommended that you do not change the disk usage (data/metadata) or failure group if you can avoid it.

Important: 

You cannot use the mmrpldisk command to replace disks that have actually failed; those should be removed with the mmdeldisk -p command instead. Verify that the disk to be replaced is available and up with the mmlsdisk command before attempting this procedure.

As when adding a new disk, the replacement disk must first be defined as an NSD. Example 8-9 shows the definition of the disk with the mmcrnsd command.

Example 8-9: Defining a replacement disk with mmcrnsd

[root@storage001 tmp]# cat > rpldisk.dsc
/dev/sdb1:node002-myri0.cluster.com::dataAndMetadata:-1
^D
[root@storage001 tmp]# mmcrnsd -F rpldisk.dsc -v no
mmcrnsd: Propagating the changes to all affected nodes.
This is an asynchronous process.
[root@storage001 tmp]#

Now we can run the mmrpldisk command to actually perform the replacement, as shown in Example 8-10. We replace the failing disk gpfs3nsd with the newly created NSD.

Example 8-10: Replacing a disk with mmrpldisk

[root@storage001 tmp]# mmrpldisk gpfs0 gpfs3nsd -F rpldisk.dsc
Replacing gpfs3nsd ...
GPFS: 6027-531 The following disks of gpfs0 will be formatted on node
    storage001.cluster.com:
    gpfs5nsd: size 17767858 KB
Extending Allocation Map
GPFS: 6027-1503 Completed adding disks to file system gpfs0.
GPFS: 6027-589 Scanning file system metadata, phase 1 ...
 66 % complete on Thu Nov 24 17:42:44 2002
100 % complete on Thu Nov 24 17:42:45 2002
GPFS: 6027-552 Scan completed successfully.
GPFS: 6027-589 Scanning file system metadata, phase 2 ...
GPFS: 6027-552 Scan completed successfully.
GPFS: 6027-589 Scanning file system metadata, phase 3 ...
GPFS: 6027-552 Scan completed successfully.
GPFS: 6027-565 Scanning user file metadata ...
GPFS: 6027-552 Scan completed successfully.
Done
mmrpldisk: Propagating the changes to all affected nodes.
This is an asynchronous process.
[root@storage001 tmp]#

Note 

If you are replacing the failing disk with an identical disk (same size, usage, and failure group), no re-balance is required. Otherwise, you may want to run the mmrestripefs -b command at a time when the system is not heavily loaded.

We can now verify that the failing disk has been removed from the file system, as shown in Example 8-11.

Example 8-11: Using mmlsdisk and mmlsnsd to ensure a disk has been replaced

[root@storage001 tmp]# mmlsdisk gpfs0
disk         driver   sector failure holds    holds
name         type     size   group   metadata data  status        availability
------------ -------- ------ ------- -------- ----- ------------- ------------
gpfs2nsd     nsd         512      -1 yes      yes   ready         up
gpfs4nsd     nsd         512      -1 yes      yes   ready         up
# mmlsnsd -F

 File system   NSD name     Primary node                  Backup node
 ---------------------------------------------------------------------------
 (free disk)   gpfs3nsd     node001-myri0.cluster.com

[root@storage001 tmp]#

