Upgrading and Repairing Networks (5th Edition)
The Network File System (NFS) consists of several protocols that perform specific functions. Sun Microsystems has published the specifications for NFS so that other vendors can easily implement these protocols to allow for remote mounting of file systems independent of the operating system of the computers. RFC 1094 defines the most widely used version of NFS (version 2). RFC 1813 documents version 3, which adds better support for wide area networking. If you think you will be involved in troubleshooting NFS on the network, you should find out on which version your NFS software is based and become familiar with these documents.
NFS is built on routines made up of remote procedure calls (RPC). XDR is used as the data format so that data from different systems can be represented in a common format for interchange. In addition, the Mount protocol is used to make the initial connection to a remote file system. Because NFS is built in this layered fashion, and problems can occur at any level, you will need to understand not only how the NFS protocol functions, but also RPC, XDR, and the Mount protocol.
Protocol Components: Remote Procedure Call (RPC) Protocol
RPC is a simple client/server protocol application. RPC defines the interaction between a client, which formats a request for execution by the server, and the server, which executes the client's request on the local system. The server performs whatever processing is required and returns the data and control of the procedure to the client. Sun developed RPC for use in NFS, but it has since been employed quite usefully by many other client/server-based products.
The rpcbind daemon (a process that runs in the background waiting for requests) runs on both the client and the server and is responsible for implementing RPC protocol exchanges between hosts on the network. A service is a group of RPC procedures that have been grouped together into programs. A unique number is used to identify each service, which means that more than one service can operate at any given time. An application that needs to use a service can use the different programs that make up the service to perform specific actions. For example, when designing an NFS service, one program might be responsible for determining a file's attributes, and another program might be responsible for the actual transfer of data between the client and server computers.
The unique service number is used to identify different network services that run on a particular system, and the mapping for this is usually found in the file /etc/rpc. The RFC that defines RPC sets forth numbers used for many common services, and these are shown in Table 35.3.
Table 35.3. Numbers Used to Identify RPC Services
The portmapper service (using port 111 for UDP or TCP) manages the port numbers used in TCP/IP communications. Because there can be more than one open connection between a client and a server, a port number is used to identify each connection. Don't confuse port numbers with the numbers assigned to services. Service numbers are used to identify a particular RPC service. Port numbers identify connections between two computers that use a service.
External Data Representation (XDR)
A common format is needed when exchanging data between computer systems that are running different operating systems. Some use ASCII code for text, whereas others use Unicode. Some use big-endian encoding techniques, whereas others use little-endian, which determines the order in which bytes are used to represent data (left to right or right to left). It is even more complicated when you look at how different computer systems represent numeric data in memory or storage. When using a multiple-byte value to represent a floating-point number, for example, you need to know which bits are used for the exponent and which are used for the mantissa.
NFS uses the External Data Representation (XDR) standard for data exchange. The details of XDR are covered in RFC 1014. It is a C-like notation for representing data, not a programming language itself. An item, such as a character or numeric value, is represented in XDR by using 4 bytes (32 bits), with the lower-numbered bytes being the most significant (that is, big-endian order). Other encoding features of XDR include the following:
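As a rough illustration of the 4-byte, most-significant-byte-first encoding described above, the following Python sketch packs an integer and a string the way RFC 1014 lays them out. The function names xdr_int and xdr_string are hypothetical, and the string rule (a 4-byte length followed by the data, zero-padded to a multiple of 4 bytes) is taken from RFC 1014 rather than from this chapter.

```python
import struct


def xdr_int(value: int) -> bytes:
    """Encode a signed 32-bit integer as 4 bytes, most significant byte first."""
    return struct.pack(">i", value)


def xdr_string(text: str) -> bytes:
    """Encode a string as a 4-byte length, then the bytes, zero-padded to a 4-byte boundary."""
    data = text.encode("ascii")
    padding = (4 - len(data) % 4) % 4  # 0 to 3 pad bytes
    return struct.pack(">I", len(data)) + data + b"\x00" * padding


if __name__ == "__main__":
    print(xdr_int(1).hex())         # 00000001
    print(xdr_string("nfs").hex())  # 000000036e667300
```

Because every item occupies a multiple of 4 bytes in a fixed byte order, a receiver can decode the stream without knowing the sender's native byte order.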
XDR provides an extensible data description format that makes implementing NFS on multiple hardware and software platforms much easier.
The NFS Protocol and Mount Protocol
The NFS protocol is a set of procedures (called primitives) that are executed via RPC to allow an action to be performed on a remote computer. NFS is a stateless protocol, which means that the server does not have to maintain information about the state of each client. If the server (or the network) fails, the client needs only to repeat the operation. The server doesn't have to rebuild any data tables or other structures to recover the state of a client after a failure.
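The recovery behavior described above can be sketched in a few lines of Python. This is only an illustration of the idea that a self-contained request can simply be repeated, not how any real NFS client is implemented; call_with_retry and its parameters are hypothetical names.

```python
import time


def call_with_retry(operation, attempts=3, delay=0.0):
    """Repeat a self-contained request until it succeeds or the attempts run out.

    Each request carries everything the server needs, so repeating it
    after a timeout requires no state recovery on the server side.
    """
    if attempts < 1:
        raise ValueError("attempts must be at least 1")
    last_error = None
    for _ in range(attempts):
        try:
            return operation()
        except TimeoutError as exc:  # treated here as "server or network failed"
            last_error = exc
            time.sleep(delay)
    raise last_error
```

The caller passes a function that performs one complete request. If the server was down for the first few tries, the same request eventually succeeds unchanged.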
The RPC procedures that make up the NFS protocol are the following:
There is no provision in these procedures to open or close a file. Because NFS is a stateless protocol, it doesn't handle file opens or closes. The Mount protocol performs this function and returns a file handle to NFS. The mountd daemon runs on both the client and the server computer and is responsible for maintaining a list of current connections. Most implementations of NFS recover from client crashes by having the client send a message to the NFS server when it boots, telling it to unmount all its previous connections to the client. When compared to the NFS protocol, the Mount protocol consists of only a very few procedures:
Configuring NFS Servers and Clients
The biod daemon runs on the client system and communicates with the remote NFS server. The daemon also processes the data that is transferred between the NFS client and the NFS server. The RPC daemon must also be running, and either UDP or TCP needs to be running, depending on which one your version of NFS uses as a transport. Users can then use the mount command to mount a file system offered by an NFS server, provided that the server does not prevent them from mounting it.
NFS Client Daemons
On the client side of the NFS process, there are actually three daemon processes that are used. The first is biod, which stands for block input/output daemon. This daemon processes the input/output with the NFS server on behalf of the user process that is making requests of the remote file system. If you use NFS heavily on a client, you can improve performance by starting up more than one biod daemon. The syntax used to start the daemon is as follows:
    /etc/biod [ number of daemon processes ]
This daemon is usually started in the /etc/rc.local startup file. Modify this file if you want to permanently change the number of daemons running on the client system. You can first test by executing the command online to determine how many daemons you need to start and then place the necessary commands in the startup file.
When deciding performance issues, remember that on a heavily loaded client, making a change in one place might result in poorer performance from another part of the system. So don't assume that you need a lot of extra daemons running unless you can first show that they are needed and do improve performance. Each daemon process is like any other process running on the system, and it uses up system resources, especially memory. Begin by using one or two daemons if you are using a workstation dedicated to one user. For a multiple-user computer, test your performance by increasing the number of daemons until NFS performance is satisfactory (all the time checking, of course, other performance indicators to be sure that the overall system impact is justified).
Although having multiple daemons means that NFS requests can be processed in parallel, remember that the network itself might be a bottleneck. Additional biod daemons will not increase throughput when the network itself is the limiting factor. Also note that the biod daemon is a client process. You should not run it on an NFS server unless that server is also a client of another NFS server.
In addition to the biod daemon, the lockd and statd daemons also run on the client. For more information on these, see the section "Server-Side Daemons," later in this chapter.
The mount Command
The mount command is used to mount a local file system, and you can also use the command to mount a remote NFS file system. The syntax for using mount to make available a file system being exported by an NFS server is as follows:
    mount -F nfs -o options machine:filesystem mountpoint
In some versions of Unix, the syntax for mounting a remote NFS file system is a little different. For example, in SCO Unix you use a lowercase f and an uppercase NFS:
    mount -f NFS -o options machine:filesystem mountpoint
In BSD Unix, there is a command called mount_nfs, which uses the system call mount to perform most of its functions. This version of the mount command comes with a lot of additional parameters, including the capability to specify on the mount command line whether to use UDP or TCP as the underlying transport mechanism.
In the value you supply for machine:filesystem, substitute the hostname of the remote server that is exporting the file system you want to mount for machine, and the name of the file system for filesystem. The following example causes the remote file system on host zira, called /usr/projectx/docs, to be made accessible in the local file system hierarchy at the /usr/docs directory:
    mount -F nfs -o ro zira:/usr/projectx/docs /usr/docs
This is the same way you mount local file systems into the local hierarchy. Under the /usr/docs directory, you can access any subdirectories that exist on host zira under the /usr/projectx/docs directory.
The -o parameter can be used to specify options for the mount command. In the preceding example, the letters ro for the option were used to make the remote file system read-only by users on the local computer. Other options that can be used when mounting a remote file system include the following:
For more command-line parameters and options, see the man page for the mount command for your particular system.
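To make the pieces of the mount syntax concrete, here is a small Python sketch that assembles the argument list for the -F nfs form shown earlier. It only builds the command and does not run it. The helper name build_mount_argv is hypothetical, and joining several options with commas after -o is an assumption based on common mount usage; check your system's man page.

```python
def build_mount_argv(machine, filesystem, mountpoint, options=()):
    """Build the argument list for: mount -F nfs -o options machine:filesystem mountpoint."""
    if not filesystem.startswith("/"):
        raise ValueError("the remote file system should be an absolute path")
    argv = ["mount", "-F", "nfs"]
    if options:
        # Assumption: multiple options are given as one comma-separated list.
        argv += ["-o", ",".join(options)]
    argv += [f"{machine}:{filesystem}", mountpoint]
    return argv


if __name__ == "__main__":
    # Read-only mount of /usr/projectx/docs from host zira at /usr/docs.
    print(" ".join(build_mount_argv("zira", "/usr/projectx/docs", "/usr/docs", ["ro"])))
```

Checking that the remote path is absolute guards against an easy mistake: dropping the slash that follows the colon in machine:filesystem.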
The mountpoint is the path to the location in the local file system where the remote NFS file system will appear, and this path must exist before the mount command is issued. Any files existing in the mountpoint directory will no longer be accessible to users after a remote file system is attached to the directory with the mount command, so do not use just any directory. Note that the files are not lost. They reappear when the remote file system is unmounted.
Using the fstab File to Mount File Systems at Boot Time
When you have file systems that need to be remounted each time the system reboots, you can use the file /etc/fstab to do this. This file is also used to mount local file systems, so be careful when making edits. The format for a record is as follows:
    filesystem directoryname type options frequency pass
The filesystem field for a record used to mount a remote file system includes the server hostname and the pathname of the remote file system separated by a colon (hostname:path). The second field, directoryname, is the path for the mountpoint on the local system, which indicates where the remote system is mounted and made available for access. The next field, type, is used to specify the file-system type, which can be any of the following:
The options field is used for a comma-delimited list of mounting options ( rw , ro , and so on). The frequency is used in determining when a file system will be "dumped" for backup purposes. This can usually be set to zero for NFS systems mounted on a client because it is usually the NFS server that is responsible for making backups of local data. The final field, pass , can also be set to zero most of the time for an NFS file system mounted on a client. This field is used by the fsck utility to determine on which pass it is to check this file system.
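The six-field record format described above is easy to check with a short script. The following Python sketch splits one record into its fields; the names FstabRecord, parse_fstab_line, and is_remote are hypothetical, and the sample record is an illustration built from the zira example earlier in the chapter.

```python
from typing import List, NamedTuple


class FstabRecord(NamedTuple):
    filesystem: str      # hostname:path for a remote file system
    directoryname: str   # local mountpoint
    fs_type: str         # file-system type, for example nfs
    options: List[str]   # comma-delimited in the file
    frequency: int       # dump frequency, usually 0 for NFS mounts
    pass_number: int     # fsck pass, usually 0 for NFS mounts


def parse_fstab_line(line: str) -> FstabRecord:
    """Split one whitespace-separated fstab record into its six fields."""
    fields = line.split()
    if len(fields) != 6:
        raise ValueError(f"expected 6 fields, found {len(fields)}")
    filesystem, directoryname, fs_type, options, frequency, pass_number = fields
    return FstabRecord(filesystem, directoryname, fs_type,
                       options.split(","), int(frequency), int(pass_number))


def is_remote(record: FstabRecord) -> bool:
    """A hostname:path filesystem field marks a remote file system."""
    return ":" in record.filesystem


if __name__ == "__main__":
    record = parse_fstab_line("zira:/usr/projectx/docs /usr/docs nfs ro 0 0")
    print(record, is_remote(record))
```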
Server-Side Daemons
The nfsd daemon process handles requests from NFS clients for the server. The nfsd daemon interprets requests and sends them to the I/O system to perform the requests' actual functions. The daemon communicates with the biod daemon on the client, processing requests and returning data to the requestor's daemon. An NFS server will usually be set up to serve multiple clients. You can set up multiple copies of the nfsd daemon on the server so that the server can handle multiple client requests in a timely manner. The syntax for the command to start the daemon is as follows:
    /etc/nfsd [ number of nfs daemons to start ]
For example, to start up five copies of the nfsd daemon at boot time, modify your startup scripts to include the following command:
    /etc/nfsd 5
Unix systems and the utilities that are closely associated with them are continually being updated or improved. Some new versions use threads to make it possible for a daemon to be implemented as a multithreaded process, capable of handling many requests at one time. Digital Unix 4.0 (now HP Tru64 Unix) is an operating system that provides a multithreaded NFS server daemon. Other daemons the NFS server runs include the lockd daemon to handle file locking and the statd daemon to help coordinate the status of current file locks.
Configuring Server Daemons
For an NFS server, choose a computer that has the hardware capabilities needed to support your network clients. If the NFS server will be used to allow clients to view seldom-used documentation, a less-powerful hardware configuration might be all you need. If the server is going to be used to export a large number of directories, say from a powerful disk storage subsystem, the hardware requirements become much more important. You will have to make capacity judgments concerning the CPU power, disk subsystems, and network adapter card performance.
Setting up an NFS server is a simple task. Create a list of the directories that are to be exported, and place entries for these in the /etc/exports file on the server. At boot time the exportfs program starts and obtains information from this file. The exportfs program uses this data to make exported directories available to clients that make requests.
Sharing File Systems: The exportfs Command
At system boot time, the exportfs program is usually started by the /sbin/init.d/nfs.server script file, but this can vary, depending on the particular implementation of Unix you are using. The exportfs program reads the information in the /etc/exports configuration file. The syntax for this command varies, depending on what actions you want to perform:
    /usr/sbin/exportfs [-auv]
    /usr/sbin/exportfs [-uv] [dir ...]
    /usr/sbin/exportfs -i [-o options] [-v] [dir ...]
The parameters and options you can use with this command are listed here:
The options you can specify after the -o qualifier are the same as you use in the /etc/exports file (see the following section, "Configuration Files"). To export all entries found in the /etc/exports file, use the -a option; to unexport (stop sharing) all of them, combine it with the -u option. This is probably the most-often-used form because you can specify the other options you need on a per-directory basis in the /etc/exports file. This example causes all directories listed in /etc/exports to be available for use by remote clients:
    exportfs -a
The following example causes your NFS server to stop sharing all the directories listed for export in the /etc/exports file:
    exportfs -au
The second form can be used to export or unexport (stop exporting) a particular directory (or directories) instead of all directories. You specify the directories on the command line. You can use this form if you want to stop sharing a particular directory because of system problems or maintenance, for example. Using the following syntax causes the NFS server to stop sharing the /etc/users/accounting directory with remote users:
    exportfs -u /etc/users/accounting
The third form of the command, which uses the -i parameter, can be used to ignore the options found in the /etc/exports file. Instead, you can supply them (using the -o parameter) on the command line. You will probably use this in special cases because you could just as easily change the options in the /etc/exports file if the change were a permanent one. If, for example, you decided that you wanted to make an exported directory that is currently set to be read-write to be read-only, you could use the following command:
    exportfs -i -o ro /etc/users/purch
You can also dismount and mount remote file systems using different options when troubleshooting or when researching the commands you will need when preparing to upgrade a network segment where connections need to change.
If changes are made to the /etc/exports file while the system is running, use the exportfs command (with the -a parameter) to make the changes take effect. To get a list of directories that are currently being exported, you can execute the command with no options, and it will show you a list. Of course, it is not necessarily a good idea to make changes on-the-fly without keeping track of the connections. When you decide to perform online testing to mount or dismount file systems, be sure that you are not going to impact any users who are currently making productive use of the resources. To make testing more foolproof and to provide a quick back-out procedure, try copying the /etc/exports file to keep a safe starting copy and making changes to the copied file, loading it by using the exportfs -a command. When you determine that something has been done incorrectly, you can simply use the backup copy of the file you have made to restore the status quo.
Configuration Files
To make a file system or a directory in a file system available for export, add the pathnames to the /etc/exports file. The format for an entry in this file is as follows:
    directory [-option, ...]
The term directory is a pathname for the directory you want to share with other systems. The options you can include are the following:
For example:
    /etc/users/acctpay -access=acct
    /etc/users/docs -ro
    /etc/users/reports/monthend -rw=ono
In this file, the first directory, /etc/users/acctpay, which stores accounts payable files, will be shared with a group called acct (the accounting department). The /etc/users/docs directory can be accessed by anyone in read-only mode. The /etc/users/reports/monthend directory can be accessed in read-only mode by most users, but users on the computer whose hostname is ono will have read-write access.
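The entry format lends itself to a quick sanity check before a file is loaded. The Python sketch below splits one /etc/exports entry of the form "directory [-option, ...]" into the directory and a dictionary of options. The function name parse_exports_line is hypothetical, and the parser assumes the simple dash-prefixed, comma-separated option syntax shown in this section; other Unix variants use different export file syntax.

```python
def parse_exports_line(line: str):
    """Split 'directory [-option, ...]' into (directory, {option: value or None})."""
    parts = line.split(None, 1)
    if not parts:
        raise ValueError("empty exports entry")
    directory = parts[0]
    options = {}
    if len(parts) == 2:
        # Drop the leading dash, then read each comma-separated option.
        for item in parts[1].lstrip("-").split(","):
            item = item.strip()
            if not item:
                continue
            name, _, value = item.partition("=")
            options[name] = value or None
    return directory, options


if __name__ == "__main__":
    for entry in ("/etc/users/acctpay -access=acct",
                  "/etc/users/docs -ro",
                  "/etc/users/reports/monthend -rw=ono"):
        print(parse_exports_line(entry))
```

An option without a value, such as ro, maps to None, which makes it easy to tell a bare flag from an option that names a host or group.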
Automounting File Systems
The Mount protocol takes care of the details of making a connection for the NFS client to the NFS server. This means that it is necessary to use the mount command to make the remote file system available at a mountpoint in the local file system. To make this process even easier, the automountd daemon has been created. This daemon listens for NFS requests and mounts a remote file system locally on an as-needed basis. The mounted condition usually persists for a specified number of minutes (the default is usually five minutes) in order to satisfy any further requests.
As with other daemons, the automountd daemon is started at boot time in the /etc/rc.local file. You can enter it as a command after the system is up and running, if needed. When a client computer tries to access a file that is referenced in an automount map, the automountd daemon checks to see whether the file system for that directory is currently mounted. If it is not, the daemon temporarily mounts the file system so that the user's request can be fulfilled.
The automount map is a file that tells the daemon where the file system to be mounted is located and where it should be mounted in the local file system. Options can also be included for the mount process, for example, to make it read-write or read-only. The automountd daemon mounts a file system under the mountpoint /tmp_mnt. It then creates a symbolic link that appears to the user as part of his file system.
Mounting File Systems Using the automount Command
The /etc/rc.local file usually contains the command used to start the automountd daemon. This daemon is responsible for processing NFS mount requests as they are defined in special files called map files. The syntax for the automount command is as follows:
    automount [-mnTv] [-D name=value] [-f master-file] [-M mount-directory] [-tl duration] [-tm interval] [-tw interval] [directory mapname [-mount-options]]
The options you can use are the following:
Master Maps
The automount daemon uses the master map to obtain a list of maps. The master map also contains mount options for those maps. The master map file is usually named /etc/auto.master. The syntax for the entries in this file is as follows:
    mount-point map [ mount-options ]
mount-point is the pathname of the local directory for an indirect map specified in the map field. If the map specified in the map column is a direct map, the mountpoint is usually /-. The data listed under the map field is used to find the map that contains the actual mountpoints and the locations of the remote file systems. Any data you supply for mount-options will be used when mounting directories in the map file associated with it.
Following is an example of a master map file (lines that begin with # are comments):
    #mount-point  map               options
    /etc/users    /etc/auto.usr     -ro
    /-            /etc/auto.direct  -rw
When the automount daemon determines that access is needed for files found in the /etc/users directory, it will look for another map file, named auto.usr, to get the rest of the information. The -ro options are specified for this entry and will be applied to the file system designated in the auto.usr map file. The argument /- is used to specify that the map file it points to, in this case auto.direct, is a direct map file, that is, one that contains the mountpoints and the remote file-system information needed to complete the mounts.
Direct Maps
A direct map lists both the remote file systems to be mounted and the mountpoints in the local file system where they are to be mounted. The construction of this file is very direct. The syntax for an entry is as follows:
    key [ mount-options ] location
The key field is the mountpoint to be used for this entry. mount-options are the options used with the mount command discussed earlier in this chapter. The location field should be in the format of machine:pathname, where machine is the hostname of the remote system that the file system actually resides on and pathname is the path to the directory on that file system. You can specify multiple locations to provide for redundancy. The automount daemon queries all locations in this case and takes the first one to respond to its requests.
Indirect Maps
In an indirect map file, most fields are the same as in a direct map file, except that the first field (key) is not a full pathname. It is a pointer to an entry in the master map file. You can list multiple directories in an indirect map file, and each of these remote file-system directories will be mounted under the mountpoint designated in the master map file that contains a reference to the indirect map. Check the man pages on your system to be sure of the syntax for options used in map files because they might vary just like options do for the mount command among different Unix systems.
Troubleshooting NFS Problems
Many of the TCP/IP utilities that are used for troubleshooting can be employed when trying to diagnose and fix problems having to do with NFS. For example, if a remote file system suddenly becomes unavailable, it only makes sense to first determine whether the remote server is still functioning. You can do this quickly by using the ping command to establish basic network connectivity. A failure to communicate using this small utility indicates that there is a server problem at the other end or perhaps a network malfunction that is preventing communications with the remote system. When you're troubleshooting, this tells you that the problem is most likely not one to be found in the NFS subsystem.
You can find detailed information about using various TCP/IP utilities for troubleshooting purposes in Chapter 28, "Troubleshooting Tools for TCP/IP Networks."
When ping fails, the tracert utility (named traceroute on most Unix and Linux systems) can be used to determine how far along the network route the packet is getting on its trip to the remote system. Use this when trying to isolate the particular point of failure in the network. There is also a useful command specific to NFS, nfsstat, which displays statistics about NFS and RPC. The syntax for nfsstat is as follows:
    nfsstat [-cnrsz] [vmunix.n] [core.n]
These are the options you can use:
All statistics are shown if you do not supply any parameters when executing the command. The statistical data that will be displayed depends on the options you choose. For an example of the detailed data you can obtain using this command, see the man page for nfsstat for your particular Unix or Linux system. Examining the output from the nfsstat command can be useful on an ongoing basis to help you establish a baseline for performance evaluations you will need to make later when thinking about upgrading. You can selectively store the data output by this command in a text file or spreadsheet. You can also create a simple script file that can be used to gather statistics using this command on a periodic basis, storing the results in a temporary directory for your later review.
For example, the command nfsstat -s displays statistics for the NFS server as shown here:
    # nfsstat -s
    Server RPC:
    calls      badcalls   nullrecv   badlen
    23951      0          0          0
    Server NFS:
    calls      badcalls
    23164      0
    null       getattr    setattr    root       lookup     readlink
    1 0%       64 0%      0 0%       0 0%       121 0%     0 0%
    read       wrcache    write      create     remove     rename
    22951 99%  0 0%       0 0%       0 0%       0 0%       0 0%
    link       symlink    mkdir      rmdir      readdir    fsstat
    0 0%       0 0%       0 0%       0 0%       25 0%      2 0%
In this display you can see statistics for the total number of remote procedure calls, along with information about those RPC calls that relate to NFS. In addition to the total number of calls, you can see statistics concerning the following items for RPC:
In addition, some implementations might show additional RPC fields. For the NFS server, there are many columns of information displayed, showing you the number of reads and writes, along with other useful information. For example, you can examine cache usage (wrcache), or determine when other file commands are used to create or remove directories. If the number of badcalls begins to become significant when compared to the overall number of calls, a problem obviously exists. If the value displayed for badlen is consistently a high percentage of the overall number of calls, a client might be incorrectly configured or a network problem might be causing packets to become corrupted. Again, you may see different or additional fields of information in the display, depending on the Unix/Linux and NFS implementation you are using. A careful review of the documentation for your system will give you a good idea of the performance to be expected from your server and the kinds of events to look for.
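When gathering nfsstat output on a periodic basis, a small script can turn the raw counters into the ratio discussed above. The following Python sketch pairs a header line with its value line and computes badcalls as a percentage of calls. The function names parse_counters and bad_call_percent are hypothetical, and the sketch assumes the two-line "names, then values" layout of the Server RPC block in the sample output; your system's layout may differ.

```python
def parse_counters(header_line: str, value_line: str) -> dict:
    """Pair each column name in the header with the integer beneath it."""
    names = header_line.split()
    values = [int(v) for v in value_line.split()]
    if len(names) != len(values):
        raise ValueError("header and value columns do not line up")
    return dict(zip(names, values))


def bad_call_percent(counters: dict) -> float:
    """Return badcalls as a percentage of all calls (0.0 when there are no calls)."""
    calls = counters["calls"]
    if calls == 0:
        return 0.0
    return 100.0 * counters["badcalls"] / calls


if __name__ == "__main__":
    # Values taken from the Server RPC block of the sample nfsstat -s output.
    rpc = parse_counters("calls badcalls nullrecv badlen", "23951 0 0 0")
    print(rpc, bad_call_percent(rpc))
```

Logging this percentage alongside the raw counters makes it easier to see whether badcalls is growing relative to the baseline rather than just in absolute terms.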